Commit 44fe79f

kvserver: track priority inversion in replicate queue metrics
Previously, replicas could be enqueued at a high priority but end up processing a lower-priority action, causing priority inversion and unfairness to other replicas queued behind them that need a repair action. This commit adds metrics to track such cases. It also adds metrics to track when replicas are requeued in the replicate queue because of a priority inversion from a repair action to a rebalance action.
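
To make the bookkeeping described above concrete, here is a minimal sketch in Go. It is not the code from this commit: the `Action` type, the priority values, and the `Check` helper are hypothetical names chosen for illustration. It only mirrors the behavior stated in the message: an inversion is counted when the action chosen at processing time has a lower priority than the one the replica was enqueued for, and a requeue is counted when a repair action turns into a rebalance.

```go
package main

import "fmt"

// Action is a hypothetical stand-in for an allocator action.
type Action string

const (
	ConsiderRebalance Action = "considerrebalance"
	RemoveVoter       Action = "removevoter"
	AddVoter          Action = "addvoter"
)

// priority returns an illustrative priority; the real values are not
// given in the commit and are assumed here.
func priority(a Action) float64 {
	switch a {
	case AddVoter:
		return 10000
	case RemoveVoter:
		return 1000
	default: // ConsiderRebalance
		return 0
	}
}

// isRepair treats every action other than a rebalance as a repair (assumption).
func isRepair(a Action) bool { return a != ConsiderRebalance }

// Metrics holds plain counters standing in for the new metrics.
type Metrics struct {
	Total    int            // priority_inversion.total
	Requeue  int            // priority_inversion.requeue
	ByAction map[Action]int // priority_inversion.<action>
}

func NewMetrics() *Metrics { return &Metrics{ByAction: map[Action]int{}} }

// Check records an inversion if the processing-time action has a lower
// priority than the enqueue-time action, and reports whether the replica
// should be requeued (repair action degraded to a rebalance).
func (m *Metrics) Check(enqueued, processed Action) (inverted, requeue bool) {
	if priority(processed) >= priority(enqueued) {
		return false, false
	}
	m.Total++
	m.ByAction[processed]++
	if isRepair(enqueued) && processed == ConsiderRebalance {
		m.Requeue++
		return true, true
	}
	return true, false
}

func main() {
	m := NewMetrics()
	fmt.Println(m.Check(AddVoter, ConsiderRebalance)) // inversion, requeued
	fmt.Println(m.Check(AddVoter, RemoveVoter))       // inversion, processed in place
	fmt.Println(m.Check(RemoveVoter, AddVoter))       // priority went up: no inversion
	fmt.Println(m.Total, m.Requeue)
}
```

Keying the per-action counter on the action observed at processing time matches the metric descriptions, which count inversions that "resulted in" a given action.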
1 parent f328e00 commit 44fe79f

2 files changed, +310 -0 lines changed
docs/generated/metrics/metrics.yaml

Lines changed: 128 additions & 0 deletions
@@ -13951,6 +13951,134 @@ layers:
   unit: COUNT
   aggregation: AVG
   derivative: NONE
+- name: queue.replicate.priority_inversion.addnonvoter
+  exported_name: queue_replicate_priority_inversion_addnonvoter
+  description: Number of priority inversions in the replicate queue that resulted in add non-voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.addvoter
+  exported_name: queue_replicate_priority_inversion_addvoter
+  description: Number of priority inversions in the replicate queue that resulted in add voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.considerrebalance
+  exported_name: queue_replicate_priority_inversion_considerrebalance
+  description: Number of priority inversions in the replicate queue that resulted in consider rebalance action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.noop
+  exported_name: queue_replicate_priority_inversion_noop
+  description: Number of priority inversions in the replicate queue that resulted in noop action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.rangeunavailable
+  exported_name: queue_replicate_priority_inversion_rangeunavailable
+  description: Number of priority inversions in the replicate queue that resulted in range unavailable action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.removedeadnonvoter
+  exported_name: queue_replicate_priority_inversion_removedeadnonvoter
+  description: Number of priority inversions in the replicate queue that resulted in remove dead non-voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.removedeadvoter
+  exported_name: queue_replicate_priority_inversion_removedeadvoter
+  description: Number of priority inversions in the replicate queue that resulted in remove dead voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.removedecommissioningnonvoter
+  exported_name: queue_replicate_priority_inversion_removedecommissioningnonvoter
+  description: Number of priority inversions in the replicate queue that resulted in remove decommissioning non-voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.removedecommissioningvoter
+  exported_name: queue_replicate_priority_inversion_removedecommissioningvoter
+  description: Number of priority inversions in the replicate queue that resulted in remove decommissioning voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.removenonvoter
+  exported_name: queue_replicate_priority_inversion_removenonvoter
+  description: Number of priority inversions in the replicate queue that resulted in remove non-voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.removevoter
+  exported_name: queue_replicate_priority_inversion_removevoter
+  description: Number of priority inversions in the replicate queue that resulted in remove voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.replacedeadnonvoter
+  exported_name: queue_replicate_priority_inversion_replacedeadnonvoter
+  description: Number of priority inversions in the replicate queue that resulted in replace dead non-voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.replacedecommissioningnonvoter
+  exported_name: queue_replicate_priority_inversion_replacedecommissioningnonvoter
+  description: Number of priority inversions in the replicate queue that resulted in replace decommissioning non-voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.replacedecommissioningvoter
+  exported_name: queue_replicate_priority_inversion_replacedecommissioningvoter
+  description: Number of priority inversions in the replicate queue that resulted in replace decommissioning voter action during processing
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.requeue
+  exported_name: queue_replicate_priority_inversion_requeue
+  description: Number of priority inversions in the replicate queue that resulted in requeuing of the replicas. A priority inversion occurs when the priority at processing time ends up being lower than at enqueue time. When the priority has changed from a high priority repair action to rebalance, the change is requeued to avoid unfairness.
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
+- name: queue.replicate.priority_inversion.total
+  exported_name: queue_replicate_priority_inversion_total
+  description: Total number of priority inversions in the replicate queue. A priority inversion occurs when the priority at processing time ends up being lower than at enqueue time
+  y_axis_label: Replicas
+  type: COUNTER
+  unit: COUNT
+  aggregation: AVG
+  derivative: NON_NEGATIVE_DERIVATIVE
 - name: queue.replicate.process.failure
   exported_name: queue_replicate_process_failure
   description: Number of replicas which failed processing in the replicate queue
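
The entries in this hunk follow a regular naming pattern: each per-action metric ends in the action name with spaces and hyphens dropped, and each `exported_name` is the `name` with dots replaced by underscores. The Go sketch below reproduces that pattern as it appears in the listing. The helpers `metricName` and `exportedName` are hypothetical and are not the generator CockroachDB uses to produce this file.

```go
package main

import (
	"fmt"
	"strings"
)

// prefix is the common prefix of the per-action metrics in the hunk above.
const prefix = "queue.replicate.priority_inversion."

// metricName derives a per-action metric name from the human-readable
// action text used in the descriptions, e.g. "remove dead non-voter".
// Hypothetical helper: it lowercases the text and drops spaces and hyphens.
func metricName(action string) string {
	compact := strings.NewReplacer(" ", "", "-", "").Replace(strings.ToLower(action))
	return prefix + compact
}

// exportedName mirrors the name/exported_name pairs in the listing by
// replacing every dot with an underscore.
func exportedName(name string) string {
	return strings.ReplaceAll(name, ".", "_")
}

func main() {
	for _, action := range []string{
		"add non-voter",
		"remove decommissioning voter",
		"consider rebalance",
	} {
		name := metricName(action)
		fmt.Printf("%s -> %s\n", name, exportedName(name))
	}
}
```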
