Commit b7eeaee

kvserver: track priority inversion in replicate queue metrics
Previously, replicas could be enqueued at a high priority but end up processing a lower-priority action, causing priority inversion and unfairness to other replicas behind them that need a repair action. This commit adds metrics to track such cases. It also adds metrics to track when replicas are requeued in the replicate queue due to a priority inversion from a repair action to a rebalance action.
1 parent c301eba commit b7eeaee
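
The behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the kvserver implementation: the `action` type, the `inversionCounters` struct, the `record` method, and the priority values are all hypothetical names and numbers chosen for the example. It assumes an inversion is counted whenever the processing-time priority is lower than the enqueue-time priority, and that a requeue happens only when a repair action turns into a rebalance.

```go
package main

import "fmt"

// action is a hypothetical stand-in for an allocator action; the names and
// priority values used below are illustrative assumptions, not kvserver's.
type action struct {
	name     string
	priority float64
	repair   bool // true for repair actions, false for rebalance or noop
}

// inversionCounters mirrors the shape of the new metrics: one counter per
// processing-time action, a total, and a requeue counter.
type inversionCounters struct {
	byAction map[string]int
	total    int
	requeue  int
}

// record compares the enqueue-time action with the processing-time action.
// It reports whether a priority inversion occurred and whether the replica
// would be requeued under the assumptions stated above.
func (c *inversionCounters) record(enqueued, processing action) (inverted, requeue bool) {
	inverted = processing.priority < enqueued.priority
	if !inverted {
		return false, false
	}
	c.total++
	c.byAction[processing.name]++
	// Assumed rule: only a drop from a repair action to a rebalance requeues.
	requeue = enqueued.repair && processing.name == "considerrebalance"
	if requeue {
		c.requeue++
	}
	return inverted, requeue
}

func main() {
	c := &inversionCounters{byAction: map[string]int{}}
	addVoter := action{name: "addvoter", priority: 10000, repair: true}
	rebalance := action{name: "considerrebalance", priority: 0}

	inverted, requeue := c.record(addVoter, rebalance)
	fmt.Println(inverted, requeue, c.total, c.requeue)
}
```

Keeping the per-action counters separate from the total makes it possible to see which processing-time action the inverted replicas ended up with, which is the breakdown the new metrics expose.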

File tree

2 files changed

+198
-0
lines changed

docs/generated/metrics/metrics.html

Lines changed: 16 additions & 0 deletions
@@ -442,6 +442,22 @@
 <tr><td>STORAGE</td><td>queue.replicate.enqueue.unexpectederror</td><td>Number of replicas that were expected to be enqueued (ShouldQueue returned true or the caller decided to add to the replicate queue directly), but failed to be enqueued due to unexpected errors</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
 <tr><td>STORAGE</td><td>queue.replicate.nonvoterpromotions</td><td>Number of non-voters promoted to voters by the replicate queue</td><td>Promotions of Non Voters to Voters</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
 <tr><td>STORAGE</td><td>queue.replicate.pending</td><td>Number of pending replicas in the replicate queue</td><td>Replicas</td><td>GAUGE</td><td>COUNT</td><td>AVG</td><td>NONE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.addnonvoter</td><td>Number of priority inversions in the replicate queue that resulted in add non-voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.addvoter</td><td>Number of priority inversions in the replicate queue that resulted in add voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.considerrebalance</td><td>Number of priority inversions in the replicate queue that resulted in consider rebalance action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.noop</td><td>Number of priority inversions in the replicate queue that resulted in noop action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.rangeunavailable</td><td>Number of priority inversions in the replicate queue that resulted in range unavailable action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.removedeadnonvoter</td><td>Number of priority inversions in the replicate queue that resulted in remove dead non-voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.removedeadvoter</td><td>Number of priority inversions in the replicate queue that resulted in remove dead voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.removedecommissioningnonvoter</td><td>Number of priority inversions in the replicate queue that resulted in remove decommissioning non-voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.removedecommissioningvoter</td><td>Number of priority inversions in the replicate queue that resulted in remove decommissioning voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.removenonvoter</td><td>Number of priority inversions in the replicate queue that resulted in remove non-voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.removevoter</td><td>Number of priority inversions in the replicate queue that resulted in remove voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.replacedeadnonvoter</td><td>Number of priority inversions in the replicate queue that resulted in replace dead non-voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.replacedecommissioningnonvoter</td><td>Number of priority inversions in the replicate queue that resulted in replace decommissioning non-voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.replacedecommissioningvoter</td><td>Number of priority inversions in the replicate queue that resulted in replace decommissioning voter action during processing</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.requeue</td><td>Number of priority inversions in the replicate queue that resulted in requeuing of the replicas. A priority inversion occurs when the priority at processing time ends up being lower than at enqueue time. When the priority has changed from a high priority repair action to rebalance, the change is requeued to avoid unfairness.</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
+<tr><td>STORAGE</td><td>queue.replicate.priority_inversion.total</td><td>Total number of priority inversions in the replicate queue. A priority inversion occurs when the priority at processing time ends up being lower than at enqueue time</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
 <tr><td>STORAGE</td><td>queue.replicate.process.failure</td><td>Number of replicas which failed processing in the replicate queue</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
 <tr><td>STORAGE</td><td>queue.replicate.process.success</td><td>Number of replicas successfully processed by the replicate queue</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
 <tr><td>STORAGE</td><td>queue.replicate.processingnanos</td><td>Nanoseconds spent processing replicas in the replicate queue</td><td>Processing Time</td><td>COUNTER</td><td>NANOSECONDS</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
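
The per-action metric names in the rows above appear to follow a simple convention: the action description in lower case with spaces and hyphens removed, appended to the `queue.replicate.priority_inversion.` prefix. The sketch below only illustrates that observed pattern. The `metricNameFor` helper and the `prefix` constant are hypothetical and are not taken from the commit, which may well define each name explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// prefix is the common prefix of the per-action metrics listed in the diff.
const prefix = "queue.replicate.priority_inversion."

// metricNameFor is a hypothetical helper: it lowercases an action description
// and drops spaces and hyphens, matching the names seen in the table.
func metricNameFor(actionDesc string) string {
	r := strings.NewReplacer(" ", "", "-", "")
	return prefix + r.Replace(strings.ToLower(actionDesc))
}

func main() {
	for _, desc := range []string{
		"add voter",
		"remove dead non-voter",
		"replace decommissioning voter",
		"consider rebalance",
	} {
		fmt.Println(metricNameFor(desc))
	}
}
```

A helper like this can be handy when building dashboards or alerts that iterate over all actions, as long as the generated names are checked against the documented metric list.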
