153220: kvserverbase: add DisableCountBasedRebalancingIfMMAEnabled r=tbg a=wenyihu6
Resolves: #152579
Release note: none
---
**kvserverbase: add DisableCountBasedRebalancingIfMMAEnabled**
This commit introduces a new cluster setting,
`DisableCountBasedRebalancingIfMMAEnabled`, which allows us to disable range-count
and lease-count based convergence rebalancing when MMA is enabled. These forms
of rebalancing can conflict with MMA's goals and cause thrashing.
---
**allocator: add option to disable lease-count based rebalancing**
This commit disables lease-count based rebalancing in the lease queue.
Currently, the lease queue uses `ShouldTransferLease` and `PlanOneChange` to select
a target store for transferring a range lease. Candidate stores are chosen based
on store health, lease preferences, lagging replicas, and I/O load, and the
final target is decided using one of: `FollowTheWorkload`, `LeaseCountConvergence`,
or `LoadConvergence`.
`LeaseCountConvergence` is used in two scenarios:
(1) When `FollowTheWorkload` (`shouldTransferLeaseForAccessLocality`) lacks sufficient
stats or locality info.
(2) When `maybeTransferLeaseAwayTarget` calls into `allocator.TransferLeaseTarget`
with `LeaseCountConvergence` directly.
This commit disables case (1) by returning early in
`shouldTransferLeaseForLeaseCountConvergence` (used in
`Allocator.TransferLeaseTarget` and `allocator.ShouldTransferLease`). Case (2) is
left intact, since it is less common and less likely to interfere with MMA
decisions.
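The early return described above can be pictured with a minimal sketch. This is not the allocator's code: `storeLeases`, `shouldTransferForLeaseCount`, and the `countDisabled` flag are hypothetical names, and the overfull check is a simplification built on the documented meaning of `kv.allocator.lease_rebalance_threshold` (fraction away from the mean lease count).

```go
package main

import "fmt"

// storeLeases is a hypothetical, simplified view of one store.
type storeLeases struct {
	storeID int
	leases  int
}

// shouldTransferForLeaseCount is a hypothetical stand-in for
// shouldTransferLeaseForLeaseCountConvergence. When countDisabled is true it
// returns early, so lease counts never trigger a transfer. Otherwise it
// reports whether the current store holds more leases than the mean by more
// than the given fraction (a simplified overfull check).
func shouldTransferForLeaseCount(
	countDisabled bool, current storeLeases, all []storeLeases, threshold float64,
) bool {
	if countDisabled {
		return false
	}
	if len(all) == 0 {
		return false
	}
	total := 0
	for _, s := range all {
		total += s.leases
	}
	mean := float64(total) / float64(len(all))
	return float64(current.leases) > mean*(1+threshold)
}

func main() {
	stores := []storeLeases{{1, 30}, {2, 10}, {3, 10}}
	fmt.Println(shouldTransferForLeaseCount(false, stores[0], stores, 0.05))
	fmt.Println(shouldTransferForLeaseCount(true, stores[0], stores, 0.05))
}
```

With the flag off, store 1 is well above the mean and qualifies for a transfer; with the flag on, the same inputs never do, so only the other goals can move the lease.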
---
**kvserver: add `BaseScorerOptions`**
This commit introduces `BaseScorerOptions`, a base struct embedded by other
`ScorerOptions` implementations. It provides the default implementation methods,
while specific implementations can override individual methods to apply their
own scoring heuristics.
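The embedding pattern can be illustrated with a small, self-contained sketch. The interface, its methods, and both struct names here are hypothetical and far smaller than the real `ScorerOptions`; the point is only how a base struct supplies defaults that an embedding type can selectively override.

```go
package main

import "fmt"

// scorer is a hypothetical, cut-down scoring interface.
type scorer interface {
	deterministic() bool
	balanceScore(rangeCount int, mean float64) int
}

// baseScorer provides the default implementation of every method.
type baseScorer struct {
	deterministicForTesting bool
}

func (b baseScorer) deterministic() bool { return b.deterministicForTesting }

// balanceScore returns 1 for a store below the mean, -1 above it, 0 at it.
func (b baseScorer) balanceScore(rangeCount int, mean float64) int {
	switch {
	case float64(rangeCount) < mean:
		return 1
	case float64(rangeCount) > mean:
		return -1
	default:
		return 0
	}
}

// thresholdScorer embeds baseScorer and overrides only balanceScore.
// deterministic() is promoted from the embedded struct unchanged.
type thresholdScorer struct {
	baseScorer
	threshold float64
}

// balanceScore treats counts within threshold of the mean as balanced and
// otherwise defers to the base implementation.
func (t thresholdScorer) balanceScore(rangeCount int, mean float64) int {
	diff := float64(rangeCount) - mean
	if diff < 0 {
		diff = -diff
	}
	if diff <= mean*t.threshold {
		return 0
	}
	return t.baseScorer.balanceScore(rangeCount, mean)
}

func main() {
	var s scorer = thresholdScorer{baseScorer{true}, 0.1}
	fmt.Println(s.deterministic(), s.balanceScore(105, 100), s.balanceScore(120, 100))
}
```

Because the override is resolved through the interface value, callers need no knowledge of which concrete scorer they hold.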
---
**allocator: add `adjustRangeCountForScoring`**
This commit adds `adjustRangeCountForScoring` to the `ScorerOptions` interface. The
method takes a provided range count, adjusts it according to the scorer’s
heuristic, and returns the result. It is similar to other scoring metrics such
as the convergence score, and will later be used in `candidate.compare` as an
attribute for breaking ties. The compare method evaluates the relative
difference between two stores and favors the one with fewer ranges as a better
rebalance target. Currently and by default, `adjustRangeCountForScoring` simply
returns the provided range count. Future scorer options will override it to
return 0, since their heuristic is to ignore range count entirely.
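A rough sketch of how an adjusted range count might act as a tie-breaker follows. The `candidate` struct, the `compare` function, and both option types are hypothetical simplifications: the real comparison looks at a relative difference and more attributes, while this version only compares a convergence score and then the adjusted count.

```go
package main

import "fmt"

// rangeCountAdjuster is a hypothetical one-method slice of ScorerOptions.
type rangeCountAdjuster interface {
	adjustRangeCountForScoring(rangeCount int) int
}

// defaultOptions returns the range count unchanged (the default behavior).
type defaultOptions struct{}

func (defaultOptions) adjustRangeCountForScoring(rangeCount int) int { return rangeCount }

// noCountOptions ignores range count by always returning 0.
type noCountOptions struct{}

func (noCountOptions) adjustRangeCountForScoring(int) int { return 0 }

// candidate is a hypothetical, simplified rebalance candidate.
type candidate struct {
	storeID        int
	convergesScore int
	rangeCount     int // already adjusted by the scorer options
}

func newCandidate(opts rangeCountAdjuster, storeID, convergesScore, rawCount int) candidate {
	return candidate{storeID, convergesScore, opts.adjustRangeCountForScoring(rawCount)}
}

// compare returns a positive value if a is the better target, a negative
// value if b is, and 0 if they tie. Range count only breaks ties.
func compare(a, b candidate) int {
	if a.convergesScore != b.convergesScore {
		return a.convergesScore - b.convergesScore
	}
	return b.rangeCount - a.rangeCount // fewer ranges is better
}

func main() {
	for _, opts := range []rangeCountAdjuster{defaultOptions{}, noCountOptions{}} {
		a := newCandidate(opts, 1, 0, 10)
		b := newCandidate(opts, 2, 0, 20)
		fmt.Printf("%T: %d\n", opts, compare(a, b))
	}
}
```

Under the default options the store with fewer ranges wins the tie; once the count is adjusted to 0 for every store, the two candidates compare as equal.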
---
**allocator: add `BaseScorerOptionsNoConvergence`**
This commit disables range-count based rebalancing.
Currently, when selecting a rebalance target, the allocator constructs an
equivalence class for each existing replica. This class includes all stores that
are not worse than the existing replica in terms of constraint conformance,
diversity score, and disk fullness. At this stage, range count convergence is
not yet considered. The allocator then evaluates each candidate set using
`ScorerOptions`, which populates candidates' attributes such as balance score,
convergence score, and range count with respect to the equivalence class, and
determines the best candidate within each equivalence class.
To disable range count–based rebalancing, this commit introduces a new scorer
option, `BaseScorerOptionsNoConvergence`. This scorer assigns the same score to
all stores, effectively treating all candidates within an equivalence class as
equally good. It is only used when both MMA and
`DisableCountBasedRebalancingIfMMAEnabled` are enabled.
Note that this new scorer option is applied only in
`ReplicaPlanner.considerRebalance` and `ReplicaPlanner.ShouldPlanChange`. As a
result, range-count based rebalancing is disabled during rebalancing, but range
count is still considered when the allocator adds new replicas (such as
`AllocateVoter`), when the old store rebalancer chooses a rebalance target
(`StoreRebalancer.applyRangeRebalance`), and when replicas are removed
(`Allocator.RemoveVoter`).
---
**kvserver: add LBRebalancingMultiMetricAndCount**
This commit removes the cluster setting
`DisableCountBasedRebalancingIfMMAEnabled`, renames `LBRebalancingMultiMetric` to
`LBRebalancingMultiMetricOnly`, and introduces a new mode:
`LBRebalancingMultiMetricAndCount`. To keep the datadriven test output unchanged,
all previous references to MMA in asim now use `LBRebalancingMultiMetricAndCount`,
which matches the prior behavior. I’ll follow up with a new asim mode in future
commits.
Co-authored-by: wenyihu6 <[email protected]>
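For orientation, here is a hedged sketch of how the resulting modes could be modeled. The numeric values and display strings come from the `kv.allocator.load_based_rebalancing` setting description in the diff; the Go type, the constant names, the mapping of `multi-metric` to the "only" mode, and the two helper predicates are assumptions made for illustration, not the actual `kvserverbase` definitions.

```go
package main

import "fmt"

// lbMode is a hypothetical mirror of the load-based rebalancing modes.
type lbMode int

const (
	lbOff                 lbMode = iota // off = 0
	lbLeases                            // leases = 1
	lbLeasesAndReplicas                 // leases and replicas = 2
	lbMultiMetricOnly                   // multi-metric = 3 (assumed mapping)
	lbMultiMetricAndCount               // multi-metric and count = 4
)

func (m lbMode) String() string {
	switch m {
	case lbOff:
		return "off"
	case lbLeases:
		return "leases"
	case lbLeasesAndReplicas:
		return "leases and replicas"
	case lbMultiMetricOnly:
		return "multi-metric"
	case lbMultiMetricAndCount:
		return "multi-metric and count"
	default:
		return "unknown"
	}
}

// mmaEnabled reports whether the mode turns MMA on (assumed semantics).
func mmaEnabled(m lbMode) bool {
	return m == lbMultiMetricOnly || m == lbMultiMetricAndCount
}

// countBasedEnabled reports whether range-count and lease-count convergence
// rebalancing stays active (assumed semantics: only the "only" mode drops it).
func countBasedEnabled(m lbMode) bool {
	return m != lbMultiMetricOnly
}

func main() {
	for m := lbOff; m <= lbMultiMetricAndCount; m++ {
		fmt.Printf("%d %q mma=%t count=%t\n", int(m), m.String(), mmaEnabled(m), countBasedEnabled(m))
	}
}
```

Folding the former boolean setting into the enum means a single value decides both whether MMA runs and whether count-based convergence runs alongside it.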
docs/generated/settings/settings.html (1 addition, 1 deletion)
@@ -86,7 +86,7 @@
 <tr><td><div id="setting-jobs-retention-time" class="anchored"><code>jobs.retention_time</code></div></td><td>duration</td><td><code>336h0m0s</code></td><td>the amount of time for which records for completed jobs are retained</td><td>Basic/Standard/Advanced/Self-Hosted</td></tr>
 <tr><td><div id="setting-kv-allocator-lease-rebalance-threshold" class="anchored"><code>kv.allocator.lease_rebalance_threshold</code></div></td><td>float</td><td><code>0.05</code></td><td>minimum fraction away from the mean a store's lease count can be before it is considered for lease-transfers</td><td>Advanced/Self-Hosted</td></tr>
 <tr><td><div id="setting-kv-allocator-load-based-lease-rebalancing-enabled" class="anchored"><code>kv.allocator.load_based_lease_rebalancing.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>set to enable rebalancing of range leases based on load and latency</td><td>Advanced/Self-Hosted</td></tr>
-<tr><td><div id="setting-kv-allocator-load-based-rebalancing" class="anchored"><code>kv.allocator.load_based_rebalancing</code></div></td><td>enumeration</td><td><code>leases and replicas</code></td><td>whether to rebalance based on the distribution of load across stores [off = 0, leases = 1, leases and replicas = 2, multi-metric = 3]</td><td>Advanced/Self-Hosted</td></tr>
+<tr><td><div id="setting-kv-allocator-load-based-rebalancing" class="anchored"><code>kv.allocator.load_based_rebalancing</code></div></td><td>enumeration</td><td><code>leases and replicas</code></td><td>whether to rebalance based on the distribution of load across stores [off = 0, leases = 1, leases and replicas = 2, multi-metric = 3, multi-metric and count = 4]</td><td>Advanced/Self-Hosted</td></tr>
 <tr><td><div id="setting-kv-allocator-load-based-rebalancing-objective" class="anchored"><code>kv.allocator.load_based_rebalancing.objective</code></div></td><td>enumeration</td><td><code>cpu</code></td><td>what objective does the cluster use to rebalance; if set to `qps` the cluster will attempt to balance qps among stores, if set to `cpu` the cluster will attempt to balance cpu usage among stores [qps = 0, cpu = 1]</td><td>Advanced/Self-Hosted</td></tr>
 <tr><td><div id="setting-kv-allocator-load-based-rebalancing-interval" class="anchored"><code>kv.allocator.load_based_rebalancing_interval</code></div></td><td>duration</td><td><code>1m0s</code></td><td>the rough interval at which each store will check for load-based lease / replica rebalancing opportunities</td><td>Advanced/Self-Hosted</td></tr>
 <tr><td><div id="setting-kv-allocator-qps-rebalance-threshold" class="anchored"><code>kv.allocator.qps_rebalance_threshold</code></div></td><td>float</td><td><code>0.1</code></td><td>minimum fraction away from the mean a store's QPS (such as queries per second) can be before it is considered overfull or underfull</td><td>Advanced/Self-Hosted</td></tr>