Skip to content

Commit 583ca66

Browse files
committed
allocatorimpl: improve comments for recent changes
This commit updates comments to reflect changes made in previous commits.
1 parent e9348e8 commit 583ca66

File tree

4 files changed

+87
-44
lines changed

4 files changed

+87
-44
lines changed

pkg/kv/kvserver/allocator/allocatorimpl/allocator.go

Lines changed: 42 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1873,24 +1873,44 @@ func (a Allocator) RebalanceTarget(
18731873
var target, existingCandidate *candidate
18741874
var removeReplica roachpb.ReplicationTarget
18751875

1876-
// When bestRebalanceTarget selects an option (bestIdx) for the first time, it
1877-
// builds and caches the corresponding MMA advisor in advisors[bestIdx]. The
1878-
// advisor stores the means and other information MMA needs to determine if a
1879-
// candidate conflicts with its goals for this option. After this,
1880-
// bestRebalanceTarget is free to mutate the cands set of the // option
1881-
// incrementally. If bestRebalanceTarget selects the same option again, it
1882-
// will reuse the cached advisor to call into IsInConflictWithMMA.
1876+
// The loop below can iterate multiple times. This is because the
1877+
// (source,target) pair chosen by bestRebalanceTarget may be rejected either
1878+
// by the multi-metric allocator or by the check that a moved replica wouldn't
1879+
// immediately be removable. bestRebalanceTarget mutates the candidate slice
1880+
// for the given option (=source) to exclude each considered candidate. For
1881+
// example, the loop may proceed as follows:
1882+
// - initially, we may consider rebalancing from s1 to either of s2, s3, or
1883+
// s5, or rebalancing from s6 to either of s2 or s4: options = [s1 ->
1884+
// [s2,s3,s5], s6 -> [s2,s4]]. Each option here is considered as an
1885+
// equivalence class.
1886+
// - bestRebalanceTarget might pick s6->s4, and removes this choice from
1887+
// `options`. options now becomes [s1->[s2,s3,s5], s6 -> [s2]].
1888+
// - mma might reject s6->s4, so we loop around.
1889+
// - next, s1->s3 might be chosen, but fail the removable replica check.
1890+
// - so we'll begin a third loop with: options is now [s1->[s2,s5], s6 ->
1891+
// [s2]].
1892+
// - s6->s2 might be chosen and might succeed, terminating the loop and
1893+
// proceeding to make the change.
18831894
//
1884-
// For example, the option may start with {s1, s2, s3} as its candidate set,
1885-
// if bestRebalanceTarget selects s1, it will remove s1 and continue with {s2,
1886-
// s3}, then say it continues to select s2 and remove s2 and continue with
1887-
// {s3}, and finally {s3} → {}. Each call to MMA should use the original set
1888-
// {s1, s2, s3} ∪ {existing} to compute the means. The design here allows MMA
1889-
// to compute the means for the original set {s1, s2, s3} ∪ {existing} once
1890-
// and then use it for all calls to MMA, and bestRebalanceTarget does not need
1891-
// to copy the candidate set.
1895+
// Note that in general (and in the example) a source store can be considered
1896+
// multiple times (s6 is considered twice), so we cache the corresponding MMA
1897+
// advisor to avoid potentially expensive O(store) recomputations. The
1898+
// corresponding advisor is constructed only once and cached in
1899+
// results[bestIdx].advisor when the the source store is selected as the best
1900+
// rebalance target for the first time. After that, bestRebalanceTarget is
1901+
// free to mutate the cands set of the option. However, MMARebalancerAdvisor
1902+
// should use the original candidate set union the existing store to compute
1903+
// the load summary when calling IsInConflictWithMMA. It does so by using the
1904+
// computed meansLoad summary cached when this option was selected as the best
1905+
// rebalance target for the first time.
18921906
var bestIdx int
18931907

1908+
// NB: bestRebalanceTarget may modify the candidate set (cands) within each
1909+
// option in results. However, for each option, the associated source store,
1910+
// MMARebalanceAdvisor, and their index in results must remain unchanged
1911+
// throughout the process. This ensures that any cached MMARebalanceAdvisor
1912+
// continues to correspond to the original candidate set and source store,
1913+
// even as candidates are removed.
18941914
for {
18951915
target, existingCandidate, bestIdx = bestRebalanceTarget(a.randGen, results, a.as)
18961916
if target == nil {
@@ -1905,11 +1925,11 @@ func (a Allocator) RebalanceTarget(
19051925
// improvements.
19061926
if !existingCandidate.isCriticalRebalance(target) {
19071927
// If the rebalance is not critical, we check if it conflicts with mma's
1908-
// goal. advisor for the target should always be registered in
1909-
// bestRebalanceTarget and present in the map. If mma rejects the
1910-
// rebalance, we will continue to the next target. Note that this target
1911-
// would have been deleted from the candidates set in bestRebalanceTarget,
1912-
// so we will not select it again.
1928+
// goal. advisor for bestIdx should always be cached by
1929+
// bestRebalanceTarget. If mma rejects the rebalance, we will continue to
1930+
// the next target. Note that bestRebalanceTarget would delete this target
1931+
// from the candidates set when being selected, so this target will not be
1932+
// selected again.
19131933
if advisor := results[bestIdx].advisor; advisor != nil {
19141934
if a.as.IsInConflictWithMMA(ctx, target.store.StoreID, advisor, false) {
19151935
continue
@@ -2803,7 +2823,8 @@ func (a *Allocator) CountBasedRebalancingDisabled() bool {
28032823
// change. This is used to prevent thrashing when both multi-metric and
28042824
// count-based rebalancing are enabled and have conflicting goals.
28052825
// TODO(wenyihu6): since we sometimes see even worse thrashing behaviour with
2806-
// this change, should we introduce another mode for this
2826+
// this change, should we introduce two modes
2827+
// (mma-count with+without thrashing prevention)?
28072828
func (a *Allocator) CountBasedRebalancingOnlyEnabledByMMA() bool {
28082829
return kvserverbase.LoadBasedRebalancingMode.Get(&a.st.SV) == kvserverbase.LBRebalancingMultiMetricAndCount
28092830
}

pkg/kv/kvserver/allocator/allocatorimpl/allocator_scorer.go

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1420,7 +1420,11 @@ func candidateListForRemoval(
14201420
type rebalanceOptions struct {
14211421
existing candidate
14221422
candidates candidateList
1423-
advisor *mmaprototype.MMARebalanceAdvisor
1423+
// advisor is lazily initialized by bestRebalanceTarget when this option is
1424+
// selected as best rebalance target. It is used to determine if a candidate
1425+
// is in conflict with mma's goals when LBRebalancingMultiMetricAndCount mode
1426+
// is enabled.
1427+
advisor *mmaprototype.MMARebalanceAdvisor
14241428
}
14251429

14261430
// equivalenceClass captures the set of "equivalent" replacement candidates

pkg/kv/kvserver/allocator/mmaprototype/allocator_state.go

Lines changed: 26 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1297,13 +1297,23 @@ func (a *allocatorState) ensureAnalyzedConstraints(rstate *rangeState) bool {
12971297

12981298
// MMARebalanceAdvisor contains information that mma needs to determine if a
12991299
// candidate is in conflict with its goals. All fields should be immutable after
1300-
// creation.
1300+
// its initialization.
1301+
//
1302+
// MMARebalanceAdvisor uses the meansLoad summary to compute the load summary
1303+
// for a provided candidate. Then it compares the candidate load summary against
1304+
// the existingStoreSLS to determine if the candidate is more overloaded than
1305+
// the existing store. If yes, mma will return true for IsInConflictWithMMA. It
1306+
// is up to the caller to decide what to do with this information.
13011307
type MMARebalanceAdvisor struct {
13021308
// disabled is true when MMA is disabled. It overrides all decisions with
13031309
// IsInConflictWithMMA returning false.
1304-
disabled bool
1310+
disabled bool
1311+
// existingStoreID is the ID of the existing store.
13051312
existingStoreID roachpb.StoreID
1306-
// existingStoreSLS is the load summary for the existing store.
1313+
// existingStoreSLS holds the load summary for the existing store. It is
1314+
// initially nil and is computed using existingStoreID and means the first
1315+
// time IsInConflictWithMMA is called. The caller must ensure this advisor is
1316+
// only used with the corresponding existingStoreID.
13071317
existingStoreSLS *storeLoadSummary
13081318
// means is the means for the candidate set.
13091319
means meansLoad
@@ -1318,11 +1328,13 @@ func NoopMMARebalanceAdvisor() *MMARebalanceAdvisor {
13181328
}
13191329
}
13201330

1321-
// BuildMMARebalanceAdvisor creates a MMARebalanceAdvisor for the given existing
1322-
// store and candidates. The advisor is returned here used to determine if a
1323-
// given candidate is in conflict with the existing store via
1324-
// IsInConflictWithMMA. The candidate set here may or may not include the
1325-
// existing store. mma should include the existing store in the candidate set.
1331+
// BuildMMARebalanceAdvisor constructs an MMARebalanceAdvisor for the given
1332+
// existing store and candidate stores. The advisor can be used to determine if
1333+
// a candidate is in conflict with the existing store via IsInConflictWithMMA.
1334+
// The provided cands list may or may not include the existing store. This
1335+
// method always adds the existing store to the cands list so that it is
1336+
// included in the mean calculation. It is up to computeMeansForStoreSet to
1337+
// handle the de-duplication of storeIDs from the cands list.
13261338
func (a *allocatorState) BuildMMARebalanceAdvisor(
13271339
existing roachpb.StoreID, cands []roachpb.StoreID,
13281340
) *MMARebalanceAdvisor {
@@ -1339,9 +1351,11 @@ func (a *allocatorState) BuildMMARebalanceAdvisor(
13391351
}
13401352

13411353
// IsInConflictWithMMA determines if the given candidate is in conflict with the
1342-
// existing store using the provided MMARebalanceAdvisor. Caller is responsible
1343-
// for making sure the MMARebalanceAdvisor is for the correct existing store and
1344-
// candidate set.
1354+
// existing store using the provided MMARebalanceAdvisor. For simplicity, we
1355+
// currently say that this is in conflict if the candidate is more overloaded
1356+
// than the existing store. This is subject to change in the future. Caller is
1357+
// responsible for making sure the MMARebalanceAdvisor is for the correct
1358+
// existing store and candidate set.
13451359
func (a *allocatorState) IsInConflictWithMMA(
13461360
ctx context.Context, cand roachpb.StoreID, advisor *MMARebalanceAdvisor, cpuOnly bool,
13471361
) bool {
@@ -1363,7 +1377,7 @@ func (a *allocatorState) IsInConflictWithMMA(
13631377
if conflict {
13641378
log.KvDistribution.VEventf(
13651379
ctx, 2,
1366-
"mma rejected candidate s%d (cpu-only) as a replacement for s%d: candidate=%v > existing=%v",
1380+
"mma rejected candidate s%d(cpu-only) as a replacement for s%d: candidate=%v > existing=%v",
13671381
cand, advisor.existingStoreID, candSLS.dimSummary[CPURate], existingSLS.dimSummary[CPURate],
13681382
)
13691383
}

pkg/kv/kvserver/mmaintegration/thrashing.go

Lines changed: 14 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -42,22 +42,23 @@ import (
4242
// Two main design questions came up.
4343
// 1. The first was when to exclude overloaded stores (early before mean
4444
// calculation or late only at target selection).
45-
// - We decided to include them in the mean calculation but exclude them at the
45+
// We decided to include them in the mean calculation but exclude them at the
4646
// final target selection step. This minimizes code churn, avoids plumbing new
4747
// fields into candidate structs, and reduces number of mma calls by checking
4848
// only the final target instead of on every candidate. It does not eliminate
4949
// thrashing, since a store may look like a good candidate during scoring but be
5050
// rejected later, picking a not-so-good but still better than existing
5151
// candidate. The lease queue follows the same rule, filtering overloaded stores
5252
// only at final target selection.
53-
// - Alternatives considered: 1. mma participates in the allocator's scoring
53+
// Alternatives considered: 1. mma participates in the allocator's scoring
5454
// options either by jittering balance score or by introducing a new field in
5555
// the candidate struct. 2. exclude the store right before or right after the
5656
// equivalence class construction.
5757
//
58-
// 2. The second question was which set of stores to use when computing load
59-
// summaries with respect to.
60-
// - The principle we followed is to use the same set of stores that is used to
58+
// 2. The second question is: when MMA computes a store’s load summary, it
59+
// requires a set of stores as a basis. The question is, which set of stores
60+
// should be used?
61+
// • The principle we followed is to use the same set of stores that is used to
6162
// compute the mean for range or lease count. For the replicate queue, this
6263
// means we use all stores that satisfy constraints to compute mean. The
6364
// principle we are following here is that we want this set or the mean to be
@@ -66,7 +67,9 @@ import (
6667
// constraint-satisfying stores. For the lease queue, this means we use all
6768
// stores that satisfy the constraint to compute the lease count mean as well.
6869
// This approach differs from how mma computes load summary for lease transfers
69-
// - mma computes load summary over stores that the existing replicas are on.
70+
// (mma computes load summary over stores that the existing replicas are on).
71+
// • Note that MMARebalanceAdvisor also always include the existing store in the
72+
// set of stores to compute the load summary with respect to.
7073
//
7174
// Alternatives considered:
7275
// 1. Another option was to let mma choose from a set of candidates, but this was
@@ -88,10 +91,11 @@ import (
8891
// for IsInConflictWithMMA. If MMA is enabled, the advisor is created by
8992
// computing the load summary for the provided existing store and candidate set.
9093
//
91-
// Note that MMA continues to use this candidate set to compute load summaries,
92-
// so it is safe for the caller to modify the candidate set after calling this
93-
// function. The caller is responsible for keeping track of the returned advisor
94-
// and associating it with the corresponding existing store.
94+
// Note that MMARebalanceAdvisor should always use the means summary constructed
95+
// during BuildMMARebalanceAdvisor to compute the load summary for the provided
96+
// candidate in IsInConflictWithMMA. The caller may modify its candidate set
97+
// after calling this function, so the caller is responsible for keeping track
98+
// of the returned advisor and associating it.
9599
func (as *AllocatorSync) BuildMMARebalanceAdvisor(
96100
existing roachpb.StoreID, cands []roachpb.StoreID,
97101
) *mmaprototype.MMARebalanceAdvisor {

0 commit comments

Comments
 (0)