150249: mmaprototype: port over prototype improvements r=tbg a=wenyihu6
**mmaprototype: improve multi-dimensional rebalancing logic**
- Improved commentary on sortTargetCandidateSetAndPick.
- Most significant is the reworking of the logic in
clusterState.canShedAndAddLoad to be more aggressive when all stores
are overloaded along different dimensions. The logic is now arguably
more principled than before. It still ensures that the overloaded
dimension does not become worse in the target than in the source
(existing logic), but it removes the aggregate summary logic that
was stopping some rebalancing. Instead, it looks at the individual
resource dimensions (other than the overloaded dimension) and checks
that the fractional increase in each of those dimensions in the target
is significantly smaller than the fractional increase in the overloaded
dimension. This should prevent thrashing, where the same range is
moved back to the source.
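
The per-dimension check described above can be sketched roughly as follows. This is a minimal illustration, not the code in clusterState.canShedAndAddLoad: the names `loadVector`, `slack`, `fractionalIncrease` and `canShedAndAddLoadSketch` are hypothetical, the increase is assumed to be measured relative to the target's current load, stores are assumed to have equal capacity, and the factor standing in for "significantly smaller" is an arbitrary 0.5.

```go
package main

import (
	"fmt"
	"math"
)

// dim indexes a load dimension. Only CPURate and WriteBandwidth are modeled
// here; this two-dimension vector is a simplification for illustration.
type dim int

const (
	cpuRate dim = iota
	writeBandwidth
	numDims
)

// loadVector holds one load value per dimension (hypothetical type).
type loadVector [numDims]float64

// slack is a hypothetical stand-in for "significantly smaller".
const slack = 0.5

// fractionalIncrease returns delta relative to the current load.
func fractionalIncrease(load, delta float64) float64 {
	if delta <= 0 {
		return 0
	}
	if load <= 0 {
		return math.Inf(1)
	}
	return delta / load
}

// canShedAndAddLoadSketch reports whether moving delta from src to target
// looks acceptable when src is overloaded along the given dimension.
func canShedAndAddLoadSketch(src, target, delta loadVector, overloaded dim) bool {
	// The overloaded dimension must not end up worse in the target than in
	// the source (assumes equal capacities and compares post-move loads).
	if target[overloaded]+delta[overloaded] > src[overloaded]-delta[overloaded] {
		return false
	}
	overFrac := fractionalIncrease(target[overloaded], delta[overloaded])
	for d := dim(0); d < numDims; d++ {
		if d == overloaded {
			continue
		}
		// Every other dimension must grow by a much smaller fraction.
		if fractionalIncrease(target[d], delta[d]) > slack*overFrac {
			return false
		}
	}
	return true
}

func main() {
	src := loadVector{cpuRate: 900, writeBandwidth: 100}
	target := loadVector{cpuRate: 400, writeBandwidth: 100}
	small := loadVector{cpuRate: 200, writeBandwidth: 10}
	large := loadVector{cpuRate: 200, writeBandwidth: 40}
	fmt.Println(canShedAndAddLoadSketch(src, target, small, cpuRate)) // true
	fmt.Println(canShedAndAddLoadSketch(src, target, large, cpuRate)) // false
}
```

In the second call the write bandwidth on the target would grow by 40% while CPU grows by 50%, so the move is rejected: a later pass could otherwise see the target as overloaded along WriteBandwidth and move the same range back.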
Some test result changes:
mma_one_voter_skewed_cpu_skewed_write now ends with no store in an
overload state:
[n1s1,t1h28m23s,mmaid=258] 77721 evaluating s1: node load loadNoChange, store load loadNoChange, worst dim CPURate
[n1s1,t1h28m23s,mmaid=258] 77722 evaluating s2: node load loadNormal, store load loadNormal, worst dim CPURate
mma_skewed_cpu_skewed_write_more_ranges converges much faster, even over
the original 60m duration of the simulation (I've increased the duration
to 90m to make it fully converge).
mma_skewed_cpu_skewed_write: Two nodes are overloadSlow along
WriteBandwidth. They can't shed to s2, s5, s6 since those will also
become overloaded along WriteBandwidth while the src will become
underloaded. They don't attempt to shed to s1 since s1 is loadNoChange
along CPU, so it is in a later equivalence class based on aggregate load.
This may be a deficiency of sortTargetCandidateSetAndPick.
[n6s6,t59m59.5s,mmaid=452] 59570 evaluating s2: node load loadNormal, store load loadNormal, worst dim CPURate
[n6s6,t59m59.5s,mmaid=452] 59571 evaluating s3: node load loadNormal, store load overloadSlow, worst dim WriteBandwidth
[n6s6,t59m59.5s,mmaid=452] 59573 evaluating s4: node load loadNormal, store load overloadSlow, worst dim WriteBandwidth
[n6s6,t59m59.5s,mmaid=452] 59575 evaluating s5: node load loadNormal, store load loadNormal, worst dim CPURate
[n6s6,t59m59.5s,mmaid=452] 59576 evaluating s6: node load loadLow, store load loadNormal, worst dim WriteBandwidth
[n6s6,t59m59.5s,mmaid=452] 59577 evaluating s1: node load loadNoChange, store load loadNoChange, worst dim CPURate
Epic: none
Release note: None
---
**mmaprototype: canShedAndAddLoad must not make target overloadUrgent**
Epic: none
Release note: none
---
**mmaprototype: reduce minWriteBandwidthGranularity to 128KiB**
Epic: none
Release note: none
---
**mmaprototype: when ignoreHigherThanLoadThreshold is set, extend beyond first equivalence class in sortTargetCandidateSetAndPick**
This behavior is needed to handle cases where the first equivalence class
has no candidates that can accept the load, but later ones can, because
they have lower load in the overloadedDim.
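
A rough sketch of the fall-through behavior, under stated assumptions: `candidate`, `pickTarget` and the boolean `extendBeyondFirstClass` are hypothetical names (the boolean stands in for ignoreHigherThanLoadThreshold being set), the classes are assumed to arrive already ordered, and picking the lowest load in overloadedDim within a class is a simplification rather than what sortTargetCandidateSetAndPick necessarily does.

```go
package main

import "fmt"

// candidate is a hypothetical target store with its load along the
// overloaded dimension.
type candidate struct {
	storeID           int
	overloadedDimLoad float64
}

// pickTarget walks the equivalence classes in order and returns the first
// acceptable candidate. Without extendBeyondFirstClass it only looks at the
// first class; with it, later classes are tried when earlier ones yield
// nothing.
func pickTarget(
	classes [][]candidate, extendBeyondFirstClass bool, canAccept func(candidate) bool,
) (int, bool) {
	for i, class := range classes {
		if i > 0 && !extendBeyondFirstClass {
			break
		}
		var best candidate
		found := false
		for _, c := range class {
			if !canAccept(c) {
				continue
			}
			// Within a class, prefer the lowest load in the overloaded dimension.
			if !found || c.overloadedDimLoad < best.overloadedDimLoad {
				best, found = c, true
			}
		}
		if found {
			return best.storeID, true
		}
	}
	return 0, false
}

func main() {
	// s2 sorts first on aggregate load but is too loaded in the overloaded
	// dimension; s1 sits in a later class with more room.
	classes := [][]candidate{
		{{storeID: 2, overloadedDimLoad: 0.9}},
		{{storeID: 1, overloadedDimLoad: 0.3}},
	}
	accept := func(c candidate) bool { return c.overloadedDimLoad < 0.8 }
	fmt.Println(pickTarget(classes, false, accept))
	fmt.Println(pickTarget(classes, true, accept))
}
```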
Epic: none
Release note: none
---
**mmaprototype: fix bug in sortTargetCandidateSetAndPick**
The intention of the code was to be structured around sets representing
equivalence classes, such that a later set is only considered if none
of the earlier sets had a member that was discarded and had pending
changes. Prior to this change, this criterion was applied arbitrarily in
the middle of a set. So if two stores {s1, s2} were in the same set and
s1 was discarded and had pending changes, we would not consider s2.
Now we include s2 and stop when the next set starts.
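
The corrected stopping rule can be illustrated with a small sketch. The `candidate` fields and the `eligibleStores` helper are hypothetical and only model the set-boundary behavior described above, assuming candidates arrive sorted by an integer equivalence class.

```go
package main

import "fmt"

// candidate is a hypothetical view of a store in the sorted candidate list.
type candidate struct {
	storeID           int
	class             int // equivalence class; the slice is sorted by this
	discarded         bool
	hasPendingChanges bool
}

// eligibleStores returns the stores that remain under consideration. A
// discarded member with pending changes blocks later classes, but the rest of
// its own class is still scanned.
func eligibleStores(sorted []candidate) []int {
	var out []int
	stopAtNextClass := false
	for i, c := range sorted {
		startsNewClass := i > 0 && c.class != sorted[i-1].class
		if startsNewClass && stopAtNextClass {
			break
		}
		if c.discarded {
			if c.hasPendingChanges {
				// Record the stop, but apply it only at the class boundary.
				stopAtNextClass = true
			}
			continue
		}
		out = append(out, c.storeID)
	}
	return out
}

func main() {
	sorted := []candidate{
		{storeID: 1, class: 0, discarded: true, hasPendingChanges: true},
		{storeID: 2, class: 0},
		{storeID: 3, class: 1},
	}
	// s2 shares a class with s1 and is kept; s3 is in the next class and is not.
	fmt.Println(eligibleStores(sorted)) // [2]
}
```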
Epic: none
Release note: none
Co-authored-by: sumeerbhola <[email protected]>