 # "accidentally" fixing the write imbalance (after CPU-based rebalancing
 # has created a count-imbalance).
 #
-# Expected outcome: MMA should rebalance both CPU and write load across all
-# stores by identifying and moving the hot ranges, regardless of the cold
-# range distribution. SMA can deliberately balance CPU, but has no visibility
-# into write load and so can't equalize on that dimension. Because CPU and
-# writes are coupled (via raft processing), SMA ends up thrashing — repeatedly
-# balancing CPU only to have write redistribution destabilize it again. This
-# is reflected in the high thrash percentages for sma-count (~2300% CPU,
-# ~5500% writes) vs mma-only (~150% CPU, ~240% writes).
+# Expected outcome:
+#
+# MMA balances both CPU and writes intentionally, reaching a stable equilibrium
+# where both metrics are evenly distributed. It moves the hot ranges (both read
+# and write) to achieve this, ignoring the cold ranges.
+#
+# SMA (sma-count) only optimizes for CPU and has no visibility into write load.
+# It balances CPU but leaves writes heavily skewed on s4-s6. The high thrash
+# (~1500% CPU) comes from the store rebalancer (optimizing CPU) and the
+# replicate/lease queues (optimizing counts) working against each other.
 gen_cluster nodes=6 node_cpu_cores=5
 ----
 
@@ -36,9 +38,10 @@ gen_load rate=1000 rw_ratio=1.0 request_cpu_per_access=5000000 min_key=1 max_key
 5.00 access-vcpus
 
 # Cold ranges co-located with the hot read ranges on s1-s3.
-# These have no load but contribute to range count, making count-based
-# rebalancing see s1-s3 as having "too many" ranges even though the load
-# problem is actually the opposite (s1-s3 need to shed CPU load, not ranges).
+#
+# These raise the absolute threshold for count-based rebalancing, preventing
+# small movements from triggering fixes that might accidentally redistribute
+# the hot ranges and fix IO imbalance.
 gen_ranges ranges=100 min_key=20001 max_key=30000 placement_type=replica_placement
 {s1,s2,s3}:1
 ----
@@ -50,13 +53,14 @@ gen_ranges ranges=36 min_key=10001 max_key=20000 placement_type=replica_placemen
 ----
 {s4:*,s5,s6}:1
 
+# raft_cpu_per_write=1 makes writes contribute negligible CPU. This isolates the
+# two dimensions: s1-s3 are CPU-hot (reads), and s4-s6 are write-hot but
+# CPU-cold.
 gen_load rate=20000 rw_ratio=0 min_block=1000 max_block=1000 raft_cpu_per_write=1 min_key=10001 max_key=20000
 ----
 0.00 raft-vcpus, 19 MiB/s goodput
 
-# Cold ranges co-located with the hot write ranges on s4-s6.
-# Same principle: these pad the range count on s4-s6, obscuring the real
-# problem (write concentration) from count-based rebalancing.
+# Cold ranges co-located with the hot write ranges on s4-s6 (same principle).
 gen_ranges ranges=100 min_key=30001 max_key=40000 placement_type=replica_placement
 {s4,s5,s6}:1
 ----
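
The load figures echoed after each gen_load command can be checked by hand, which also shows why raft_cpu_per_write=1 leaves s4-s6 CPU-cold. The sketch below assumes that request_cpu_per_access and raft_cpu_per_write are nanoseconds of CPU per request and that min_block/max_block are bytes. Those units are inferred from the printed output rather than stated in the file, and the helper names are hypothetical.

```python
NANOS_PER_SECOND = 1_000_000_000
BYTES_PER_MIB = 1024 * 1024


def vcpus(rate_per_sec, cpu_nanos_per_op):
    """CPU-seconds consumed per wall-clock second (assumes nanosecond costs)."""
    return rate_per_sec * cpu_nanos_per_op / NANOS_PER_SECOND


def mib_per_sec(rate_per_sec, block_bytes):
    """Write throughput in MiB/s (assumes block sizes are in bytes)."""
    return rate_per_sec * block_bytes / BYTES_PER_MIB


# Read workload: rate=1000, request_cpu_per_access=5000000
read_vcpus = vcpus(1000, 5_000_000)

# Write workload: rate=20000, raft_cpu_per_write=1, min_block=max_block=1000
raft_vcpus = vcpus(20000, 1)
write_mib = mib_per_sec(20000, 1000)

print(f"{read_vcpus:.2f} access-vcpus")
print(f"{raft_vcpus:.2f} raft-vcpus, {write_mib:.0f} MiB/s goodput")
```

Under these assumptions the write workload costs about 0.00002 vCPUs, which rounds to the 0.00 shown above, while still producing roughly 19 MiB/s of writes. That is the separation of the CPU and write dimensions that the added comment describes.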