mmaintegration: add StoreStatus to mma status translation plumbing #160623
Conversation
Potential Bug(s) Detected: the three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation. Note: when viewing the workflow output, scroll to the bottom to find the Final Analysis Summary. After you review the findings, please tag the issue accordingly.
sumeerbhola
left a comment
@sumeerbhola reviewed 11 files and all commit messages, and made 8 comments.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg and @wenyihu6).
pkg/kv/kvserver/asim/tests/testdata/non_rand/mma/store_status_shedding.txt line 39 at r1 (raw file):
# Assert s4 (dead) should have 0 replicas.
assertion type=stat stat=replicas ticks=6 exact_bound=0 stores=(4)
is this assertion evaluated after the 6m eval below?
pkg/kv/kvserver/asim/tests/testdata/non_rand/mma/store_status_shedding.txt line 48 at r1 (raw file):
leases#1: thrash_pct: [s1=0%, s2=0%, s3=0%, s4=0%] (sum=0%)
replicas#1: first: [s1=40, s2=40, s3=40, s4=0] (stddev=17.32, mean=30.00, sum=120)
replicas#1: last: [s1=35, s2=40, s3=40, s4=5] (stddev=14.58, mean=30.00, sum=120)
why does s4 have 5 replicas?
never mind: I see this is fixed by the second commit
pkg/kv/kvserver/allocator/mmaprototype/allocator.go line 65 at r1 (raw file):
// UpdateStoreStatus updates the health and disposition for the stores in storeStatuses according to the statuses in storeStatuses.
// Stores not known to the allocator are ignored with logging.
// TODO(wenyihu6): if this is too expensive, we should only update status for stores that have changed.
why would this be expensive given it is just setting some struct fields?
Regarding expense, I suspect we'll get very far with removing all the memory allocations we have scattered all over the MMA code (or integration code, like the map allocation in StorePool.GetStoreStatuses), some with todos, and many without.
pkg/kv/kvserver/allocator/mmaprototype/allocator.go line 66 at r1 (raw file):
// Stores not known to the allocator are ignored with logging.
// TODO(wenyihu6): if this is too expensive, we should only update status for stores that have changed.
UpdateStoreStatus(ctx context.Context, storeStatuses map[roachpb.StoreID]Status)
nit: plural would be better, UpdateStoresStatus
pkg/kv/kvserver/mmaintegration/store_status.go line 31 at r1 (raw file):
| Dead            | HealthDead | Shedding | Shedding | Store is gone: shed everything (matches SMA)                                        |
| Decommissioning | HealthOK   | Shedding | Shedding | Store is leaving cluster: shed everything (SMA allows leases, MMA more aggressive) |
So when does SMA start shedding leases for decommissioning nodes/stores?
pkg/kv/kvserver/mmaintegration/store_status.go line 32 at r1 (raw file):
| Dead            | HealthDead    | Shedding | Shedding | Store is gone: shed everything (matches SMA)                                                  |
| Decommissioning | HealthOK      | Shedding | Shedding | Store is leaving cluster: shed everything (SMA allows leases, MMA more aggressive)           |
| Unknown         | HealthUnknown | Refusing | Refusing | State is unknown: don't add but don't remove either (SMA sheds leases, MMA more conservative) |
Should MMA also be shedding leases?
pkg/kv/kvserver/mmaintegration/store_status.go line 62 at r1 (raw file):
	mmaprototype.ReplicaDispositionShedding,
)
case storepool.StoreStatusUnknown:
nit: the ordering of cases doesn't match the one in the table above.
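For concreteness, here is a minimal Go sketch of the translation switch that the quoted table and code describe, with the cases in table order. The type and constant shapes below are stand-ins, not the actual storepool/mmaprototype definitions; only the names visible in this PR (the Health* values, the Shedding/Refusing dispositions, StoreStatusUnknown, and the panic on unhandled statuses mentioned later in the thread) are taken from it.

```go
package main

import "fmt"

// Stand-ins for the storepool / mmaprototype types under discussion.
type StoreStatus int

const (
	StoreStatusDead StoreStatus = iota
	StoreStatusDecommissioning
	StoreStatusUnknown
)

type Health int

const (
	HealthDead Health = iota
	HealthOK
	HealthUnknown
)

type Disposition int

const (
	DispositionShedding Disposition = iota
	DispositionRefusing
)

type Status struct {
	Health             Health
	LeaseDisposition   Disposition
	ReplicaDisposition Disposition
}

// translate mirrors the table rows quoted above, with the cases in table order.
func translate(s StoreStatus) Status {
	switch s {
	case StoreStatusDead:
		// Store is gone: shed everything (matches SMA).
		return Status{HealthDead, DispositionShedding, DispositionShedding}
	case StoreStatusDecommissioning:
		// Store is leaving the cluster: shed everything.
		return Status{HealthOK, DispositionShedding, DispositionShedding}
	case StoreStatusUnknown:
		// State is unknown: don't add, but don't remove either.
		return Status{HealthUnknown, DispositionRefusing, DispositionRefusing}
	default:
		// The thread below notes panics were added for unhandled statuses.
		panic(fmt.Sprintf("unhandled store status: %d", s))
	}
}

func main() {
	// Decommissioning translates to (HealthOK, Shedding, Shedding).
	fmt.Printf("%+v\n", translate(StoreStatusDecommissioning))
}
```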
wenyihu6
left a comment
@wenyihu6 made 6 comments and resolved 1 discussion.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola and @tbg).
pkg/kv/kvserver/allocator/mmaprototype/allocator.go line 65 at r1 (raw file):
Previously, sumeerbhola wrote…
why would this be expensive given it is just setting some struct fields?
Regarding expense, I suspect we'll get very far with removing all the memory allocations we have scattered all over the MMA code (or integration code, like the map allocation in StorePool.GetStoreStatuses), some with todos, and many without.
Makes sense; I removed the TODO. I was mostly concerned about the map allocation, along with calling into the store pool every minute (and potentially more often once we move away from the replicate and lease queues entirely).
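On the allocation point, here is a minimal sketch of the reuse pattern that would avoid the per-tick map allocation, assuming a hypothetical getStatusesInto helper; the real StorePool.GetStoreStatuses signature is not shown in this PR.

```go
package main

// StoreID, Status, and pool are stand-ins, not the real storepool types.
type StoreID int32

type Status int

type pool struct {
	statuses map[StoreID]Status
}

// getStatusesInto refills out in place, so steady-state calls allocate
// nothing (assuming the store set stays roughly the same size).
func (p *pool) getStatusesInto(out map[StoreID]Status) {
	clear(out) // Go 1.21+: empties the map while keeping its buckets
	for id, s := range p.statuses {
		out[id] = s
	}
}

func main() {
	p := &pool{statuses: map[StoreID]Status{1: 0, 2: 1, 3: 0}}
	buf := make(map[StoreID]Status, len(p.statuses)) // allocated once
	for tick := 0; tick < 3; tick++ {                // e.g. the per-minute refresh
		p.getStatusesInto(buf)
		_ = buf // would be passed to UpdateStoresStatus in the real flow
	}
}
```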
pkg/kv/kvserver/allocator/mmaprototype/allocator.go line 66 at r1 (raw file):
Previously, sumeerbhola wrote…
nit: plural would be better, UpdateStoresStatus
Renamed.
pkg/kv/kvserver/mmaintegration/store_status.go line 31 at r1 (raw file):
Previously, sumeerbhola wrote…
So when does SMA starts shedding leases for decommissioning nodes/stores?
I believe they handle that indirectly by removing replicas (the lease queue does not actively shed leases). For these two cases where SMA behaves differently, I'm not entirely sure what the best approach is; I just went with my intuition.
pkg/kv/kvserver/mmaintegration/store_status.go line 32 at r1 (raw file):
Previously, sumeerbhola wrote…
Should MMA also be shedding leases?
Ditto: for these two cases where SMA behaves differently, I'm not entirely sure what the best approach is; I just went with my intuition. We could change them.
pkg/kv/kvserver/mmaintegration/store_status.go line 62 at r1 (raw file):
Previously, sumeerbhola wrote…
nit: the ordering of cases doesn't match the one in the table above.
Done.
pkg/kv/kvserver/asim/tests/testdata/non_rand/mma/store_status_shedding.txt line 39 at r1 (raw file):
Previously, sumeerbhola wrote…
is this assertion evaluated after the 6m eval below?
Yes, it fires after the evaluation.
|
@tbg re: comments on the other PR
Correct, disposition was right. I mostly followed what SMA already does. If the only remaining target for upreplication is draining, it would be handled by the replicate queue but not by mma as of now. But the replicate queue wouldn't do that for upreplication, given isStoreReadyForRoutineReplicaTransfer filters out draining stores (see the sketch below).
Yes, you are right. I updated the comment to include the definition.
Removed.
Good call, added panics.
Good catch, removed.
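For illustration, a hypothetical sketch of that draining-store filtering. isStoreReadyForRoutineReplicaTransfer is the real function name mentioned above, but its signature isn't shown here, so everything below is a stand-in.

```go
package main

import "fmt"

type StoreID int32

// filterRoutineTransferTargets drops draining stores from the candidate set,
// so routine replica transfers (including upreplication) never target them.
func filterRoutineTransferTargets(cands []StoreID, isDraining func(StoreID) bool) []StoreID {
	out := cands[:0] // filter in place, reusing the backing array
	for _, c := range cands {
		if !isDraining(c) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	draining := map[StoreID]bool{2: true}
	cands := []StoreID{1, 2, 3}
	fmt.Println(filterRoutineTransferTargets(cands, func(s StoreID) bool { return draining[s] }))
	// Prints [1 3]: the draining store s2 is excluded.
}
```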
I’ll go ahead and merge this. The translation table is still up for debate, so we can revisit it as needed. TFTRs!
sumeerbhola
left a comment
@sumeerbhola made 2 comments.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg and @wenyihu6).
pkg/kv/kvserver/mmaintegration/store_status.go line 31 at r1 (raw file):
Previously, wenyihu6 (Wenyi Hu) wrote…
I believe they handle that indirectly by removing replicas (lease queue does not actively shed lease). For these two cases where SMA behaves differently, I’m not entirely sure what the best approach is. I just went with my intuition.
I see -- it won't first shed leases since it has to shed all replicas, and as part of shedding a replica it can transfer the lease. And that may mean a better location for the lease. Additionally, while it has a replica, it may as well accept a lease if it is the best location, since it is healthy.
It seems to me that the SMA behavior makes sense, and we should change it to (OK, Shedding).
pkg/kv/kvserver/mmaintegration/store_status.go line 32 at r1 (raw file):
Previously, wenyihu6 (Wenyi Hu) wrote…
Ditto - for these two cases where SMA behaves differently, I’m not entirely sure what the best approach is. I just went with my intuition. We could change them.
Do we see production issues where there are false positive transitions to Unknown? If not, shedding leases seems prudent for availability, given they are cheap to move.
My comments can be addressed in a subsequent PR.
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating backport branch refs/heads/blathers/backport-release-26.1-160623: POST https://api.github.com/repos/wenyihu6/cockroach/git/refs: 403 Resource not accessible by integration []
Backport to branch 26.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Part of: #156776
Epic: CRDB-55052
mmaintegration: add StoreStatus to mma status translation plumbing
This commit adds the translation layer from StoreStatus to mma's (Health,
Disposition) model.
sma currently relies on StorePool methods (GetStoreList, LiveAndDeadReplicas),
which internally compute StoreStatus from NodeLivenessFunc (membership +
health) combined with other signals (throttling, suspect state).
To preserve sma's behavior, mma reuses StorePool's status() method and
translates it to its own (Health, Disposition) model rather than re-deriving
health independently.
Alternatives considered (and rejected):
1. Query NodeLiveness directly in mma: NodeLiveness operates at the node
   level, while StorePool tracks per-store state, so store status on the same
   node can diverge based on gossip timing and store-specific signals. In
   addition, NodeLiveness does not include other store signals such as
   throttling (snapshot backpressure) and suspect status (recently
   unavailable), which are currently used by sma to filter candidates when
   making lease/replica placement decisions.
2. Periodically poll storepool from mma: instead, statuses are plumbed before
   ComputeChanges() rather than refreshed periodically in another goroutine.
   Polling is more complex, may be stale, and is less efficient; mma currently
   only needs updated health statuses for ComputeChanges.
Note that the translation goes through allocator sync, not directly in
mmaprototype, to avoid importing storepool there and keep layering clean.
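For illustration, a library-style Go sketch of the call order this commit describes, using stand-in types. UpdateStoresStatus, ComputeChanges, and GetStoreStatuses are named in this PR; everything else below is an assumption.

```go
package allocatorsync

import "context"

type StoreID int32

// StoreStatus is the storepool-level status; Status is mma's
// (Health, Disposition) model. Both are stand-ins here.
type StoreStatus int

type Status struct{ health, leaseDisp, replicaDisp int }

type storePool interface {
	GetStoreStatuses() map[StoreID]StoreStatus
}

type mmaAllocator interface {
	UpdateStoresStatus(ctx context.Context, statuses map[StoreID]Status)
	ComputeChanges(ctx context.Context) []string // change type elided
}

// translate stands in for the StoreStatus -> (Health, Disposition) table.
func translate(in map[StoreID]StoreStatus) map[StoreID]Status {
	out := make(map[StoreID]Status, len(in))
	for id, s := range in {
		out[id] = Status{health: int(s)}
	}
	return out
}

// computeChanges is the allocator-sync step: statuses are refreshed and
// translated immediately before ComputeChanges, rather than polled in a
// separate goroutine, so mma always sees fresh health and dispositions.
func computeChanges(ctx context.Context, sp storePool, mma mmaAllocator) []string {
	mma.UpdateStoresStatus(ctx, translate(sp.GetStoreStatuses()))
	return mma.ComputeChanges(ctx)
}
```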
asim: make store rebalancer refresh store status
This commit updates asim's mma store rebalancer to refresh store status before
calling ComputeChanges(), matching production behavior.
mmaprototype: rename from UpdateStoreStatus to UpdateStoresStatuses