Skip to content

Conversation

@wenyihu6
Copy link
Contributor

@wenyihu6 wenyihu6 commented Jan 5, 2026

Part of #156776
Epic: CRDB-55052


storepool: export GetStoreStatuses for mma health signal plumbing

Previously, storepool's StoreStatus was not exported and only used internally by
storepool. This commit exports StoreStatus so that mma integration can consume
the same store-level signals that SMA relies and translate them into mma's
(Health, Disposition) model.

An alternative approach would be to have store pool handle the translation of
the store status to mma's (Health, Disposition) model. But this would couple
storepool to mma and blur ownership; storepool should remain as the source of
truth for store status. MMA owns the interpretation.


mmaintegration: add StoreStatus to mma status translation plumbing

This commit adds the translation layer from StoreStatus to mma's (Health,
Disposition) model.

sma currently relies on StorePool methods (GetStoreList, LiveAndDeadReplicas)
which internally compute StoreStatus using: NodeLivenessFunc (membership +
health) combined with other signals (throttling, suspect state) to determine
StoreStatus.

To preserve sma's behavior, mma reuses StorePool's status() method and
translates it to its own (Health, Disposition) model rather than re-deriving
health independently.

Alternatives considered (and rejected):

  1. Query NodeLiveness directly in mma: NodeLiveness operates at the node level,
    while StorePool tracks per-store state so store status on the same node can
    diverge based on gossip timing and store specific signals. In addition,
    NodeLiveness does not include other store signals such as throttling
    (snapshot backpressure) and suspect status (recently unavailable) which are
    currently used by sma to filter candidates when making lease/replica
    placement decisions.

  2. Periodically poll storepool from mma Statuses are plumbed before
    ComputeChanges() instead of periodically in another goroutine.It is more
    complex, may be stale and less efficient. mma currently only needs updated
    health statuses for ComputeChanges.

Note that the translation goes through allocator sync, not directly in
mmaprototype, to avoid importing storepool there and keep layering clean.

The translation follows this mapping:

StorePool Status MMA Health Lease Disposition Replica Disposition Rationale
Dead HealthDead Shedding Shedding Store is gone: shed everything
Unknown HealthUnknown Refusing Refusing State is unknown: don't add but don't remove either
Decommissioning HealthOK Shedding Shedding Store is leaving cluster: shed everything
Draining HealthOK Shedding Refusing Store is draining: shed leases, accept replicas
Throttled HealthOK OK Refusing Healthy but overlpaded: accept leases but not replicas
Suspect HealthUnhealthy Shedding Refusing Recently unavailable: shed leases for safety and don't accept replicas
Available HealthOK OK OK Healthy store: accept all

asim: make store rebalancer refresh store status

This commit updates asis's mma store rebalancer to refresh store status before
calling ComputeChanges(), matching production behavior.

Previously, storepool's StoreStatus was not exported and only used internally by
storepool. This commit exports StoreStatus so that mma integration can consume
the same store-level signals that SMA relies and translate them into mma's
(Health, Disposition) model.

An alternative approach would be to have store pool handle the translation of
the store status to mma's (Health, Disposition) model. But this would couple
storepool to mma and blur ownership; storepool should remain as the source of
truth for store status. MMA owns the interpretation.
@wenyihu6 wenyihu6 changed the title storepool: export GetStoreStatuses for mma health signal plumbing mmaintegration: add StoreStatus to mma status translation plumbing Jan 5, 2026
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@wenyihu6
Copy link
Contributor Author

wenyihu6 commented Jan 5, 2026

@tbg still needs unit test but putting this up to check on the high-level design

@wenyihu6 wenyihu6 requested a review from tbg January 5, 2026 14:55
This commit adds the translation layer from StoreStatus to mma's (Health,
Disposition) model.

sma currently relies on StorePool methods (GetStoreList, LiveAndDeadReplicas)
which internally compute StoreStatus using: NodeLivenessFunc (membership +
health) combined with other signals (throttling, suspect state) to determine
StoreStatus.

To preserve sma's behavior, mma reuses StorePool's status() method and
translates it to its own (Health, Disposition) model rather than re-deriving
health independently.

Alternatives considered (and rejected):
1. Query NodeLiveness directly in mma: NodeLiveness operates at the node level,
   while StorePool tracks per-store state so store status on the same node can
   diverge based on gossip timing and store specific signals. In addition,
   NodeLiveness does not include other store signals such as throttling
   (snapshot backpressure) and suspect status (recently unavailable) which are
   currently used by sma to filter candidates when making lease/replica
   placement decisions.

2. Periodically poll storepool from mma Statuses are plumbed before
   ComputeChanges() instead of periodically in another goroutine.It is more
   complex, may be stale and less efficient. mma currently only needs updated
   health statuses for ComputeChanges.

Note that the translation goes through allocator sync, not directly in
mmaprototype, to avoid importing storepool there and keep layering clean.

The translation follows this mapping:
| StorePool Status | MMA Health      | Lease Disposition | Replica Disposition | Rationale                                                             |
|------------------|-----------------|-------------------|---------------------|-----------------------------------------------------------------------|
| Dead             | HealthDead      | Shedding          | Shedding            | Store is gone: shed everything                                        |
| Unknown          | HealthUnknown   | Refusing          | Refusing            | State is unknown: don't add but don't remove either                   |
| Decommissioning  | HealthOK        | Shedding          | Shedding            | Store is leaving cluster: shed everything                             |
| Draining         | HealthOK        | Shedding          | Refusing            | Store is draining: shed leases, accept replicas                       |
| Throttled        | HealthOK        | OK                | Refusing            | Healthy but overlpaded: accept leases but not replicas                |
| Suspect          | HealthUnhealthy | Shedding          | Refusing            | Recently unavailable: shed leases for safety and don't accept replicas|
| Available        | HealthOK        | OK                | OK                  | Healthy store: accept all                                             |
This commit updates asis's mma store rebalancer to refresh store status before
calling ComputeChanges(), matching production behavior.
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@tbg reviewed 12 files and all commit messages, and made 7 comments.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @wenyihu6).


-- commits line 52 at r2:
The rationale is inconsistent with the replica disposition. I'm guessing the Disposition is correct? When a store is draining, it shouldn't accept replicas. Worth thinking through the case in which a store that is the only remaining target for upreplication is draining: it still doesn't make sense to place a replica there I suppose.


-- commits line 53 at r2:
Worth double checking, but my memory is that "throttled" is when a store refuses a snapshot, right? Or is this triggered for other reasons as well? The "Healthy but overloaded" description doesn't really capture this. Let's call out specifically what "throttled" means here and also in the actual code where the translation happens.


-- commits line 54 at r2:
Isn't shedding too aggressive? Remind me how a store gets into "suspect" state according to the store pool?


pkg/kv/kvserver/mmaintegration/store_status.go line 14 at r2 (raw file):

// translateStorePoolStatusToMMA translates a StorePool status to MMA's (health,
// disposition) model.

Reminder to update this table with the one in the PR description (I'm okay with removing the one from the PR and just keeping this one as the source of truth)


pkg/kv/kvserver/mmaintegration/store_status.go line 70 at r2 (raw file):

		)
	default:
		// Unknown status - treat as unavailable.

Panic, at least in crdb test build?


pkg/kv/kvserver/allocator/storepool/store_pool.go line 899 at r2 (raw file):

}

// GetStoreStatus returns the store status for the given store ID.

Is this used?

@wenyihu6
Copy link
Contributor Author

wenyihu6 commented Jan 6, 2026

Going to split this pr up - first one here #160555.

@wenyihu6
Copy link
Contributor Author

wenyihu6 commented Jan 7, 2026

TFTR! I will reply these in the new PR #160623.

@wenyihu6
Copy link
Contributor Author

wenyihu6 commented Jan 7, 2026

Closing in favor of #160623.

@wenyihu6 wenyihu6 closed this Jan 7, 2026
@wenyihu6 wenyihu6 deleted the newhealth branch January 8, 2026 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants