Commit af3267c

mmaprototype: comment on the lifecycle of pending changes
1 parent 49cbc69 commit af3267c

1 file changed: +63 -2 lines changed

pkg/kv/kvserver/allocator/mmaprototype/cluster_state.go

Lines changed: 63 additions & 2 deletions
@@ -861,7 +861,62 @@ type rangeState struct {
 // what is pending also allows for undo in the case of explicit failure,
 // notified by AdjustPendingChangesDisposition.
 //
-// 2. Modeling
+// 2. Lifecycle
+// pendingChanges track proposed modifications to a range's replicas or
+// leaseholder that are not yet reflected in the leaseholder's authoritative
+// state. They are created by three sources: range rebalances, lease transfers
+// originating from MMA, and external changes via RegisterExternalChanges
+// (replicate or lease queue). There exists a pending change in a range state
+// iff there is also a corresponding one in clusterState's pendingChanges.
+//
+// A pending change is removed from tracking in one of three ways:
+// 1. Marked as enacted successfully: the pending change is removed. The
+// adjusted load remains until processStoreLoadMsg determines the change is
+// reflected in the latest store load message, based on whether
+// lagForChangeReflectedInLoad has elapsed since enactment.
+//
+// This happens when:
+// - The pending change is successfully applied via
+// AdjustPendingChangesDisposition(success).
+// - The pending change is considered subsumed based on the leaseholder msg.
+// - The leaseholder of the range has changed. This is a special case where
+// the leaseholder of the range has moved to a different store, and the
+// rangeMsg no longer contains the range. We assume that the pending change
+// has been enacted in this case.
+//
+// 2. Undone as failed: the corresponding replica and load changes are rolled
+// back. Note that for replica changes that originate from one action, all
+// changes are undone together.
+// NB: pending changes of a range state originate from one decision.
+// Therefore, when one pending change is enacted successfully, we mark this
+// range state's pending changes as no rollback (read more about this in 3).
+// If we are here trying to undo a pending change but the range state has
+// already been marked as no rollback, we do not undo the remaining pending
+// changes. Instead, we wait for a StoreLeaseholderMsg to discard the pending
+// changes and revert the load adjustments after the
+// partiallyEnactedGCDuration has elapsed since the first enacted change. The
+// modeling here is imperfect (read more about this in 3).
+//
+// This happens when:
+// - The pending change failed to apply via
+// AdjustPendingChangesDisposition(failed).
+// - The pending change is garbage collected once pendingChangeGCDuration has
+// elapsed since its creation.
+//
+// 3. Dropped due to incompatibility: MMA creates these pending changes while
+// working with an earlier authoritative leaseholder message. These changes
+// remain valid until a new authoritative message arrives that may reflect a
+// conflicting state. See preCheckOnApplyReplicaChanges for details on how
+// compatibility between the pending change and the new range state is
+// determined. When incompatibility is detected, the pending replica change is
+// discarded and the corresponding load adjustments are rolled back.
+//
+// This happens when:
+// - processStoreLeaseholderMsgInternal tries to apply the pending changes to
+// the received range state from the new leaseholder msg, but the pending
+// changes are incompatible with the new range state.
+//
+// 3. Modeling
 //
 // The slice of pendingChanges represent one decision. However, this
 // decision is not always executed atomically by the external system.
@@ -910,7 +965,7 @@ type rangeState struct {
 // added. This is unavoidable and will be fixed by the first
 // StoreLeaseholderMsg post-GC.
 //
-// 3. Non Atomicity Hazard
+// 4. Non Atomicity Hazard
 //
 // Since a decision is represented with multiple pending changes, and we
 // allow for individual changes to be considered enacted or failed, we have
@@ -1613,6 +1668,12 @@ func (cs *clusterState) gcPendingChanges(now time.Time) {
 if !ok {
 	panic(errors.AssertionFailedf("range %v not found in cluster state", rangeID))
 }
+
+// Unlike normal GC that reverts changes, we want to discard these pending
+// changes. Do nothing here; processStoreLeaseholderMsgInternal will later
+// detect and discard these pending changes. Note that
+// processStoreLeaseholderMsgInternal will not revert the pending load
+// change.
 if rs.pendingChangeNoRollback {
 	continue
 }
