Draft local planning baseline for M9-T08 / issue #85.
This document narrows the next implementation slice before code work starts. It is not a new authoritative API surface; it is the execution plan for bringing the accepted revoke/reclaim semantics into the current trusted-core implementation.
M9-T07 established fencing and stale-holder rejection. The next kernel step is to withdraw
holder authority explicitly without permitting early reuse.
The implementation question for M9-T08 is narrower than the full lease-model transition:
- how revoke enters the current core cleanly
- how reclaim becomes the only point where reuse is allowed
- how to keep the reservation-era compatibility surface from drifting away from the accepted lease-centric semantics
Implement the minimum revoke/reclaim behavior needed to preserve the late-not-early reuse rule in
the current execution path, including the already-approved crash/retry/failover safety contract for
M9-T08.
For this slice, "done" means:
- the core can log and apply `revoke` and `reclaim`
- stale holders lose authority as soon as revoke commits
- resources remain unavailable until reclaim commits
- exact retries stay deterministic
M9-T08 should include:
- one explicit revoke command in the trusted core
- one explicit reclaim command in the trusted core
- one live non-terminal state for revoked-but-not-yet-reusable ownership
- one terminal revoked outcome that preserves history after reclaim
- the minimum executor, persistence, and replay plumbing needed so committed revoke/reclaim outcomes survive live apply, restart, and the existing failover contract
- invariant, negative-path, retry, and crash-recovery tests for the new safety rule
M9-T08 should not expand into:
- new replication protocol design, failover refactors, or replicated-surface expansion beyond what is needed to preserve committed revoke/reclaim outcomes under the existing path
- WAL or snapshot reshaping beyond the exact command/state support required for revoke/reclaim
- broader public API and transport cleanup beyond the narrow compatibility bridge already required by this slice
- heartbeat ingestion or wall-clock reclaim logic inside the state machine
- policy reasons or operator metadata attached to revoke/reclaim
- holder transfer or shared-resource semantics
Those belong to later slices, primarily M9-T09 through M9-T11.
The accepted model is lease-centric, but the current implementation is still reservation-centric in spelling and data layout.
For M9-T08, that bridge is allowed under one rule:
- reservation-era names may remain temporarily, but revoke/reclaim behavior must match the authoritative lease semantics exactly
That means:
- the current `reservation_id` may continue to serve as the implementation anchor for `lease_id`
- the existing `confirmed` state may continue as the compatibility spelling for authoritative `active`
- the slice must not introduce reservation-era shortcuts that would be invalid in the final lease model
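One way to keep the bridge from drifting is to route every reservation-era spelling through a single translation point. The sketch below is illustrative only: `COMPAT_TO_LEASE_STATE`, `lease_state_for`, and `lease_id_for` are hypothetical names, and only the `confirmed` to `active` mapping and the `reservation_id` anchor come from this plan. The remaining table entries assume the other state names are already shared between the two spellings.

```python
# Hypothetical compatibility table. Only confirmed -> active is stated in this plan;
# the identity entries are an assumption about the current implementation.
COMPAT_TO_LEASE_STATE = {
    "reserved": "reserved",
    "confirmed": "active",  # reservation-era spelling for authoritative active
    "revoking": "revoking",
    "released": "released",
    "expired": "expired",
    "revoked": "revoked",
}


def lease_state_for(compat_state: str) -> str:
    """Translate a stored reservation-era state into its lease-model name."""
    try:
        return COMPAT_TO_LEASE_STATE[compat_state]
    except KeyError:
        # Unknown spellings are rejected rather than guessed.
        raise ValueError(f"unknown compatibility state: {compat_state!r}")


def lease_id_for(reservation_id: int) -> int:
    """In this slice the reservation_id doubles as the lease_id anchor."""
    return reservation_id
```

Keeping the mapping in one table means a reservation-era shortcut has nowhere to hide: any state the lease model does not define fails loudly at translation time.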
Implementation intent: `revoke(lease_id)`
Current compatibility spelling may still route this through the reservation-era implementation, but the effect must be:
- precondition: live lease exists and is currently `active`
- success: lease moves to `revoking`
- success: `lease_epoch` increments immediately
- success: member resources stay unavailable and keep pointing at the same live owner
- success: resource state becomes `revoking`
- success: no retirement is scheduled yet
Failure behavior:
- `lease_not_found` if the lease never existed
- `lease_retired` if retained history says the live record is already gone
- `invalid_state` for `reserved`, `revoking`, `released`, `expired`, or `revoked`
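The revoke effect and its failure codes can be sketched as a single apply function. This is a minimal model under stated assumptions, not the trusted-core implementation: `Lease`, `Core`, the `retired` set, and the `"ok"` success value are hypothetical, and states use the lease-model spelling rather than the stored `confirmed` spelling.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    lease_id: int
    state: str        # lease-model spelling (assumed already translated)
    lease_epoch: int
    resources: list


@dataclass
class Core:
    leases: dict = field(default_factory=dict)          # live lease records
    retired: set = field(default_factory=set)           # ids whose live record is gone
    resource_state: dict = field(default_factory=dict)  # resource -> state
    resource_owner: dict = field(default_factory=dict)  # resource -> owning lease_id


def revoke(core: Core, lease_id: int) -> str:
    lease = core.leases.get(lease_id)
    if lease is None:
        # Retained history distinguishes "already gone" from "never existed".
        return "lease_retired" if lease_id in core.retired else "lease_not_found"
    if lease.state != "active":
        return "invalid_state"
    lease.state = "revoking"
    lease.lease_epoch += 1  # old-epoch holders are fenced from this point on
    for resource in lease.resources:
        # Still unavailable; the owner pointer is deliberately left untouched.
        core.resource_state[resource] = "revoking"
    # No retirement is scheduled here; that belongs to reclaim.
    return "ok"
```

Note that the function never writes to `resource_owner`: leaving the pointer in place is what keeps the resource unavailable until reclaim.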
Duplicate behavior:
- exact retry with the same `operation_id` must return the cached original result
- a later distinct revoke with a different `operation_id` must not invent a second success; once a lease is already `revoking` or terminal, the answer is `invalid_state`
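The duplicate rule separates two cases: the same `operation_id` replays the stored answer, while a new `operation_id` is evaluated against current state. A minimal sketch of that split follows. `OperationCache` and `apply_once` are hypothetical names, and an in-memory dict stands in for whatever durable result store the core actually uses.

```python
from typing import Callable, Dict


class OperationCache:
    """Remembers the first committed result for each operation_id."""

    def __init__(self) -> None:
        self._results: Dict[str, str] = {}

    def apply_once(self, operation_id: str, apply: Callable[[], str]) -> str:
        if operation_id in self._results:
            # Exact retry: return the original answer without re-running the command.
            return self._results[operation_id]
        result = apply()
        self._results[operation_id] = result
        return result


# Toy lease used only to show the two duplicate cases.
lease = {"state": "active"}


def do_revoke() -> str:
    if lease["state"] != "active":
        return "invalid_state"
    lease["state"] = "revoking"
    return "ok"


cache = OperationCache()
first = cache.apply_once("op-1", do_revoke)      # fresh command, applied
retry = cache.apply_once("op-1", do_revoke)      # same id, cached result
distinct = cache.apply_once("op-2", do_revoke)   # new id, evaluated against revoking
```

Here `first` and `retry` are both `"ok"`, while `distinct` is `"invalid_state"` because the lease is already `revoking` when the second operation is evaluated.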
Implementation intent: `reclaim(lease_id)`
Effect:
- precondition: live lease exists and is currently `revoking`
- success: lease moves to terminal `revoked`
- success: member resources return to `available`
- success: per-resource current owner pointers clear
- success: retirement is scheduled through the normal bounded history path
Failure behavior:
- `lease_not_found`
- `lease_retired`
- `invalid_state` for `reserved`, `active`, `released`, `expired`, or already `revoked`
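Reclaim mirrors revoke but is the only step that frees resources. The sketch below assumes a plain-dict core with hypothetical keys (`leases`, `retired`, `resource_state`, `resource_owner`, `retire_queue`); the queue is a stand-in for the bounded history path, whose real shape this plan does not specify.

```python
from collections import deque


def new_core() -> dict:
    """Build an empty toy core; every key name here is illustrative."""
    return {
        "leases": {},             # lease_id -> {"state": str, "resources": list}
        "retired": set(),         # ids whose live record is already gone
        "resource_state": {},     # resource -> state
        "resource_owner": {},     # resource -> lease_id or None
        "retire_queue": deque(),  # stand-in for the bounded history path
    }


def reclaim(core: dict, lease_id: int) -> str:
    lease = core["leases"].get(lease_id)
    if lease is None:
        return "lease_retired" if lease_id in core["retired"] else "lease_not_found"
    if lease["state"] != "revoking":
        # Covers reserved, active, released, expired, and already revoked.
        return "invalid_state"
    lease["state"] = "revoked"  # terminal; the record is kept as history
    for resource in lease["resources"]:
        core["resource_state"][resource] = "available"
        core["resource_owner"][resource] = None
    core["retire_queue"].append(lease_id)  # retirement is scheduled only now
    return "ok"
```

Because the state check accepts only `revoking`, reclaim on an `active` lease is rejected, which is how the late-not-early rule shows up in code.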
Duplicate behavior:
- exact retry with the same `operation_id` must return the cached original result
- a later distinct reclaim on an already terminal record must not produce a second success
M9-T08 must preserve these invariants:
- revoke removes holder authority before reuse is possible
- reclaim is the only transition that makes a revoked resource reusable
- active or revoking leases are never freed by timer
- late external reclaim is acceptable; early reclaim is not
- holder-authorized commands that arrive after revoke with the old epoch fail deterministically
- replay of committed revoke/reclaim commands yields the same resource availability outcome
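Several of these invariants can be checked mechanically against a state snapshot. The checker below is a sketch under assumptions: the snapshot layout (three dicts) and the name `check_invariants` are hypothetical, and it covers only the structural rules (live leases hold their resources, available resources have no owner), not the timer or epoch rules.

```python
LIVE_STATES = {"reserved", "active", "revoking"}


def check_invariants(leases: dict, resource_state: dict, resource_owner: dict) -> list:
    """Return human-readable violations; an empty list means the snapshot is consistent."""
    violations = []
    for lease_id, lease in leases.items():
        if lease["state"] not in LIVE_STATES:
            continue  # terminal records are history and hold nothing
        for resource in lease["resources"]:
            if resource_state.get(resource) == "available":
                violations.append(
                    f"lease {lease_id} is {lease['state']} but {resource} is available"
                )
            if resource_owner.get(resource) != lease_id:
                violations.append(
                    f"lease {lease_id} is {lease['state']} but does not own {resource}"
                )
    for resource, state in resource_state.items():
        if state == "available" and resource_owner.get(resource) is not None:
            violations.append(f"{resource} is available but still has an owner")
    return violations
```

Running a checker like this after every applied command in tests would catch an early-reuse bug at the step that introduces it rather than at a later reserve.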
The slice should be built in this order:
- add core state and command variants for revoke/reclaim
- apply revoke/reclaim through the same executor path already used by reserve/confirm/release
- add only the exact codec, snapshot, and recovery support required so live apply and replay preserve committed revoke/reclaim outcomes
- preserve the current no-early-reuse contract under crash, retry, and failover without broadening the replication surface in this slice
- add resource-state and lease-state invariants for `revoking` and `revoked`
- add retry and stale-holder regression coverage
Important boundary:
- if broader WAL/snapshot cleanup, transport normalization, or replication-surface redesign becomes necessary, keep only the revoke/reclaim unblocker here and defer the broader cleanup to `M9-T09` and `M9-T10`
Minimum test set:
- revoke on active lease moves the lease to `revoking` and bumps `lease_epoch`
- revoke does not free member resources
- holder `release` or `confirm` with the old epoch after revoke fails deterministically
- reclaim from `revoking` returns resources to `available` and records terminal `revoked` history
- reclaim before revoke is `invalid_state`
- exact duplicate revoke and reclaim requests return cached committed results
- reserved, active, and revoking resources cannot be reused early
- crash/restart replay preserves `revoking` vs `revoked` outcomes
- committed revoke/reclaim outcomes preserve the same no-early-reuse behavior across the current failover path
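The crash/restart case in the list above reduces to a property of replay: applying the same committed prefix must always land in the same state. A toy version of that check is sketched here. The command tuples, `apply`, `replay`, and `reusable` are hypothetical, and the state is collapsed to one string per lease purely for illustration.

```python
def apply(state: dict, command: tuple) -> dict:
    """Apply one committed command; rejected commands leave the state unchanged."""
    kind, lease_id = command
    current = state.get(lease_id)
    if kind == "revoke" and current == "active":
        return {**state, lease_id: "revoking"}
    if kind == "reclaim" and current == "revoking":
        return {**state, lease_id: "revoked"}
    return state


def replay(initial: dict, log: list) -> dict:
    """Rebuild state from a committed log without mutating the starting snapshot."""
    state = dict(initial)
    for command in log:
        state = apply(state, command)
    return state


def reusable(state: dict, lease_id: int) -> bool:
    # In this toy model only the terminal revoked record frees its resources.
    return state[lease_id] == "revoked"


initial = {1: "active"}
log = [("revoke", 1), ("reclaim", 1)]

crashed_after_revoke = replay(initial, log[:1])  # restart saw only the revoke
fully_recovered = replay(initial, log)           # restart saw both commands
```

A crash between the two commands must recover to `revoking`, where `reusable` is still false; only a log that contains the committed reclaim recovers to `revoked`.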
M9-T08 is ready to hand off when:
- the exact revoke/reclaim behavior above is implemented or explicitly mapped to narrower code tasks
- `docs/status.md` points at `#85` instead of stale `#84` language
- the slice still satisfies the existing `M9-T08` crash/retry/failover acceptance criteria without silently expanding into broader `M9-T09` or `M9-T10` cleanup
- later work is cleanly reserved for `M9-T09` through `M9-T11`