-
Notifications
You must be signed in to change notification settings - Fork 631
Single state machine across main and object stores #1077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
remove deprecated white box test
libs/storage/Tsavorite/cs/src/core/Index/Checkpointing/StateMachineDriver.cs
Show resolved
Hide resolved
libs/storage/Tsavorite/cs/src/core/Index/Checkpointing/StateMachineDriver.cs
Show resolved
Hide resolved
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/TsavoriteThread.cs
Outdated
Show resolved
Hide resolved
TedHartMS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved with one comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a single state machine shared between the main and object stores. Key changes include unifying checkpoint markers and processing (with fuzzy buffering for (v+1) entries), refactoring transaction version registration across stores, and updating state machine driver initialization and usage throughout the codebase.
Reviewed Changes
Copilot reviewed 63 out of 63 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| libs/server/AOF/AofProcessor.cs | Refactored AOF processing methods to use unified checkpoint markers and include fuzzy region buffering. |
| libs/cluster/Server/Replication/ReplicationManager.cs | Updated checkpoint version shift functions to use separate start/end callbacks and removed deprecated properties. |
| libs/server/Servers/GarnetServerOptions.cs | Added Epoch and StateMachineDriver fields with corresponding initialization methods. |
| libs/server/StoreWrapper.cs | Unified the TakeCheckpoint APIs by removing store type parameters and adjusted checkpoint task logic. |
| libs/cluster/Server/ClusterProvider.cs | Modified checkpoint initiation logic for replicas to use the checkpoint start marker offset. |
| libs/server/Lua/LuaRunner.cs | Updated transaction handling with acquisition, verification, and proper ending of transaction versions. |
| libs/server/Transaction/TransactionManager.cs | Integrated stateMachineDriver for transaction version management and updated lock acquisition/release functions. |
| libs/server/AOF/AofEntryType.cs | Renamed and deprecated old checkpoint markers to support the unified checkpointing mechanism. |
| libs/server/AOF/AofHeader.cs | Increased the AOF header version from 1 to 2 to reflect changes in format. |
| libs/cluster/Server/Replication/ReplicationLogCheckpointManager.cs | Adjusted callbacks to support separate start and end checkpoint shift notifications. |
| libs/cluster/Server/Replication/ReplicaOps/ReplicaReplayTask.cs | Revised replica replay logic to update replication checkpoint offsets on encountering the start marker. |
| libs/storage/Tsavorite/cs/src/core/Allocator/IStreamingSnapshotIteratorFunctions.cs | Renamed parameter from targetVersion to nextVersion for clarity. |
| libs/storage/Tsavorite/cs/src/core/ClientSession/ClientSession.cs | Simplified lock acquisition/release functions and streamlined transaction version updates. |
| libs/cluster/Server/Replication/PrimaryOps/DisklessReplication/ReplicationSnapshotIterator.cs | Removed redundant commit marker enqueue in stop-processing logic. |
| libs/cluster/Server/ClusterConfig.cs | Added a logger parameter to the HandleConfigEpochCollision method for additional diagnostic output. |
| libs/host/GarnetServer.cs | Called initialization on server options and updated object store settings invocation accordingly. |
| libs/server/Resp/RespServerSession.cs | Passed the state machine driver into the TransactionManager constructor for consistent transaction version handling. |
| libs/server/Storage/Session/StorageSession.cs | Exposed the StateMachineDriver from server options to the storage session. |
| libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorBase.cs | Removed tracking of active locking sessions as part of the refactoring for transaction version handling. |
| libs/server/Resp/AdminCommands.cs | Updated checkpoint command calls to match the new TakeCheckpoint API signature. |
Comments suppressed due to low confidence (2)
libs/server/AOF/AofHeader.cs:15
- Increasing the AofHeaderVersion to 2 may impact backward compatibility. Ensure that old AOF formats are either upgraded or handled appropriately during replay.
const byte AofHeaderVersion = 2;
libs/server/StoreWrapper.cs:852
- [nitpick] Consider using the logical AND operator (&&) instead of the bitwise AND (&) for clarity if short-circuit behavior is desired.
tryIncremental &= objectStore.CanTakeIncrementalCheckpoint(checkpointType, out var guid2) & checkpointResult.token == guid2;
check that main and object stores recovered to the same version
Changes:
(v) -> (v+1)switchover. We ensure correctness by verifying the transaction version after all locks are acquired.NumActiveLockingSessions(renamed toNumActiveTransactions) to SMD and ensure correct(v) -> (v+1)switchover with active transactions.targetVersionconcept so that checkpoint versions always increment in steps of 1.CheckpointVersionShiftto a pair ofCheckpointVersionShiftStartandCheckpointVersionShiftEndto help demarcate start and end of fuzzy region in the AOF ((region that contains mix ofvandv+1log entries).checkpoint_startmarker: TrackReplicationCheckpointStartOffsetin AOF(v)ops and txns in the fuzzy region of AOF, store away the(v+1)entries in a fuzzy buffercheckpoint_endmarker: Take a checkpoint, potentially truncating untilReplicationCheckpointStartOffset, replay all the stored(v+1)ops and txns from the fuzzy buffer.Notes: