-
Notifications
You must be signed in to change notification settings - Fork 159
feat(sync): decouple code syncer from state syncer and run concurrently #1140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Replace sequential syncer execution with concurrent execution using errgroup.WithContext. - Add cancellation support - first error cancels remaining syncers via context. - Add success log when all syncers complete successfully. - Update SyncerRegistry comment to reflect concurrent execution. - Add comprehensive concurrency tests for SyncerRegistry: - Add parameterized table-driven tests for various syncer combinations. - Implement barrier syncer pattern to verify true concurrent execution. - Add error syncer and cancel-aware syncer test utilities resolves #1130 Signed-off-by: Tsvetan Dimitrov ([email protected])
Signed-off-by: Tsvetan Dimitrov <[email protected]>
Replace ad-hoc helpers with shared utils: - Use utilstest.NewBarrierSyncer, utilstest.NewErrorSyncer, utilstest.NewCancelAwareSyncer. - Use utilstest.WaitGroupWithTimeout, utilstest.WaitErrWithTimeout, utilstest.WaitSignalWithTimeout. - Drop local funcSyncer and duplicated inline helpers. - Switch to require.ErrorIs where applicable and remove redundant require.Error/Contains checks. - No functional behavior changes; tests are cleaner and more explicit.
- Introduce CodeHashSink interface and CodeSyncerTask (sink + task) - Export NewCodeSyncer and make code syncer implement the sink. - Require CodeHashSink in state syncer constructor. - Stop starting code syncer inside state syncer; registry registers both. - Wire vm client to create/register code syncer and pass it to state syncer. - Update tests to start state and code syncers together. resolves #1139 Signed-off-by: Tsvetan Dimitrov ([email protected])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you're quite done yet, but wanted to give my opinions
- Add Ready() method to CodeFetcher. - Implement Ready() in statesync.CodeSyncer (returns its open channel). - Wait on CodeFetcher.Ready() at the start of state_syncer.Sync - Revert VM gating wrapper; register state syncer directly - Add TestCodeSyncerReady using utilstest.NewTestContext and SleepWithContext.
Co-authored-by: Arran Schlosberg <[email protected]>
…acity option - Add typed close state with tiny atomic wrapper (open/finalized/quit). - Always call Init() in NewCodeQueue and remove pre-init gating. - Keep early-quit behavior: background goroutine marks quit and closes out. - Finalize now returns error only on early quit - idempotently closes on clean finalize. - Rename option: WithMaxOutstandingCodeHashes to WithCapacity. - Update tests to reflect auto-init and renamed option.
f334234
to
76e1ce0
Compare
… code syncer (consumer) - CodeQueue: remove in-memory dedupe/filter and persist idempotent markers for all inputs. - CodeSyncer: add in-flight dedupe via sync.Map and skip locally-present code and delete markers. Clean up inFlight on success. - Keep early-quit/finalize semantics unchanged. - Update tests to allow duplicate and present-code enqueues; consumer handles dedupe/retry. - DB note: markers keyed by hash - duplicate persists overwrite, not grow storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few races and some unnecessary code. I'm finishing off some recommended changes to simplify the tests and will send them to you shortly.
Close the code queue's `out` channel only after transitioning to `quit` and coordinating with in-flight enqueues. The quit path now uses atomic state and the enqueue wait group to ensure a single, orderly close, eliminating send/close races seen in CI (`TestCodeQueue_ShutdownDuringEnqueue`). - Add WaitGroup coordination to quit close path - Ensure deterministic single close between quit and Finalize - Preserve Finalize semantics and error reporting - Addresses CI data race warning in sync/statesync.
8d46ec0
to
54c2592
Compare
- Close out on quit only after CAS to quit and waiting for in-flight enqueues - Add `enqueueWG.Add` at start of enqueue to cover whole persistence+send window. - Make `Finalize` CAS to finalized first; then wait and close out - Keep `AddCode` guard to reject after finalize/quit deterministically. - Add `canTransitionToFinalized` helper and update `markQuitAndClose` to wait before close. Fixes data race reported in `TestCodeQueue_ShutdownDuringEnqueue` and ensures a single, orderly channel close between `quit` and `Finalize`.
50ba25d
to
6b1a6e3
Compare
…mmit - Move `inFlight.LoadOrStore` before `rawdb.HasCode` to avoid races and redundant `HasCode` checks across workers. - If code is already present, delete the stale marker and only then release in-flight ownership. - In `fulfillCodeRequest`, release in-flight ownership only after successful `batch.Write` to prevent duplicate fetch/write on write errors. - Reduces DB contention and eliminates duplicate work/races during code syncing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving pending deferral of interface
introduction. At the moment it's not needed, and coupled interface+implementation code is a major problem in our codebases.
- Rely solely on CAS to transition to finalized; remove redundant pre-check. - If CAS succeeds: wait for in-flight enqueues, then close out. - If CAS fails and state is quit: return errFailedToFinalizeCodeQueue. - If CAS fails and state is finalized: no-op (idempotent finalize). - Add didQuit() helper for clear state checks.
…eQueue` directly - Delete `CodeRequestQueue` from `sync/types.go`. - Update `state_syncer` to accept `*CodeQueue` and store it directly. - Update VM sync client to construct and pass `*statesync.CodeQueue`. - Remove interface references/comments and clean imports.
…ck and inlining enqueue - Remove `AddCode` post-finalize/quit switch. Instead rely on API contract and shutdown ordering. - Inline `enqueue` logic into `AddCode` and delete separate `enqueue` function. - Update init to call `AddCode` for recovering persisted markers. - Adjust tests to stop asserting `AddCode` errors after `Finalize` - retain quit error assertions. - Minor comment cleanups for clarity.
- Remove assertions expecting `AddCode` to error after `Finalize`. - Keep early-quit behavior checks (`AddCode` should still error on quit). - No functional changes and tests reflect simplified queue semantics.
Why this should be merged
Check #1139
How this works
Introduce
CodeQueue
(producer of code hashes)quit <-chan struct{}
and supports functional optionWithCapacity(int)
.Supported Operations
AddCode([]common.Hash) error
:quit
triggers.errFailedToAddCodeHashesToQueue
ifquit
has already fired or fires during send.AddCode
calls is not guaranteed - consumer re-batches from the stream.Finalize() error
:CompareAndSwap
to finalized - if successful, waits for in-flight enqueues and closes the output channel.errFailedToFinalizeCodeQueue
. If already finalized, no-op.quit
,CompareAndSwap
s to quit, waits for in-flight enqueues, and closes the output channel to unblock consumers.CodeHashes() <-chan common.Hash
exposed for consumers.Modifications on
CodeSyncer
(consumer of code hashes)WithNumCodeFetchingWorkers(int)
,WithCodeHashesPerRequest(int)
.rawdb.HasCode
. If code present, delete marker and release in-flight. On successful fetch, write code and delete marker in a batch, then release in-flight only after the batch commit.VM client wiring (
plugin/evm/vmsync
)createCodeQueue()
returns*statesync.CodeQueue
viaNewCodeQueue(chainDB, StateSyncDone)
.createCodeSyncer()
passescodeQueue.CodeHashes()
tostatesync.NewCodeSyncer
.createEVMSyncer()
accepts the concrete*statesync.CodeQueue
producer.How this was tested
Modified existing unit tests and add new ones for
CodeQueue
.Need to be documented?
no
Need to update RELEASES.md?
no
resolves #1139
Signed-off-by: Tsvetan Dimitrov ([email protected])