Skip to content

Conversation

powerslider
Copy link
Contributor

@powerslider powerslider commented Aug 22, 2025

Why this should be merged

Check #1139

How this works

Introduce CodeQueue (producer of code hashes)

  • Purpose: Accept code hashes, persist durable "to-fetch" markers, and enqueue hashes for the consumer. No in-memory dedupe or local-code checks - the consumer handles dedupe/skip/retry.
  • Construction: Requires a quit <-chan struct{} and supports functional option WithCapacity(int).
  • Initialization: Auto-inits on construction - recovers persisted "to-fetch" markers and enqueues them immediately.

Supported Operations

  • AddCode([]common.Hash) error:

    • Persists markers for all input hashes (idempotent per hash) and enqueues them as-is.
    • Blocks if the output buffer is full until drained or quit triggers.
    • Returns errFailedToAddCodeHashesToQueue if quit has already fired or fires during send.
    • Does not enforce the "no AddCode after Finalize" rule thus callers are responsible for contract adherence.
    • NOTE: Ordering across separate AddCode calls is not guaranteed - consumer re-batches from the stream.
  • Finalize() error:

    • CompareAndSwap to finalized - if successful, waits for in-flight enqueues and closes the output channel.
    • If already quit, returns errFailedToFinalizeCodeQueue. If already finalized, no-op.
    • Early quit behavior: A goroutine listens on quit, CompareAndSwaps to quit, waits for in-flight enqueues, and closes the output channel to unblock consumers.
  • CodeHashes() <-chan common.Hash exposed for consumers.

Modifications on CodeSyncer (consumer of code hashes)

  • Config Options: WithNumCodeFetchingWorkers(int), WithCodeHashesPerRequest(int).
  • Deduplicate using an in-flight map before checking rawdb.HasCode. If code present, delete marker and release in-flight. On successful fetch, write code and delete marker in a batch, then release in-flight only after the batch commit.

VM client wiring (plugin/evm/vmsync)

  • createCodeQueue() returns *statesync.CodeQueue via NewCodeQueue(chainDB, StateSyncDone).
  • createCodeSyncer() passes codeQueue.CodeHashes() to statesync.NewCodeSyncer.
  • createEVMSyncer() accepts the concrete *statesync.CodeQueue producer.

How this was tested

Modified existing unit tests and add new ones for CodeQueue.

Need to be documented?

no

Need to update RELEASES.md?

no

resolves #1139

Signed-off-by: Tsvetan Dimitrov ([email protected])

- Replace sequential syncer execution with concurrent execution using errgroup.WithContext.
- Add cancellation support - first error cancels remaining syncers via context.
- Add success log when all syncers complete successfully.
- Update SyncerRegistry comment to reflect concurrent execution.
- Add comprehensive concurrency tests for SyncerRegistry:
    - Add parameterized table-driven tests for various syncer combinations.
    - Implement barrier syncer pattern to verify true concurrent execution.
    - Add error syncer and cancel-aware syncer test utilities

resolves #1130

Signed-off-by: Tsvetan Dimitrov ([email protected])
Replace ad-hoc helpers with shared utils:
- Use utilstest.NewBarrierSyncer, utilstest.NewErrorSyncer, utilstest.NewCancelAwareSyncer.
- Use utilstest.WaitGroupWithTimeout, utilstest.WaitErrWithTimeout, utilstest.WaitSignalWithTimeout.
- Drop local funcSyncer and duplicated inline helpers.
- Switch to require.ErrorIs where applicable and remove redundant require.Error/Contains checks.
- No functional behavior changes; tests are cleaner and more explicit.
- Introduce CodeHashSink interface and CodeSyncerTask (sink + task)
- Export NewCodeSyncer and make code syncer implement the sink.
- Require CodeHashSink in state syncer constructor.
- Stop starting code syncer inside state syncer; registry registers both.
- Wire vm client to create/register code syncer and pass it to state syncer.
- Update tests to start state and code syncers together.

resolves #1139

Signed-off-by: Tsvetan Dimitrov ([email protected])
@powerslider powerslider self-assigned this Aug 22, 2025
@powerslider powerslider requested a review from a team as a code owner August 22, 2025 17:43
@powerslider powerslider marked this pull request as draft August 22, 2025 17:43
Copy link
Contributor

@alarso16 alarso16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you're quite done yet, but wanted to give my opinions

Base automatically changed from powerslider/1130-concurrent-syncers to master August 25, 2025 18:17
@powerslider powerslider marked this pull request as ready for review August 26, 2025 13:19
@powerslider powerslider marked this pull request as draft August 26, 2025 13:23
- Add Ready() method to CodeFetcher.
- Implement Ready() in statesync.CodeSyncer (returns its open channel).
- Wait on CodeFetcher.Ready() at the start of state_syncer.Sync
- Revert VM gating wrapper; register state syncer directly
- Add TestCodeSyncerReady using utilstest.NewTestContext and
SleepWithContext.
powerslider and others added 3 commits September 23, 2025 13:19
…acity option

- Add typed close state with tiny atomic wrapper (open/finalized/quit).
- Always call Init() in NewCodeQueue and remove pre-init gating.
- Keep early-quit behavior: background goroutine marks quit and closes out.
- Finalize now returns error only on early quit - idempotently closes on clean finalize.
- Rename option: WithMaxOutstandingCodeHashes to  WithCapacity.
- Update tests to reflect auto-init and renamed option.
@powerslider powerslider force-pushed the powerslider/1139-code-syncer-decoupling branch from f334234 to 76e1ce0 Compare September 23, 2025 11:47
… code syncer (consumer)

- CodeQueue: remove in-memory dedupe/filter and persist idempotent markers for all inputs.
- CodeSyncer: add in-flight dedupe via sync.Map and skip locally-present code and delete markers. Clean up inFlight on success.
- Keep early-quit/finalize semantics unchanged.
- Update tests to allow duplicate and present-code enqueues; consumer handles dedupe/retry.
- DB note: markers keyed by hash - duplicate persists overwrite, not grow
storage.
Copy link
Collaborator

@ARR4N ARR4N left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few races and some unnecessary code. I'm finishing off some recommended changes to simplify the tests and will send them to you shortly.

Close the code queue's `out` channel only after transitioning to `quit`
and coordinating with in-flight enqueues. The quit path now uses atomic
state and the enqueue wait group to ensure a single, orderly close,
eliminating send/close races seen in CI (`TestCodeQueue_ShutdownDuringEnqueue`).

- Add WaitGroup coordination to quit close path
- Ensure deterministic single close between quit and Finalize
- Preserve Finalize semantics and error reporting
- Addresses CI data race warning in sync/statesync.
@powerslider powerslider force-pushed the powerslider/1139-code-syncer-decoupling branch from 8d46ec0 to 54c2592 Compare September 24, 2025 16:11
- Close out on quit only after CAS to quit and waiting for in-flight enqueues
- Add `enqueueWG.Add` at start of enqueue to cover whole persistence+send window.
- Make `Finalize` CAS to finalized first; then wait and close out
- Keep `AddCode` guard to reject after finalize/quit deterministically.
- Add `canTransitionToFinalized` helper and update `markQuitAndClose` to wait before close.

Fixes data race reported in `TestCodeQueue_ShutdownDuringEnqueue` and
ensures a single, orderly channel close between `quit` and `Finalize`.
@powerslider powerslider force-pushed the powerslider/1139-code-syncer-decoupling branch from 50ba25d to 6b1a6e3 Compare September 24, 2025 16:41
…mmit

- Move `inFlight.LoadOrStore` before `rawdb.HasCode` to avoid races and redundant `HasCode` checks across workers.
- If code is already present, delete the stale marker and only then release in-flight ownership.
- In `fulfillCodeRequest`, release in-flight ownership only after successful `batch.Write` to prevent duplicate fetch/write on write errors.
- Reduces DB contention and eliminates duplicate work/races during code syncing.
Copy link
Collaborator

@ARR4N ARR4N left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving pending deferral of interface introduction. At the moment it's not needed, and coupled interface+implementation code is a major problem in our codebases.

- Rely solely on CAS to transition to finalized; remove redundant pre-check.
- If CAS succeeds: wait for in-flight enqueues, then close out.
- If CAS fails and state is quit: return errFailedToFinalizeCodeQueue.
- If CAS fails and state is finalized: no-op (idempotent finalize).
- Add didQuit() helper for clear state checks.
…eQueue` directly

- Delete `CodeRequestQueue` from `sync/types.go`.
- Update `state_syncer` to accept `*CodeQueue` and store it directly.
- Update VM sync client to construct and pass `*statesync.CodeQueue`.
- Remove interface references/comments and clean imports.
…ck and inlining enqueue

- Remove `AddCode` post-finalize/quit switch. Instead rely on API contract and shutdown ordering.
- Inline `enqueue` logic into `AddCode` and delete separate `enqueue` function.
- Update init to call `AddCode` for recovering persisted markers.
- Adjust tests to stop asserting `AddCode` errors after `Finalize` - retain quit error assertions.
- Minor comment cleanups for clarity.
- Remove assertions expecting `AddCode` to error after `Finalize`.
- Keep early-quit behavior checks (`AddCode` should still error on quit).
- No functional changes and tests reflect simplified queue semantics.
@ARR4N ARR4N added this pull request to the merge queue Sep 25, 2025
Merged via the queue into master with commit f0db9b5 Sep 25, 2025
9 checks passed
@ARR4N ARR4N deleted the powerslider/1139-code-syncer-decoupling branch September 25, 2025 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Register the code hash syncer as a separate concurrent syncer in the registry

5 participants