Reject transactions that have already been submitted to the tx pool #924
base: main
Conversation
Walkthrough

Replaces time-only EOA activity tracking with per-EOA metadata (last submission time + recent nonces) for nonce-based duplicate detection; removes the exported `ErrDuplicateTransaction` error.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant Client
  participant BatchTxPool
  participant EOAActivityCache
  participant Requester
  participant FlowNode
  Note over BatchTxPool,EOAActivityCache: Add path with nonce-based dedupe
  Client->>BatchTxPool: Submit tx (includes nonce)
  BatchTxPool->>EOAActivityCache: Load metadata for EOA
  alt metadata not found
    BatchTxPool->>Requester: submitSingleTransaction(tx)
  else metadata exists
    alt time since lastSubmission > TxBatchInterval
      BatchTxPool->>BatchTxPool: decide submitSingle or append to batch
      BatchTxPool->>Requester: submitBatch or submitSingleTransaction
    else within interval
      BatchTxPool->>BatchTxPool: append tx to pooled batch (track nonce)
    end
  end
  Requester->>FlowNode: Send transaction(s)
  FlowNode-->>Requester: response / error
  Requester-->>BatchTxPool: result
  BatchTxPool->>EOAActivityCache: update lastSubmission and txNonces (prune)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25–40 minutes
Pre-merge checks: ❌ Failed checks (2 warnings), ✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (4)
services/requester/batch_tx_pool.go (2)
26-29: EOA activity metadata and cache TTL align with duplicate-detection goals

Using `eoaActivityMetadata` with a capped `txHashes` slice and a 10s `eoaActivityCacheTTL` matches the “last 15 tx hashes over ~10s” requirement and keeps lookup cost bounded. If you expect operators to tune the duplicate window over time, consider deriving `eoaActivityCacheTTL` from config instead of hardcoding `10 * time.Second`, but the current choice is reasonable for now.

Also applies to: 36-39, 59-60, 89-93
135-149: Consider recording EOA activity only when submission succeeds

The duplicate check via `slices.Contains(lastActivity.txHashes, txHash)` is straightforward and efficient with the 15-hash cap. One nuance is that you always update `lastActivity` and add the hash to `eoaActivity`, even if `submitSingleTransaction` (or upstream build/send) fails. That means a client retrying after a transient internal error can be rejected as “already submitted” for up to the TTL, even though the transaction never made it to Flow.

If you want duplicate protection to apply only to successfully accepted submissions (single or future batched), you can gate the metadata update on `err == nil`:

```diff
-	// Update metadata for the last EOA activity
-	lastActivity.submittedAt = time.Now()
-	lastActivity.txHashes = append(lastActivity.txHashes, txHash)
-	// To avoid the slice of hashes from growing indefinitely,
-	// maintain only a handful of the last tx hashes.
-	if len(lastActivity.txHashes) > 15 {
-		lastActivity.txHashes = lastActivity.txHashes[1:]
-	}
-
-	t.eoaActivity.Add(from, lastActivity)
-
-	return err
+	// Update metadata for the last EOA activity only on successful add/submit.
+	if err == nil {
+		lastActivity.submittedAt = time.Now()
+		lastActivity.txHashes = append(lastActivity.txHashes, txHash)
+		// To avoid the slice of hashes from growing indefinitely,
+		// maintain only a handful of the last tx hashes.
+		if len(lastActivity.txHashes) > 15 {
+			lastActivity.txHashes = lastActivity.txHashes[1:]
+		}
+
+		t.eoaActivity.Add(from, lastActivity)
+	}
+
+	return err
```

This keeps the rejection behavior focused on truly duplicated submissions rather than transient failures.

Also applies to: 170-170, 175-175, 179-189
tests/tx_batching_test.go (2)
517-568: Duplicate-submission test accurately exercises hash-based rejection

The nonce pattern and assertions (exactly two errors, each containing the expected “already submitted” hash) give good coverage of the new hash-based duplicate detection while still verifying that all non-duplicate transactions execute successfully. The only minor trade-off is that the test is now coupled to specific precomputed hashes; if chain ID or signing details change, you’ll need to update those literals. If you want to reduce that coupling, you could derive the expected duplicate hashes inline from the first submissions before asserting on the error messages.
334-335: Update comments to reflect the new 2.5s TxBatchInterval in tests

`setupGatewayNode` now sets `TxBatchInterval` to 2.5 seconds, but the explanatory comments in the “recent” and “non-recent” interval tests still mention 2 seconds / 3 seconds, which can be misleading when someone tweaks the interval again.

Consider updating the comments to reference the actual configured value instead of hardcoded numbers, for example:

```diff
-	// For the E2E tests the `cfg.TxBatchInterval` is equal
-	// to 2 seconds.
+	// For the E2E tests, `cfg.TxBatchInterval` is currently
+	// set in setupGatewayNode (2.5 seconds at the time of writing).
```

and similarly for the “3 seconds” wording in the non-recent test.

Also applies to: 448-449, 609-610
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `models/errors/errors.go` (1 hunks)
- `services/requester/batch_tx_pool.go` (6 hunks)
- `tests/tx_batching_test.go` (3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
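The deep-copy pattern described in this learning can be sketched as follows. This is an illustrative stand-in, not the gateway's actual code: `pooledEvmTx` is reduced to a single field and the map is keyed by plain strings instead of `gethCommon.Address`.

```go
package main

import "fmt"

// pooledEvmTx is a minimal stand-in for the gateway's pooled tx type.
type pooledEvmTx struct{ nonce uint64 }

// deepCopyPooledTxs copies each slice individually with make()/copy(),
// so the returned map shares no backing arrays with the original.
func deepCopyPooledTxs(src map[string][]pooledEvmTx) map[string][]pooledEvmTx {
	dst := make(map[string][]pooledEvmTx, len(src))
	for addr, txs := range src {
		cp := make([]pooledEvmTx, len(txs))
		copy(cp, txs)
		dst[addr] = cp
	}
	return dst
}

func main() {
	orig := map[string][]pooledEvmTx{"0xabc": {{nonce: 1}}}
	cp := deepCopyPooledTxs(orig)
	cp["0xabc"][0].nonce = 99 // mutate the copy only
	fmt.Println(orig["0xabc"][0].nonce) // prints 1: the original is unaffected
}
```

A shallow copy (`dst[addr] = txs`) would instead alias the same backing array, so a write through either map would be visible through the other.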
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
services/requester/batch_tx_pool.go
tests/tx_batching_test.go
📚 Learning: 2025-11-12T13:21:59.080Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.080Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (3)
services/requester/single_tx_pool.go (1)
- `SingleTxPool` (28-41)

services/requester/tx_pool.go (1)
- `TxPool` (16-18)

models/errors/errors.go (1)
- `ErrInvalid` (23-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test
🔇 Additional comments (1)
models/errors/errors.go (1)
31-32: Transaction error constants remain coherent after duplicate-error removal
`ErrFailedTransaction` and `ErrInvalidTransaction` are composed correctly, and removing a dedicated duplicate error in favor of generic `ErrInvalid` for pre-submission checks is consistent with the new batch pool behavior, assuming all references to the old duplicate error were cleaned up.
f7a3610 to 582d698
Actionable comments posted: 0
🧹 Nitpick comments (2)
services/requester/batch_tx_pool.go (2)
26-29: Confirm EOA activity TTL vs configurable batch interval

You’re hard-coding `eoaActivityCacheTTL` to 10 seconds while batching behavior still depends on the configurable `TxBatchInterval`. That’s probably fine (the issue description mentions a 10-second window), but it means:

- Duplicate detection will never look back further than 10 seconds, even if `TxBatchInterval` is larger.
- Memory usage is capped by 10k EOAs × 15 hashes for that 10s window.

If operators ever run with a much larger `TxBatchInterval`, you may want to either:

- Tie `eoaActivityCacheTTL` to `TxBatchInterval` (e.g., `max(TxBatchInterval, 10s)`), or
- Document explicitly that duplicate protection is limited to a fixed 10-second horizon.
170-170: Consider extracting `15` into a named constant for tracked hashes

The pruning logic is clear, but `15` as a literal is a bit magic:

```go
if err == nil {
	lastActivity.submittedAt = time.Now()
	lastActivity.txHashes = append(lastActivity.txHashes, txHash)
	if len(lastActivity.txHashes) > 15 {
		lastActivity.txHashes = lastActivity.txHashes[1:]
	}

	t.eoaActivity.Add(from, lastActivity)
}
```

To make the intent self‑documenting and adjustable, consider:

```diff
-const (
-	eoaActivityCacheSize = 10_000
-	eoaActivityCacheTTL  = time.Second * 10
-)
+const (
+	eoaActivityCacheSize     = 10_000
+	eoaActivityCacheTTL      = 10 * time.Second
+	maxTrackedTxHashesPerEOA = 15
+)
```

```diff
-	if err == nil {
-		lastActivity.submittedAt = time.Now()
-		lastActivity.txHashes = append(lastActivity.txHashes, txHash)
-		if len(lastActivity.txHashes) > 15 {
-			lastActivity.txHashes = lastActivity.txHashes[1:]
-		}
-
-		t.eoaActivity.Add(from, lastActivity)
-	}
+	if err == nil {
+		lastActivity.submittedAt = time.Now()
+		lastActivity.txHashes = append(lastActivity.txHashes, txHash)
+		if len(lastActivity.txHashes) > maxTrackedTxHashesPerEOA {
+			lastActivity.txHashes = lastActivity.txHashes[1:]
+		}
+
+		t.eoaActivity.Add(from, lastActivity)
+	}
```

Pure readability/maintainability improvement; behavior stays the same.

Also applies to: 179-190
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `models/errors/errors.go` (1 hunks)
- `services/requester/batch_tx_pool.go` (6 hunks)
- `tests/tx_batching_test.go` (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- models/errors/errors.go
- tests/tx_batching_test.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (3)
services/requester/single_tx_pool.go (1)
- `SingleTxPool` (28-41)

services/requester/tx_pool.go (1)
- `TxPool` (16-18)

models/errors/errors.go (1)
- `ErrInvalid` (23-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test
🔇 Additional comments (2)
services/requester/batch_tx_pool.go (2)
36-39: EOA activity metadata + LRU wiring looks consistent

Using `eoaActivityMetadata{submittedAt, txHashes}` in an `expirable.LRU` keyed by EOA, plus the `var _ TxPool = (*BatchTxPool)(nil)` assertion, cleanly encapsulates:

- Batching recency (`submittedAt` vs `TxBatchInterval`), and
- Short-horizon duplicate detection (`txHashes` bounded by cache TTL and array size).

The wiring in `NewBatchTxPool` (shared `SingleTxPool`, map for `pooledTxs`, mutex, and the expirable LRU) looks coherent with existing concurrency patterns and should behave deterministically under load.

Also applies to: 57-60, 62-62, 89-99
135-148: Incorrect review comment — `%s` formatting is correct for `common.Hash`

The review comment mischaracterizes how Go's `fmt` package formats `common.Hash`. The `Hash` type implements a custom `Format` method that handles the `'s'` verb by writing the hex representation with a `"0x"` prefix. Since `Hash` implements `fmt.Formatter`, the `%s` verb will produce the proper hex string output (e.g., `"0x1234..."`), not the artifact mentioned in the comment.

The `String` method implements the stringer interface and produces identical output. The current code is already correct; no changes are needed.

Likely an incorrect or invalid review comment.
582d698 to 86c6bda
Actionable comments posted: 0
🧹 Nitpick comments (3)
services/requester/batch_tx_pool.go (1)
136-149: Duplicate‑submission check and metadata update align with design; consider Publish ordering as a small follow‑up

The new flow in `Add`:

- Computes `txHash` and rejects when `found && slices.Contains(lastActivity.txHashes, txHash)`, so only byte‑identical re‑submissions from the same EOA are blocked.
- Uses `lastActivity.submittedAt` with `TxBatchInterval` to decide whether to batch or send individually, and
- On success, records `submittedAt = time.Now()`, appends `txHash`, and prunes to the last `maxTrackedTxHashesPerEOA` before writing back to `eoaActivity`.

This cleanly achieves the “reject already‑submitted txs within a short, bounded window” goal and keeps batching behavior intact, including the first‑time `!found` case (where the zero‑value metadata is safely initialized on success).

One minor consideration: `t.txPublisher.Publish(tx)` is still called before the dedup check, so duplicate submissions that are rejected at this point will still generate “pending transaction” events downstream. If you don’t want observers to see events for txs that are immediately rejected, you could move the `Publish` call to just after this validation block.

Also applies to: 171-177, 180-191
tests/tx_batching_test.go (2)
335-336: TxBatchInterval test configuration and comments mostly consistent; one stale reference

Setting `TxBatchInterval` to `time.Millisecond * 2500` in `setupGatewayNode` and documenting 2.5s in the “recent interval” test comment keeps the test environment aligned with mainnet behavior and with the batching logic in `BatchTxPool`. The only thing to watch for is that the comment in `Test_MultipleTransactionSubmissionsWithinNonRecentInterval` still mentions a 2‑second interval; if you want the docs fully accurate, updating that comment to 2.5 seconds would avoid future confusion.

Also applies to: 609-610
517-568: Duplicate‑submission test correctly exercises the new behavior; assertions are quite strict

The new `Test_TransactionSubmissionWithPreviouslySubmittedTransactions`:

- Uses a nonce pattern `[0,1,2,3,2,3,4,5]` to create two exact‑hash duplicates and verifies that:
  - Exactly two submissions fail, and
  - Both failures contain the expected “a tx with hash 0x… has already been submitted” messages.
- Then ensures all successfully submitted txs receive successful receipts.

This is a solid end‑to‑end validation of the new dedup logic and confirms that non‑duplicate txs still execute. The only potential nit is test brittleness: it assumes both duplicate submissions fail and that their errors appear in `errors[0]` and `errors[1]` with those exact hash strings. If you ever change signing parameters or error formatting, the test will need updates. As a small robustness improvement, you could make the assertions order‑insensitive (e.g., check that the two expected substrings appear somewhere in the `errors` slice) rather than bound to specific indices.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `models/errors/errors.go` (1 hunks)
- `services/requester/batch_tx_pool.go` (6 hunks)
- `tests/tx_batching_test.go` (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-12T13:21:59.080Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.080Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
tests/tx_batching_test.go
services/requester/batch_tx_pool.go
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (4)
services/requester/single_tx_pool.go (1)
- `SingleTxPool` (28-41)

services/requester/tx_pool.go (1)
- `TxPool` (16-18)

api/pool.go (1)
- `TxPool` (8-8)

models/errors/errors.go (1)
- `ErrInvalid` (23-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test
🔇 Additional comments (2)
models/errors/errors.go (1)
31-32: Composed ErrInvalidTransaction sentinel looks good

Defining `ErrInvalidTransaction` as `%w: %w` over `ErrInvalid` and `ErrFailedTransaction` gives you a single sentinel while preserving `errors.Is` matches for both underlying categories. This keeps `NewInvalidTransactionError`’s wrapping behavior consistent without introducing new error types.

services/requester/batch_tx_pool.go (1)

26-30: Per‑EOA activity metadata and bounded dedup window look appropriate

Storing `eoaActivityMetadata{submittedAt, txHashes}` in a 10k‑entry expirable LRU, with a 10s TTL and at most 15 hashes per EOA, gives you a clear, bounded dedup window (~150k hashes max in memory) and keeps the batching heuristics (based on `submittedAt` vs `TxBatchInterval`) independent from the cache’s eviction policy. This matches the PR intent of “last 15 hashes over ~10 seconds” without impacting the existing batching semantics.

Also applies to: 37-40, 60-61, 90-94
86c6bda to 62deb8b
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/tx_batching_test.go (1)
546-556: Hardcoded transaction hashes create fragile test assertions

These assertions rely on deterministic tx hashes derived from fixed inputs (private key, nonces, amounts). While this works, any change to the signing parameters or transaction structure could silently break the test without clear indication of why.
Consider extracting the expected hashes programmatically by signing the duplicate transactions upfront and storing their hashes, then comparing against those:
```go
// Before the loop, pre-compute expected duplicate hashes
expectedDupHashes := make(map[common.Hash]bool)
for _, nonce := range []uint64{2, 3} {
	signed, _, _ := evmSign(big.NewInt(1_000_000_000), 23_500, eoaKey, nonce, &testAddr, nil)
	expectedDupHashes[signed.Hash()] = true
}

// Then in assertions, verify errors reference hashes in expectedDupHashes
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `models/errors/errors.go` (1 hunks)
- `services/requester/batch_tx_pool.go` (5 hunks)
- `tests/tx_batching_test.go` (6 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-12T13:21:59.080Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.080Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
tests/tx_batching_test.goservices/requester/batch_tx_pool.go
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (2)
tests/tx_batching_test.go (1)
config/config.go (2)
- `TxStateValidation` (37-37)
- `LocalIndexValidation` (40-40)
services/requester/batch_tx_pool.go (4)
services/requester/single_tx_pool.go (1)
- `SingleTxPool` (28-41)

services/requester/tx_pool.go (1)
- `TxPool` (16-18)

api/pool.go (1)
- `TxPool` (8-8)

models/errors/errors.go (1)
- `ErrInvalid` (23-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test
🔇 Additional comments (9)
models/errors/errors.go (1)
31-32: LGTM!

The removal of `ErrDuplicateTransaction` is appropriate since duplicate detection is now handled via per-EOA metadata in `BatchTxPool`, with the error constructed inline using `errs.ErrInvalid`.

tests/tx_batching_test.go (2)
620-622: Sleep-based synchronization for indexing

The 2-second sleep runs concurrently with `bootstrap.Run` to allow indexing to catch up. While this works, it adds to test execution time. If flakiness occurs, consider increasing the timeout or using a more deterministic readiness check.
607-609: LGTM!

Configuration changes align with the new duplicate detection approach: `LocalIndexValidation` enables local index-based validation, and the 2.5-second batch interval matches mainnet behavior.

services/requester/batch_tx_pool.go (6)
26-30: LGTM!

The constants provide sensible defaults: 10-second TTL for activity tracking, 15 tracked hashes per EOA, and 10,000 cache size. These values align with the PR objective to reduce noise from bot resubmissions.
37-40: LGTM!

The `eoaActivityMetadata` struct cleanly encapsulates per-EOA tracking data. Using a value type with a small slice (max 15 hashes) ensures efficient cache operations.
138-148: Duplicate detection logic is correct

The check using `slices.Contains` on a max-15 element slice is efficient. The error wraps `errs.ErrInvalid` with an informative message including the transaction hash, which helps with debugging.
170-180: Good correctness fix for transaction ordering

The added check `len(t.pooledTxs[from]) > 0` ensures that if there are pending pooled transactions for this EOA, subsequent transactions are added to the same batch rather than submitted individually. This prevents potential ordering issues where a "stale" transaction could race ahead of earlier pooled transactions.
191-200: Metadata update logic is sound

The pattern correctly:

- Only updates on successful submission (after error check)
- Uses FIFO pruning (drops oldest hash first)
- Stores the updated value back to the cache

Note: When `!found`, `eoaActivity` is the zero-value of `eoaActivityMetadata` (empty `txHashes` slice, zero `lastSubmission`), which is correctly populated before `Add`.
217-220: Map swap pattern is thread-safe

The pattern of assigning `txsGroupedByAddress := t.pooledTxs` and then creating a new map `t.pooledTxs = make(...)` under lock effectively transfers ownership of the old map (and its slice values) to the processing goroutine. This avoids the need for deep copying since `Add()` will only write to the new map.

Based on learnings about deep copying maps with slices, this swap approach is a valid alternative that achieves the same safety guarantees.
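The swap-under-lock pattern can be sketched as below. The `pool` type, its fields, and `drain` are hypothetical stand-ins for the gateway's `BatchTxPool` internals; the point is that the processor takes ownership of the old map while `Add` only ever writes into the freshly allocated one, so no deep copy is needed.

```go
package main

import (
	"fmt"
	"sync"
)

type pooledTx struct{ nonce uint64 }

// pool demonstrates the swap-under-lock idiom described above.
type pool struct {
	mu  sync.Mutex
	txs map[string][]pooledTx
}

// Add appends a tx for an address under the lock.
func (p *pool) Add(addr string, tx pooledTx) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.txs[addr] = append(p.txs[addr], tx)
}

// drain swaps in an empty map and returns the old one; the caller now
// owns the returned map exclusively, with no shared slice references.
func (p *pool) drain() map[string][]pooledTx {
	p.mu.Lock()
	defer p.mu.Unlock()
	old := p.txs
	p.txs = make(map[string][]pooledTx)
	return old
}

func main() {
	p := &pool{txs: make(map[string][]pooledTx)}
	p.Add("0xabc", pooledTx{nonce: 1})
	batch := p.drain()
	p.Add("0xabc", pooledTx{nonce: 2}) // lands in the new map only
	fmt.Println(len(batch["0xabc"]), len(p.txs["0xabc"])) // 1 1
}
```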
Actionable comments posted: 0
🧹 Nitpick comments (3)
services/requester/batch_tx_pool.go (3)
26-29: Consider documenting / de‑magic‑numbering cache sizing constants

`eoaActivityCacheSize = 10_000` and `maxTrackedTxHashesPerEOA = 15` look reasonable, but they’re opaque to a reader. A brief comment tying these to expected traffic patterns and the 10s window (or making them configurable) would make future tuning safer.
41-52: Top‑level comment could mention duplicate‑tx rejection

The BatchTxPool comment still describes batching/ordering behavior well, but it doesn’t mention the new “recent hash tracking + duplicate rejection” behavior. Adding a short sentence about rejecting recently submitted tx hashes would help future readers understand why `eoaActivityCache` exists.
129-142: Duplicate tx hash rejection logic is sound; consider observability

The `slices.Contains(eoaActivity.txHashes, txHash)` check correctly rejects re‑submitted txs for the same EOA and wraps them as `errs.ErrInvalid`. One thing you might want is a metric/log counter (e.g., a dedicated “duplicate txs rejected” metric, or treating them as dropped) so operators can see how noisy certain EOAs/bots are; right now they’re silent from a metrics perspective.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `services/requester/batch_tx_pool.go` (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
services/requester/batch_tx_pool.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test
🔇 Additional comments (7)
services/requester/batch_tx_pool.go (7)
6-6: fmt import usage looks correct
`fmt` is used only for the wrapped invalid‑tx error; no issues here.
36-39: Per‑EOA activity metadata struct is well‑factored

Bundling `lastSubmission` and `txHashes` into `eoaActivityMetadata` keeps the `Add()` logic clear, and the zero value is safe on cache misses. No changes needed.
56-59: BatchTxPool fields align with intended behavior
`pooledTxs` + `txMux` for batching and `eoaActivityCache` for per‑EOA activity/duplicate tracking is a clean separation of concerns. No structural or concurrency issues spotted here.
61-61: Nice use of interface assertion idiom
`var _ TxPool = (*BatchTxPool)(nil)` is the preferred compile‑time check; good improvement.
88-98: Verify EOAActivityCache TTL is compatible with batching interval
`eoaActivityCache` uses `config.EOAActivityCacheTTL` while batching decisions use `config.TxBatchInterval`. To avoid surprising behavior (e.g., metadata expiring while we still expect batching / duplicate protection), it would be good to ensure `EOAActivityCacheTTL >= TxBatchInterval` and document that contract in config or a comment.
161-188: Add() batching vs single‑submission control flow looks correct

The three scenarios based on `found` and `time.Since(eoaActivity.lastSubmission)` are consistent, and the `hasBatch` pattern correctly appends to any existing batch while falling back to a single submission when none exists. `err` handling is safe in all branches (only set when calling `submitSingleTransaction`, and checked once afterward).
193-203: Metadata update and hash list trimming behave as intended

Updating `eoaActivity` only after a successful add/submit and then appending `txHash` (while trimming to the last `maxTrackedTxHashesPerEOA`) matches the "track last N recent hashes" requirement and keeps memory bounded. The logic to drop the oldest entry via `txHashes[1:]` is simple and correct given the small fixed N.
Actionable comments posted: 0
🧹 Nitpick comments (1)
services/requester/batch_tx_pool.go (1)
162-178: Minor TOCTOU window in batch checking.

There's a small window between releasing the lock (line 172) and calling `submitSingleTransaction` (line 177) where another goroutine could add transactions to `pooledTxs[from]`. However, the impact is minimal—worst case is the transaction gets submitted individually when it could have been batched, which doesn't affect correctness.

If you want to tighten this, you could hold the lock during `submitSingleTransaction` for Case 2, or restructure to check-and-submit atomically. Given the low impact, this is optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- services/requester/batch_tx_pool.go (5 hunks)
- tests/tx_batching_test.go (6 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
- services/requester/batch_tx_pool.go
- tests/tx_batching_test.go
📚 Learning: 2025-11-12T13:21:59.080Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.080Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (4)
services/requester/single_tx_pool.go (1)
- SingleTxPool (28-40)
services/requester/tx_pool.go (1)
- TxPool (16-18)
api/pool.go (1)
- TxPool (8-8)
models/errors/errors.go (1)
- ErrInvalid (23-23)
🔇 Additional comments (12)
services/requester/batch_tx_pool.go (8)
26-39: LGTM!

The constants and `eoaActivityMetadata` type are well-structured for tracking per-EOA activity and recent transaction nonces.
53-61: LGTM!

The struct fields and interface assertion are appropriate. Using `(*BatchTxPool)(nil)` for interface assertion is idiomatic Go.
88-98: LGTM!

The cache initialization properly uses `eoaActivityMetadata` and the configured TTL.
129-140: LGTM!

The duplicate nonce detection is straightforward and correct. Using `slices.Contains` is appropriate given the small max size of 15.
191-200: Verify metadata update happens only on success.

The metadata update correctly happens after confirming `err == nil` (line 187). The nonce trimming at lines 196-198 keeps only the most recent 15 nonces by removing from the front—this is intentional behavior to bound memory while still catching recent duplicates.
217-220: LGTM!

The map swap pattern is thread-safe here: by replacing `t.pooledTxs` with a new empty map under the lock, subsequent `Add()` calls operate on the new map while the batch processor iterates over the old one. Based on learnings, this avoids the deep-copy concern because no concurrent modifications occur on the same slices.
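The swap pattern can be reduced to a small sketch (the type and method names here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

type txPool struct {
	mu  sync.Mutex
	txs map[string][]uint64 // address -> pooled nonces
}

// drain atomically replaces the pooled map with a fresh one and returns
// the old map; the caller then owns the old map exclusively and can
// iterate it without holding the lock.
func (p *txPool) drain() map[string][]uint64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	old := p.txs
	p.txs = make(map[string][]uint64)
	return old
}

func main() {
	p := &txPool{txs: map[string][]uint64{"0xabc": {1, 2, 3}}}
	batch := p.drain()
	fmt.Println(len(batch["0xabc"]), len(p.txs))
}
```

Because the swap happens under the lock and the old map is never touched by writers afterward, no deep copy of the slices is needed.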
245-249: LGTM!

Sorting by nonce before batch submission ensures correct execution order regardless of arrival order or any prior reordering.
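A sketch of the nonce-ordering step; `pooledEvmTx` here is simplified to just a nonce field, so this is illustrative rather than the gateway's actual type:

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

type pooledEvmTx struct {
	nonce uint64
}

// sortByNonce orders a batch so transactions execute in nonce order,
// regardless of the order in which they arrived.
func sortByNonce(txs []pooledEvmTx) {
	slices.SortFunc(txs, func(a, b pooledEvmTx) int {
		return cmp.Compare(a.nonce, b.nonce)
	})
}

func main() {
	txs := []pooledEvmTx{{nonce: 3}, {nonce: 1}, {nonce: 2}}
	sortByNonce(txs)
	fmt.Println(txs[0].nonce, txs[1].nonce, txs[2].nonce) // 1 2 3
}
```

`cmp.Compare` avoids the overflow risk of subtracting unsigned nonces directly.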
283-312: LGTM!

Clean implementation for single transaction submission with appropriate error handling and metrics recording.
tests/tx_batching_test.go (4)
335-335: LGTM!

Comment correctly updated to reflect the new 2.5-second batch interval.
517-568: LGTM!

The test correctly validates duplicate nonce rejection:
- Submits nonces `[0, 1, 2, 3, 2, 3, 4, 5]` where `2` and `3` are intentional duplicates
- Verifies exactly 2 errors with the expected messages
- Confirms the 6 successful transactions are executed
This effectively covers the new duplicate submission protection.
607-622: Configuration changes align with production settings.
- `LocalIndexValidation` is appropriate for integration tests
- The 2.5-second batch interval matches mainnet
- The 2-second sleep allows indexing to catch up

One minor concern: the fixed 2-second sleep could cause flakiness if indexing occasionally takes longer. Consider using `Eventually` or a readiness check if tests become flaky.
449-449: LGTM!

Comment correctly updated to match the 2.5-second batch interval.
services/requester/batch_tx_pool.go
Outdated
// Reject transactions that have already been submitted,
// as they are *likely* to fail.
if found && slices.Contains(eoaActivity.txNonces, nonce) {
	return fmt.Errorf(
should we return an error here, or just silently drop the tx? What happens on a geth node?
IIRC, bots will commonly resubmit tx at a specific nonce to increase the gas price in an attempt to get it to be executed earlier in the block, or decrease the gas price to effectively cancel the tx. I'm not totally sure, but I suspect these bots would expect submitting the tx to succeed (even if it was never executed)
You're right, matching the functionality from Geth node, a silent drop seems more reasonable here. If we return an error, that might make bots try harder with more requests 😅
Updated in 85a2560. Also added an info log line, so that we can track all such cases.
services/requester/batch_tx_pool.go
Outdated
eoaActivity.lastSubmission = time.Now()
eoaActivity.txNonces = append(eoaActivity.txNonces, nonce)
do we need to protect these writes? Is it possible another request is concurrently updating/reading them?
Hmm, good point 👍 Added some lock-protection in 0b94bc1.
services/requester/batch_tx_pool.go
Outdated
const eoaActivityCacheSize = 10_000
const (
	eoaActivityCacheSize     = 10_000
	maxTrackedTxNoncesPerEOA = 15
is 15 enough? I think the data we'd need to cache is small enough we could use a larger value here without tying up too much memory.
Good point, I've updated this value to 30 in 70b0bba .
Co-authored-by: Peter Argue <[email protected]>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
services/requester/batch_tx_pool.go (1)
129-201: Race condition in read-modify-write of `eoaActivityMetadata`.

The metadata is read at line 129, modified at lines 192-193, and written back at line 201. If two goroutines process transactions from the same EOA concurrently, one's nonce update can be lost:
- Goroutine A: reads `txNonces = [1, 2]`
- Goroutine B: reads `txNonces = [1, 2]`
- Goroutine A: appends 3 → `[1, 2, 3]`, writes to cache
- Goroutine B: appends 4 → `[1, 2, 4]`, writes to cache (nonce 3 lost)

This also allows duplicates to slip through the check at line 134.
Consider protecting the entire read-modify-write sequence. One approach:
+var eoaActivityMux sync.Mutex // or use a per-EOA lock map for better concurrency

 func (t *BatchTxPool) Add(
 	ctx context.Context,
 	tx *gethTypes.Transaction,
 ) error {
 	// ... early code ...
+	eoaActivityMux.Lock()
+	defer eoaActivityMux.Unlock()

 	eoaActivity, found := t.eoaActivityCache.Get(from)
 	nonce := tx.Nonce()
 	// ... rest of the method ...

Alternatively, use a `sync.Map` keyed by EOA for per-address locking to maintain concurrency across different EOAs.
📒 Files selected for processing (1)
- services/requester/batch_tx_pool.go (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (4)
services/requester/single_tx_pool.go (1)
- SingleTxPool (28-40)
services/requester/tx_pool.go (1)
- TxPool (16-18)
api/pool.go (1)
- TxPool (8-8)
models/errors/errors.go (1)
- ErrInvalid (23-23)
🔇 Additional comments (7)
services/requester/batch_tx_pool.go (7)
26-29: Constants look reasonable.

The cache size of 10,000 EOAs and 15 tracked nonces per EOA provide a good balance between memory usage and duplicate detection coverage for typical trading bot patterns.
36-39: Clean metadata struct design.

The `eoaActivityMetadata` struct effectively encapsulates both timing and nonce history tracking for duplicate detection.
88-98: Initialization looks correct.

The `eoaActivityCache` is properly initialized with configurable TTL, and the pool map is correctly allocated.
132-140: Verify intended behavior for replacement transactions.

Standard Ethereum behavior allows replacing a pending transaction by resubmitting the same nonce with a higher gas price. The current implementation rejects any transaction with a previously-seen nonce, which blocks legitimate replacement transactions.
If this is intentional to reduce bot noise, consider documenting this deviation from standard behavior. Otherwise, you could compare gas prices before rejecting.
159-185: Branching logic is sound.

The three cases are handled correctly, and the mutex protects `pooledTxs` modifications appropriately. The case 2 handling (checking for existing batch before deciding to submit individually) prevents transaction ordering issues.
218-221: Safe map swap pattern.

The swap-and-replace pattern correctly avoids the need for deep copying - after line 220, the old slices in `txsGroupedByAddress` are exclusively owned by this goroutine while `Add()` works with the fresh empty map. Based on learnings about map copy safety, this approach is correct.
246-250: Nonce-based sorting ensures correct execution order.

Sorting pooled transactions by nonce before batch submission is essential for proper EVM execution semantics.
Actionable comments posted: 1
♻️ Duplicate comments (1)
services/requester/batch_tx_pool.go (1)
190-192: Nonce marked as submitted before batch actually executes.

For batched transactions, the nonce is recorded at line 192, but the actual Flow transaction is sent later in `processPooledTransactions` (line 223). If batch submission fails (e.g., at line 228), the nonce remains tracked, causing legitimate retry attempts to be rejected until the cache entry expires (15s).

This was raised in past reviews. Consider updating the error message to be more accurate:

 if found && slices.Contains(eoaActivity.txNonces, nonce) {
 	t.logger.Info().
 		Str("evm_tx", tx.Hash().Hex()).
 		Str("from", from.Hex()).
 		Uint64("nonce", nonce).
-		Msg("tx with same nonce has been already submitted")
+		Msg("tx with same nonce was recently submitted or is pending batch execution")
 	return nil
 }

Based on learnings, past review feedback on this concern.
🧹 Nitpick comments (1)
services/requester/batch_tx_pool.go (1)
34-37: Consider documenting the concurrency model.

The `eoaActivityMetadata` struct is read and written without explicit locking. While the LRU cache itself is thread-safe, the pattern of Get-modify-Add creates a race window where concurrent requests for the same EOA could overwrite each other's nonce updates.

Add a comment clarifying the concurrency expectations:

+// eoaActivityMetadata tracks recent transaction activity for an EOA.
+// Note: Updates are not atomic across Get/Add operations. Concurrent
+// submissions from the same EOA may result in lost nonce tracking, but
+// this is acceptable as it only affects the duplicate detection heuristic.
 type eoaActivityMetadata struct {
 	lastSubmission time.Time
 	txNonces       []uint64
 }
📒 Files selected for processing (5)
- cmd/run/cmd.go (1 hunks)
- services/requester/batch_tx_pool.go (4 hunks)
- services/requester/requester.go (1 hunks)
- services/requester/single_tx_pool.go (3 hunks)
- tests/tx_batching_test.go (6 hunks)
✅ Files skipped from review due to trivial changes (1)
- services/requester/requester.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
services/requester/single_tx_pool.gotests/tx_batching_test.go
📚 Learning: 2025-11-12T13:21:59.080Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.080Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
🧬 Code graph analysis (2)
services/requester/single_tx_pool.go (2)
services/requester/tx_pool.go (1)
- TxPool (16-18)
api/pool.go (1)
- TxPool (8-8)
tests/tx_batching_test.go (1)
config/config.go (2)
- TxStateValidation (37-37)
- LocalIndexValidation (40-40)
🔇 Additional comments (13)
cmd/run/cmd.go (2)
293-293: TTL increase extends the duplicate detection window.

The default increased from 10s to 15s, meaning transactions from the same EOA with the same nonce will be rejected as duplicates for a longer period. This aligns with the batching interval changes and helps reduce noise from trading bots, but ensure this duration is appropriate for your use case.
296-299: LGTM! Defensive error handling added.

The panic ensures any misconfiguration in flag deprecation is caught early during initialization.
tests/tx_batching_test.go (4)
335-335: LGTM! Comments updated to match configuration.

The interval references now correctly reflect the 2.5-second `TxBatchInterval` used in tests.

Also applies to: 449-449
517-517: LGTM! Clearer test name.

The new name better conveys that the test verifies rejection of previously submitted transactions.
592-592: LGTM! Validation mode appropriate for batch testing.

Using `LocalIndexValidation` is correct for batch mode, as it validates against the local state index without waiting for transaction sealing.
594-594: Test timing setup looks reasonable.

The 2.5-second batch interval matches mainnet, and the 2-second sleep allows the local index to catch up. If this becomes flaky on slower systems, consider using `assert.Eventually` instead of a fixed sleep.

Also applies to: 605-607
services/requester/batch_tx_pool.go (4)
24-27: LGTM! Reasonable cache sizing.

The cache can track 10,000 EOAs with up to 30 recent nonces each, providing good coverage without excessive memory usage.
161-184: LGTM! Batching logic handles pending batches correctly.

The check at line 166 ensures that if an EOA has pending transactions in the batch pool (due to congestion), subsequent transactions are added to that batch rather than submitted individually, maintaining proper ordering.
190-202: EOA metadata update happens after successful submission.

The metadata is only updated after successful add/submit (line 186 error check), which is correct. The pruning logic properly maintains the most recent `maxTrackedTxNoncesPerEOA` entries.

However, as noted in past reviews, concurrent updates to the same EOA's metadata are not atomic. The Get-modify-Add pattern (lines 127, 191-192, 200) can lead to lost updates if two requests for the same EOA interleave. This is acceptable for a best-effort duplicate detection heuristic but should be documented.
127-139: Duplicate transaction rejection is intentional—clarify whether this aligns with your API expectations.

The code silently drops transactions with duplicate nonces. The test `Test_TransactionSubmissionWithPreviouslySubmittedTransactions` explicitly validates this by sending nonces `[0, 1, 2, 3, 2, 3, 4, 5]` and expecting only 6 unique transactions to succeed.

This behavior differs from standard Ethereum nodes, which allow replacement transactions (same nonce, higher gas price). However, the code comment ("as they are likely to fail") indicates this is intentional design for the Flow EVM gateway—not a bug. The implementation provides no mechanism for gas price comparison, as `pooledEvmTx` stores only the nonce, not gas price.

Before treating this as a bug, confirm:
- Is transaction replacement a required feature for your use case?
- If yes, this would require structural changes (storing gas price, comparing replacements).
- If no, add clarification to the code comment that replacement transactions are intentionally not supported.
services/requester/single_tx_pool.go (3)
42-42: LGTM! Cleaner interface assertion style.

Using `(*SingleTxPool)(nil)` is preferred as it avoids allocation compared to `&SingleTxPool{}`.
152-153: LGTM! Log field naming consistency improved.

The field names now use underscores and clearer suffixes (`flow_tx`, `evm_tx`), consistent with logging elsewhere in the codebase.
190-194: LGTM! Helpful concurrency comment and code simplification.

The comment clarifies that `keystore.Take()` uses channel-based access, eliminating the need for additional locking. The direct call simplifies the code by removing the wrapper function.

Based on learnings, the keystore pattern intentionally holds keys until transaction sealing.
bb0e2a5 to 0b94bc1
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/tx_batching_test.go (1)
517-553: Test relies on implicit validation via balance check.

The test correctly expects no errors from `sendRawTx` since duplicate nonce transactions are now silently dropped (per the updated `BatchTxPool.Add` behavior). The `expectedBalance = 6 * transferAmount` assertion implicitly validates that only 6 unique nonce transactions were executed.

This approach is valid given the silent-drop behavior. For additional observability in test failures, consider adding a log or comment clarifying that indices 4 and 5 (nonces 2 and 3) are expected to be silently dropped.
services/requester/batch_tx_pool.go (1)
127-139: Potential race in duplicate detection.

The duplicate check at line 127 occurs without holding `txMux`, while the metadata update at lines 198-208 happens under the lock. Two concurrent `Add()` calls for the same EOA and nonce could both pass the duplicate check before either updates the cache, resulting in duplicate submissions.

Given this is a best-effort optimization to reduce (not guarantee elimination of) duplicate transactions, the race window is small, and adding a lock around the entire `Add()` method could impact throughput, this may be acceptable. However, if stricter deduplication is desired:

 func (t *BatchTxPool) Add(
 	ctx context.Context,
 	tx *gethTypes.Transaction,
 ) error {
 	t.txPublisher.Publish(tx) // publish pending transaction event

+	t.txMux.Lock()
+	defer t.txMux.Unlock()

 	from, err := models.DeriveTxSender(tx)

This would serialize all `Add()` calls but guarantee no duplicate submissions slip through.

Also applies to: 194-210
📒 Files selected for processing (5)
- api/utils.go (0 hunks)
- services/requester/batch_tx_pool.go (7 hunks)
- services/requester/requester.go (1 hunks)
- services/requester/single_tx_pool.go (5 hunks)
- tests/tx_batching_test.go (6 hunks)
💤 Files with no reviewable changes (1)
- api/utils.go
🚧 Files skipped from review as they are similar to previous changes (2)
- services/requester/requester.go
- services/requester/single_tx_pool.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-12T13:21:59.080Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.080Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
tests/tx_batching_test.go
📚 Learning: 2025-03-07T01:35:09.751Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 772
File: services/requester/keystore/key_store.go:50-62
Timestamp: 2025-03-07T01:35:09.751Z
Learning: In the flow-evm-gateway codebase, panics are acceptable in scenarios where immediate detection of critical bugs is desired during development and testing, particularly for invariant violations that should never occur in a correctly functioning system (e.g., when a key is available but locked in the keystore implementation).
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (2)
tests/tx_batching_test.go (1)
config/config.go (2)
- TxStateValidation (37-37)
- LocalIndexValidation (40-40)
services/requester/batch_tx_pool.go (3)
services/requester/single_tx_pool.go (1)
- SingleTxPool (28-40)
services/requester/tx_pool.go (1)
- TxPool (16-18)
api/pool.go (1)
- TxPool (8-8)
🔇 Additional comments (10)
tests/tx_batching_test.go (3)
335-335: LGTM!

Comment correctly updated to reflect the 2.5-second
`TxBatchInterval` configuration used in E2E tests.
449-449: LGTM!

Consistent documentation update for the
`TxBatchInterval` value.
592-594: LGTM!

Configuration updates align with the PR objectives:
- `LocalIndexValidation` enables local index-based transaction validation.
- `TxBatchInterval` of 2.5 seconds matches mainnet behavior as noted in the comment.

services/requester/batch_tx_pool.go (7)
24-37: LGTM!

Well-structured constants and metadata type:
- `eoaActivityCacheSize` of 10,000 provides reasonable capacity.
- `maxTrackedTxNoncesPerEOA` of 30 (updated from 15 per reviewer feedback) balances memory usage with the duplicate detection window.
- `eoaActivityMetadata` cleanly encapsulates the tracking data.
51-100: LGTM!

The
`BatchTxPool` struct and constructor are well-organized:
- Clean embedding of
SingleTxPoolfor shared functionality.- Proper initialization of the expirable LRU cache with configurable TTL.
- Background goroutine for batch processing is correctly started.
127-139: LGTM!

Duplicate nonce detection with silent drop:
- Matches geth behavior for transaction replacement scenarios (per PR discussion).
- Info-level logging provides observability for tracking such cases.
- Linear search on 30 nonces is negligible overhead.
158-184: LGTM!

The three-case handling logic is well-structured:
- Case 1 & 2 (no batch): Submit individually for immediate execution.
- Case 2 (with pending batch): Append to existing batch to maintain ordering.
- Case 3 (recent activity): Pool for batching.
The mutex correctly protects
`pooledTxs` access in each case.
237-241: LGTM!

Error logging correctly identifies the EOA address for debugging batch submission failures. The
`continue` behavior appropriately processes remaining addresses despite individual failures.
284-289: LGTM!

Consistent error handling with appropriate metric recording for dropped transactions on send failure.
318-323: LGTM!

Consistent error handling pattern with the batch submission path. Single transaction drop is correctly recorded on send failure.
Actionable comments posted: 1
🧹 Nitpick comments (3)
services/requester/batch_tx_pool.go (3)
127-139: Nonce-only deduplication changes replacement semantics; confirm this is acceptable

The dedup check drops any subsequent tx from an EOA that reuses a tracked nonce, regardless of payload (gas price, data, etc.). This is great for suppressing noisy "nonce too low" bot spam, but it also disables geth-style replacement / gas-bump semantics at the gateway level.
If that trade-off is intentional, this is fine; if you want something closer to geth behavior later, you may need to:
- Include tx hash (or at least a hash-of-core-fields) in the metadata, and/or
- Allow higher-fee replacements while still blocking exact duplicates.
For now, I’d suggest documenting this behavior clearly in operator/user-facing docs so the deviation from geth expectations is explicit.
186-193: Consider including structured fields for failed single submissions

The error log only embeds the EOA in the message. For debugging noisy or failing clients it may help to add structured fields (e.g. `evm_tx`, `from`, `nonce`) instead of relying on formatted strings, to match the structured logging style you use elsewhere.
238-245: `TransactionsDropped` metrics semantics with retries

In the batched path, `batchSubmitTransactionsForSameAddress` increments `TransactionsDropped(len(hexEncodedTxs))` on both build and send errors, but `processPooledTransactions` then requeues the same `pooledTxs` slice for retry. That means txs which eventually succeed will still be counted as "dropped", potentially inflating this metric.

In the single-submit path, by contrast, failures are final drops, so the metric is accurate there.
If `TransactionsDropped` is meant to represent definitive loss, consider:
- Only incrementing it when you decide not to requeue, or
- Introducing a separate metric (e.g. "send failures" or "batch retries") for transient errors while leaving `TransactionsDropped` for permanent ones.
Also applies to: 288-292, 322-326
📒 Files selected for processing (1)
- services/requester/batch_tx_pool.go (7 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-03-07T01:35:09.751Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 772
File: services/requester/keystore/key_store.go:50-62
Timestamp: 2025-03-07T01:35:09.751Z
Learning: In the flow-evm-gateway codebase, panics are acceptable in scenarios where immediate detection of critical bugs is desired during development and testing, particularly for invariant violations that should never occur in a correctly functioning system (e.g., when a key is available but locked in the keystore implementation).
Applied to files:
services/requester/batch_tx_pool.go
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (3)
services/requester/single_tx_pool.go (1)
- `SingleTxPool` (28-40)

services/requester/tx_pool.go (1)
- `TxPool` (16-18)

api/pool.go (1)
- `TxPool` (8-8)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test
🔇 Additional comments (3)
services/requester/batch_tx_pool.go (3)
24-27: Per-EOA activity cache and metadata struct look sound

Bounded cache size and per-EOA nonce history give a predictable memory footprint while supporting recent-activity checks; the `eoaActivityMetadata` shape aligns with how you use it in `Add`. No changes requested here.

Also applies to: 34-37
161-177: Batch vs single submission decision flow reads correctly

The split between:
- "no activity" → single submit,
- "activity older than `TxBatchInterval`" → reuse existing batch if present, otherwise single submit,
- "activity within `TxBatchInterval`" → pool for batching,

matches the comments and seems like a good balance between latency and batching efficiency, especially with the extra check to attach to an in-flight batch when it still exists.

Also applies to: 180-184
54-57: Batch pool struct wiring and cache construction look correct

Embedding `SingleTxPool`, wiring `pooledTxs` with `txMux`, and constructing `eoaActivityCache` via `expirable.NewLRU` all line up with how the type is later used. The `var _ TxPool = (*BatchTxPool)(nil)` assertion is also the idiomatic pointer-style check.

Also applies to: 59-59, 86-96
services/requester/batch_tx_pool.go
Outdated
```go
t.txMux.Lock()
defer t.txMux.Unlock()
```
the eoaActivity object is still read outside of the lock in Add(). I think we need to add the mutex to eoaActivityMetadata and acquire it each time we need to interact with the object.
I tried this approach, but got a rather serious warning:

```
call of t.eoaActivityCache.Add copies lock value:
github.com/onflow/flow-evm-gateway/services/requester.eoaActivityMetadata contains sync.Mutex
```

In 41c7839, I changed the lock acquisition to what we had before.
This ensures that only one request will:
- read the EOA activity from the cache
- submit the incoming transaction
- update the EOA activity with the submitted transaction nonce
Actionable comments posted: 0
🧹 Nitpick comments (1)
services/requester/batch_tx_pool.go (1)
127-142: Verify thread safety and consider the tracking timing.

The `txMux` lock protects this critical section, but there's a subtle concern: `eoaActivityCache.Get(from)` returns the struct by value, meaning the `txNonces` slice header in the returned copy points to the same backing array as other copies. Later, at line 324 in `updateEOAActivityMetadata`, you append to this slice, potentially modifying or reallocating that shared backing array.

While the lock held throughout `Add()` prevents concurrent modification for the same EOA, a defensive approach is to treat `txNonces` as immutable and always copy before mutating. This aligns with the pattern for deep-copying maps with slices to avoid shared references.

Additionally, note that for batched transactions, the nonce is marked as "submitted" at line 193 before the batch is actually sent in `processPooledTransactions`. If batch submission fails (lines 226-230), the nonce remains tracked until TTL expiration, causing retries to be rejected. The re-queueing logic at lines 229-230 mitigates this, but retries from the client would still be dropped during the TTL window.

Based on learnings, deep copying slices in map values prevents race conditions.

Consider applying the defensive slice copy pattern in `updateEOAActivityMetadata`:

```diff
 func (t *BatchTxPool) updateEOAActivityMetadata(
 	from gethCommon.Address,
 	nonce uint64,
 ) {
 	// Update metadata for the last EOA activity only on successful add/submit.
 	eoaActivity, _ := t.eoaActivityCache.Get(from)
 	eoaActivity.lastSubmission = time.Now()
-	eoaActivity.txNonces = append(eoaActivity.txNonces, nonce)
+
+	// Copy existing slice before appending to avoid shared backing arrays.
+	newTxNonces := make([]uint64, len(eoaActivity.txNonces), len(eoaActivity.txNonces)+1)
+	copy(newTxNonces, eoaActivity.txNonces)
+	newTxNonces = append(newTxNonces, nonce)
+
 	// To avoid the slice of nonces from growing indefinitely,
 	// keep only the last `maxTrackedTxNoncesPerEOA` nonces.
-	if len(eoaActivity.txNonces) > maxTrackedTxNoncesPerEOA {
-		firstKeep := len(eoaActivity.txNonces) - maxTrackedTxNoncesPerEOA
-		eoaActivity.txNonces = eoaActivity.txNonces[firstKeep:]
+	if len(newTxNonces) > maxTrackedTxNoncesPerEOA {
+		firstKeep := len(newTxNonces) - maxTrackedTxNoncesPerEOA
+		newTxNonces = newTxNonces[firstKeep:]
 	}
+	eoaActivity.txNonces = newTxNonces

 	t.eoaActivityCache.Add(from, eoaActivity)
 }
```
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
🔇 Additional comments (10)
services/requester/batch_tx_pool.go (10)
24-27: LGTM! The cache size and nonce tracking limits are well-balanced for production use.

34-37: Struct design is clean. The metadata struct appropriately tracks per-EOA submission state. Since this is stored in a cache that returns by value, slice mutations must be handled carefully to avoid shared backing arrays.

51-57: LGTM! The updated struct fields and type changes properly support the new nonce-based duplicate detection with per-EOA metadata tracking.

59-59: LGTM! The pointer-type assertion is correct for the interface implementation.

86-96: LGTM! The cache initialization with TTL-based expiration provides proper automatic cleanup of stale EOA metadata.

164-178: LGTM! The logic correctly handles the edge case where pooled transactions haven't been processed yet due to congestion, ensuring the current transaction joins that batch to maintain ordering guarantees.

185-195: Error handling structure is correct. The metadata update at line 193 correctly happens only after successful single transaction submission (error check at line 185). For batched transactions, the metadata is updated immediately upon adding to the batch, which is before actual submission; this timing is acceptable given the re-queueing logic for failed batches.

222-230: Re-queueing logic is sound. The error handling correctly re-queues failed batches for retry while protecting the `pooledTxs` map with `txMux`. Note that the nonces from these transactions are already tracked in `eoaActivityCache` (line 193), so duplicate submissions from clients during the TTL window will be rejected even while the batch is being retried.

273-276: LGTM! The updated comments clearly document that dropped transaction metrics are recorded for both batch and single submission failures.

Also applies to: 308-310

317-333: Function logic is correct, but see earlier comment about defensive copying. The metadata update logic correctly maintains the nonce history and enforces the size limit. The slice mutation concern was addressed in the earlier comment about lines 127-142.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
services/requester/batch_tx_pool.go (1)
283-337: Async pattern doesn't reduce lock hold time.

The comment at line 290 states the lock shouldn't be held long, but the goroutine + select pattern at lines 296-334 still blocks the caller (holding `txMux`) for up to 3 seconds. This doesn't reduce lock contention; it just adds complexity. The async pattern would only help if the lock were released before waiting.

Consider either:
- Releasing `txMux` before calling `submitSingleTransaction` (requires careful ordering of metadata updates)
- Making this truly non-blocking by removing the wait (but then error handling becomes more complex)

The current implementation works but defeats the purpose of the async pattern stated in the comments.
🧹 Nitpick comments (1)
services/requester/batch_tx_pool.go (1)
339-355: Consider defensive copying of the `txNonces` slice to prevent potential races.

While the current `txMux` locking makes this safe, the in-place mutation at lines 346 and 351 shares the underlying array across cache copies. Based on learnings about slice handling in concurrent map access, consider using immutable slice updates:

```go
// Copy existing slice before appending to avoid shared references
newTxNonces := make([]uint64, len(eoaActivity.txNonces), len(eoaActivity.txNonces)+1)
copy(newTxNonces, eoaActivity.txNonces)
newTxNonces = append(newTxNonces, nonce)

// Prune if needed
if len(newTxNonces) > maxTrackedTxNoncesPerEOA {
	firstKeep := len(newTxNonces) - maxTrackedTxNoncesPerEOA
	prunedNonces := make([]uint64, len(newTxNonces)-firstKeep)
	copy(prunedNonces, newTxNonces[firstKeep:])
	newTxNonces = prunedNonces
}
eoaActivity.txNonces = newTxNonces
```

This prevents future refactoring from inadvertently introducing races if the locking strategy changes.

Based on learnings, deep copying slices in concurrent map access is a best practice.
🔇 Additional comments (7)
services/requester/batch_tx_pool.go (7)
24-27: LGTM! The cache size and nonce tracking limits are well-chosen for handling high-frequency bot traffic while maintaining reasonable memory usage.

34-37: LGTM! The metadata structure appropriately captures both temporal and nonce-based submission history for duplicate detection.

51-57: LGTM! The struct composition appropriately extends `SingleTxPool` with batching capabilities and per-EOA activity tracking.

61-101: LGTM! The initialization properly sets up the batch pool with the LRU cache and background processing goroutine.

130-142: LGTM! Duplicate detection correctly implemented. The nonce-based duplicate detection appropriately rejects transactions that have been recently submitted from the same EOA, reducing redundant work. The silent drop with info logging matches typical node behavior for duplicate submissions.
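A minimal sketch of this nonce-based dedupe, with illustrative names standing in for the real `eoaActivityMetadata` and its TTL'd LRU cache:

```go
package main

import "fmt"

// eoaActivity mirrors the per-EOA metadata discussed above
// (illustrative shape; the real pool stores this in an expirable LRU).
type eoaActivity struct {
	txNonces []uint64
}

// seenNonce reports whether a nonce was recently submitted by this EOA,
// i.e. whether the incoming transaction should be rejected as a duplicate.
func (a *eoaActivity) seenNonce(nonce uint64) bool {
	for _, n := range a.txNonces {
		if n == nonce {
			return true
		}
	}
	return false
}

// track records a nonce, keeping only the last maxTracked entries so the
// history cannot grow without bound.
func (a *eoaActivity) track(nonce uint64, maxTracked int) {
	a.txNonces = append(a.txNonces, nonce)
	if len(a.txNonces) > maxTracked {
		a.txNonces = a.txNonces[len(a.txNonces)-maxTracked:]
	}
}

func main() {
	a := &eoaActivity{}
	a.track(7, 3)
	fmt.Println(a.seenNonce(7), a.seenNonce(8))
}
```

Because pruning drops the oldest nonces first, an EOA that bursts more than `maxTracked` transactions can resubmit its earliest nonce without being flagged, which matches the bounded-memory trade-off accepted in the review.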
161-183: LGTM! Batching logic correctly handles multiple submission scenarios. The three-case logic appropriately decides between individual submission and batching based on EOA activity timing. The addition at lines 165-172 correctly handles the edge case where an EOA has pending batched transactions even though the batch interval has elapsed.

221-231: Verify retry semantics don't cause persistent duplicate rejections. The retry mechanism re-queues failed batches, but nonces were already marked "submitted" at line 193. If batch submission repeatedly fails, new incoming transactions with the same nonce will be rejected until cache entries expire (per TTL). Consider logging the re-queue event to track retry frequency.
Based on past review discussions, this trade-off may be acceptable given the TTL-based expiration, but verify that retry failures are monitored.
```go
t.txMux.Lock()
defer t.txMux.Unlock()
```
do we need a global lock? can we just put the mutex in eoaActivityMetadata and lock per entry? that would significantly reduce contention in the API.
can we just put the mutex in eoaActivityMetadata and lock per entry?
Right now, the cache is of type `*expirable.LRU[gethCommon.Address, eoaActivityMetadata]`, so it complains when I add the mutex to `eoaActivityMetadata`, with:

```
call of t.eoaActivityCache.Add copies lock value: requester.eoaActivityMetadata contains sync.Mutex
```

That can be solved by changing the cache value type to `*eoaActivityMetadata` instead of simply `eoaActivityMetadata`.
that would significantly reduce contention in the API.
Looking at the response times of eth_sendRawTransaction on Grafana, it certainly does now. Average response time is about 130ms, which is pretty fast.
do we need a global lock?
It seems to be necessary, in order to tackle the case of concurrent requests from the same EOA, with the same nonce. This is a case that I still see happening on MN & TN.
But I think it's worth investigating your suggestion of having the mutex in eoaActivityMetadata and locking per entry. That would only slow down requests from the same sender, and would not affect normal users, whereas the global lock I have now affects all users regardless.
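A sketch of the per-entry locking idea under discussion, assuming the cache stores pointer values (a plain map stands in here for the expirable LRU, and all names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// eoaEntry carries its own mutex; storing *eoaEntry in the cache avoids
// the "copies lock value" vet error, since only the pointer is copied.
type eoaEntry struct {
	mu       sync.Mutex
	txNonces []uint64
}

// activityCache holds a short global lock only for map access; per-EOA
// work is serialized on the entry's own mutex instead.
type activityCache struct {
	mu      sync.Mutex
	entries map[string]*eoaEntry
}

// entryFor returns the per-EOA entry, creating it under the brief
// global lock; callers then lock only that entry.
func (c *activityCache) entryFor(addr string) *eoaEntry {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[addr]
	if !ok {
		e = &eoaEntry{}
		c.entries[addr] = e
	}
	return e
}

// recordNonce contends only with other requests from the same EOA,
// not with the whole API.
func (c *activityCache) recordNonce(addr string, nonce uint64) {
	e := c.entryFor(addr)
	e.mu.Lock()
	defer e.mu.Unlock()
	e.txNonces = append(e.txNonces, nonce)
}

func main() {
	c := &activityCache{entries: map[string]*eoaEntry{}}
	c.recordNonce("0xabc", 1)
	fmt.Println(len(c.entryFor("0xabc").txNonces))
}
```

Concurrent submissions from the same EOA with the same nonce still serialize on the entry mutex, so the duplicate check and the nonce update stay atomic per sender.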
```go
defer cancel()

// build & submit transaction
go func() {
```
why do we need to run this in a separate gorountine? It looks like it's blocking anyway
Yeah, it's definitely blocking. The problem lies in the fact that it will run until it succeeds, or until it fails with:

```
client: rpc error: code = Canceled desc = context canceled
```

and in some cases this may take any number of seconds, e.g. 10 seconds.
For some reason, it does not respect the `context.WithTimeout(ctx, time.Second*3)`, for any duration value.
It runs for as long as it needs, and simply errors out with:

```
client: rpc error: code = DeadlineExceeded desc = context deadline exceeded
```

I tested this out locally, by adding `time.Sleep(time.Second * 10)` to the `SendTransaction` method of flow-go-sdk.
As a work-around to this, I let it run in a separate goroutine, and if <-ctx.Done() happens first, it means that the configured timeout for the ctx has already passed, and we can abort the execution.
oh, I see. in that case, please add some comments explaining why it's needed. it wasn't obvious to me.
```go
return err
// This method is called while holding the `t.txMux` lock,
// don't let it run for a long time, to avoid lock-contention
ctx, cancel := context.WithTimeout(ctx, time.Second*3)
```
having a timeout is a good idea, and I think we could reduce contention by changing the mutex's scope
Do you think we should use a lower timeout duration here? It makes use of SendTransaction & GetAccountKeyAtLatestBlock AN gRPC endpoints.
I think the AN uses 3s for its request to LNs, so you want it to be at least 3s.
Closes: #922
Description
To avoid the noise from all the invalid transactions from trading bots, we keep track of the last 15 submitted transaction nonces per EOA, over a 10 second interval, with the aim of rejecting incoming transactions that match any of the tracked nonces.
For contributor use:
- Targeted PR against `master` branch
- Re-reviewed `Files changed` in the Github PR explorer

Summary by CodeRabbit
Bug Fixes
Refactor
Tests