@avilagaston9 avilagaston9 commented Feb 10, 2026

Motivation

The block building loop (build_payload_loop) continuously rebuilds the block for 12 seconds (one slot), even when no new transactions have arrived. Each rebuild is CPU-intensive (EVM execution, trie operations) and contends with RPC handlers over the shared trie_cache mutex, starving the RPC server.

Description

Add an AtomicU64 generation counter to Mempool that increments on every transaction insertion. The payload build loop polls this counter with a 100ms sleep interval and only rebuilds when it detects a change, instead of looping as fast as possible.

  • Mempool::add_transaction() increments the generation counter after inserting a tx
  • build_payload_loop compares the current generation to the one from the last build; if unchanged, sleeps 100ms and rechecks
  • The first build_payload call remains unconditional — we always build one payload immediately when FCU arrives
  • No new tokio features required — uses only std::sync::atomic and the already-enabled tokio::time
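The polling scheme described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's actual code: the `Mempool` below is a hypothetical stand-in that stores plain `u64` values, the loop is synchronous and uses `std::thread::sleep` where the real loop would use `tokio::time::sleep`, and the `slot`, `poll` and `build` parameters are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for the real `Mempool`.
pub struct Mempool {
    txs: Mutex<Vec<u64>>,  // placeholder for the transaction pool
    generation: AtomicU64, // bumped once per successful insertion
}

impl Mempool {
    pub fn new() -> Self {
        Self { txs: Mutex::new(Vec::new()), generation: AtomicU64::new(0) }
    }

    /// Insert a transaction, then publish the change by bumping the counter.
    pub fn add_transaction(&self, tx: u64) {
        self.txs.lock().unwrap().push(tx);
        // Release pairs with the Acquire load in `generation()`.
        self.generation.fetch_add(1, Ordering::Release);
    }

    pub fn generation(&self) -> u64 {
        self.generation.load(Ordering::Acquire)
    }
}

/// Builds once unconditionally, then rebuilds only when the generation
/// counter has changed since the last build. Returns the number of rebuilds
/// (the initial build is not counted).
pub fn build_payload_loop(
    mempool: &Mempool,
    slot: Duration,
    poll: Duration,
    mut build: impl FnMut(),
) -> u32 {
    let start = Instant::now();
    // Read the counter before the initial build so that transactions
    // arriving during that build are still detected afterwards.
    let mut last_seen = mempool.generation();
    build();
    let mut rebuilds = 0;
    while start.elapsed() < slot {
        let current = mempool.generation();
        if current == last_seen {
            // Nothing new: sleep instead of spinning on EVM and trie work.
            std::thread::sleep(poll);
            continue;
        }
        last_seen = current;
        build();
        rebuilds += 1;
    }
    rebuilds
}
```

Because the loop compares counter values rather than counting events, any number of insertions between two checks results in a single rebuild, and an insertion can never be missed: it is visible in the counter at the next poll.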

Checklist

  • Updated STORE_SCHEMA_VERSION (crates/storage/lib.rs) if the PR includes breaking changes to the Store requiring a re-sync.

new transactions instead of rebuilding continuously. Each rebuild is
CPU-intensive (EVM execution, trie operations) and contends with RPC
handlers over the shared trie_cache mutex, starving the RPC server.

Mempool::add_transaction() now calls notify_one() after inserting a tx.
build_payload_loop uses tokio::select! to wait for either a mempool
notification or cancel_token cancellation before each rebuild. The first
build_payload call remains unconditional.
@github-actions github-actions bot added the "L1" (Ethereum client) and "performance" (Block execution throughput and performance in general) labels Feb 10, 2026
@github-actions

🤖 Kimi Code Review

Security & Correctness Issues

File: crates/blockchain/payload.rs

  • Line 393-400: The tokio::select! usage introduces a race condition. If transactions are added to the mempool between the build_payload() call (line 389) and the select statement, the notification will be missed, potentially causing the builder to wait unnecessarily. This could lead to empty blocks when there are actually transactions available.

  • Line 396: The cancel_token.cancelled() branch uses break which exits the while loop entirely, but the code continues execution to line 402. This means the function will return the last built payload even when cancellation is requested, which may not be the intended behavior.

Performance & Correctness Issues

File: crates/blockchain/mempool.rs

  • Line 132-134: The drop(inner) is unnecessary - Rust's drop order will automatically drop inner when it goes out of scope at the end of the function. This explicit drop adds no value and slightly reduces readability.

File: crates/blockchain/payload.rs

  • Line 393-400: The current logic rebuilds the payload immediately after receiving a notification, but doesn't check if there are actually any new transactions. This could lead to unnecessary work if the notification was spurious or if the new transactions were already included in the previous build.

Suggested Fixes

  1. Fix the race condition in payload.rs:
// Before the while loop, check if we should wait
if res.transactions.is_empty() {
    tokio::select! {
        _ = self.mempool.tx_added().notified() => {}
        _ = cancel_token.cancelled() => return Ok(res),
    }
}
  2. Handle cancellation properly:
while start.elapsed() < SECONDS_PER_SLOT && !cancel_token.is_cancelled() {
    // Check for new transactions or cancellation
    tokio::select! {
        _ = self.mempool.tx_added().notified() => {}
        _ = cancel_token.cancelled() => return Ok(res),
    }
    
    if cancel_token.is_cancelled() {
        return Ok(res);
    }
    
    // ... rest of the logic
}
  3. Remove unnecessary drop in mempool.rs (line 132-134).

Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

🤖 Codex Code Review

Findings

  • High: payload loop can block past slot deadline
    In crates/blockchain/payload.rs:392 the new tokio::select! waits indefinitely for tx_added().notified() or cancellation. Because the while start.elapsed() < SECONDS_PER_SLOT check happens before the await, the loop can now sleep past the slot deadline if no new tx arrives and no cancellation happens. This is a behavioral change from “rebuild until slot time” to “wait until tx/cancel”, which can produce late payload updates.
    Suggestion: add a timeout based on remaining slot time, e.g. tokio::time::timeout(remaining, notified()).await or a sleep(remaining) branch in the select! so the loop exits on time.

  • Medium: potential missed wakeups for existing txs
    In crates/blockchain/payload.rs:392, if the mempool already has new transactions before the first notified() is awaited, this loop will still block until the next insert. If the intent is “rebuild whenever there are pending txs”, consider checking mempool state before waiting, or use a periodic short sleep in the select! to avoid indefinite idle when there are already txs.

  • Low: notify_one may under-wake if multiple consumers are added later
    In crates/blockchain/mempool.rs:133, notify_one is used after insert. If multiple tasks might wait on tx_added in the future (e.g., multiple payload builders), they could starve. If you want all waiters to re-check on every insert, notify_waiters() is safer. If single-consumer is intended, document it.
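The remaining-slot-time calculation behind the first finding can be sketched as follows. This is an illustration under stated assumptions: `SLOT` is taken to be the 12-second slot mentioned in the PR description, and `remaining_slot_time` and `next_wait` are hypothetical helper names, not functions from the codebase. The resulting duration is what would be handed to `tokio::time::timeout` or to a sleep branch in `select!`.

```rust
use std::time::Duration;

/// Assumed slot length; the PR description says "12 seconds (one slot)".
pub const SLOT: Duration = Duration::from_secs(12);

/// Time left before the slot deadline, clamped at zero once it has passed.
pub fn remaining_slot_time(elapsed: Duration) -> Duration {
    SLOT.saturating_sub(elapsed)
}

/// How long to wait before rechecking: never longer than the poll interval
/// and never past the deadline. `None` means the slot is already over.
pub fn next_wait(elapsed: Duration, poll: Duration) -> Option<Duration> {
    let remaining = remaining_slot_time(elapsed);
    if remaining.is_zero() {
        None
    } else {
        Some(remaining.min(poll))
    }
}
```

Bounding every wait by the remaining slot time means the loop condition is re-evaluated no later than the deadline, whether or not a transaction or a cancellation arrives.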

Notes

  • The explicit drop(inner) before notifying in crates/blockchain/mempool.rs:131 is good for avoiding wakeups under the write lock.

If you want, I can sketch a minimal patch for the timeout/remaining-slot handling in payload.rs.


Automated review by OpenAI Codex · custom prompt

@github-actions

🤖 Claude Code Review


Code Review - PR #6165: perf(l1): only rebuild payload when mempool receives new transactions

Summary

This PR adds a tokio::sync::Notify mechanism to prevent unnecessary continuous block rebuilding during the 12-second payload build window. The change is well-motivated and addresses a real performance bottleneck.


✅ Strengths

  1. Well-designed solution: Using tokio::sync::Notify is the idiomatic approach for this use case
  2. Proper lock management: Explicitly dropping the write lock before calling notify_one() (mempool.rs:133) prevents holding locks while waking waiters
  3. Correct tokio::select! usage: The implementation properly handles both mempool notifications and cancellation tokens (payload.rs:394-397)
  4. Preserves initial build: The first build_payload call remains unconditional (payload.rs:391), ensuring we always have an initial payload

🔍 Observations & Potential Issues

1. Race condition potential (Low severity)

Location: mempool.rs:129-134 and payload.rs:394-395

There's a subtle timing window: if notify_one() is called between when the build loop completes a build iteration and before it calls notified().await, the notification could be "consumed" by a stale permit, causing the next transaction to not trigger an immediate rebuild.

However, tokio::sync::Notify is designed to handle this: notify_one() stores a permit that remains available until notified().await is called, so this should work correctly. No change needed, but worth documenting.

2. Missing notification on transaction removal (Low severity)

Location: mempool.rs:178-182

The remove_transaction() and remove_oldest_transaction() methods don't trigger notifications. While this is probably fine (removing transactions shouldn't trigger rebuilds), consider whether block building should be aware when the mempool becomes empty or significantly smaller.

Recommendation: No action needed unless there's a use case for rebuilding when transactions are removed.

3. Cargo.toml feature additions

Location: crates/blockchain/Cargo.toml:25

-tokio = { workspace = true, features = ["time", "rt"] }
+tokio = { workspace = true, features = ["time", "rt", "sync", "macros"] }
  • "sync" is needed for tokio::sync::Notify
  • ⚠️ "macros" is needed for tokio::select! - confirm this macro isn't already available via workspace dependencies

Action: Verify that "macros" isn't redundant with workspace-level tokio features.

4. Potential starvation scenario (Low-Medium severity)

Location: payload.rs:392-415

If transactions arrive continuously at a high rate, the build loop might rebuild frequently, potentially never reaching the optimal state. Each notify_one() immediately wakes the loop, which then spawns a blocking task (payload.rs:400-401).

Scenario:

  • Transaction arrives → notify → rebuild starts
  • During rebuild, 10 more transactions arrive → 10 notifications queued
  • After rebuild completes, loop immediately rebuilds again (only consumes 1 permit)

The current implementation handles this correctly: Notify stores only a single permit, so multiple rapid notifications coalesce into one rebuild. This is actually good behavior.

No action needed, but the PR description correctly notes this: "if multiple txs arrive during a build, the next notified().await resolves immediately and one rebuild picks up all pending txs".
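The coalescing behavior described in this observation can be modeled without tokio. The sketch below is a std-only analogy, not tokio's implementation: `SinglePermit` is a hypothetical type whose only purpose is to show that repeated notifications before a consume collapse into one pending permit.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical std-only model of a single stored permit.
/// It mirrors only the coalescing behavior, not tokio's wakeup machinery.
pub struct SinglePermit {
    pending: AtomicBool,
}

impl SinglePermit {
    pub fn new() -> Self {
        Self { pending: AtomicBool::new(false) }
    }

    /// Record that something changed. Repeated calls before a consume
    /// collapse into one pending permit.
    pub fn notify(&self) {
        self.pending.store(true, Ordering::Release);
    }

    /// Take the permit if one is pending; returns whether a rebuild is due.
    pub fn try_consume(&self) -> bool {
        self.pending.swap(false, Ordering::AcqRel)
    }
}
```

In this model, ten calls to `notify()` followed by one `try_consume()` yield a single "rebuild due" signal, which matches the one-rebuild-picks-up-all-pending-txs behavior noted above.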

5. Error path analysis

Location: mempool.rs:109-137

If add_transaction() returns early due to errors (lines 115, 124), the notification is never sent. This is correct behavior since no transaction was actually added.

Correct as-is


🔒 Security & Consensus Implications

  • ✅ No changes to transaction validation, gas accounting, or consensus rules
  • ✅ No changes to RLP encoding, state trie, or EVM execution
  • ✅ The optimization is purely at the block building orchestration level
  • ✅ The initial unconditional build ensures we always produce a valid block

🎯 Rust Best Practices

  1. ✅ Proper use of drop(inner) for explicit lock release
  2. ✅ Correct async/await patterns with tokio::select!
  3. ✅ No unsafe code
  4. Notify is #[derive(Debug)] compatible (mempool.rs:78-82)

📋 Minor Suggestions

Documentation Enhancement

Add a brief comment explaining the notification behavior:

// Drop the write lock before notifying to avoid holding it while waking waiters
drop(inner);
// notify_one() stores a permit; if multiple txs arrive during a build,
// the next notified().await resolves immediately to batch them
self.tx_added.notify_one();

Testing Consideration

Consider adding an integration test that verifies:

  1. No rebuild occurs when mempool is idle
  2. Rebuild is triggered when transaction arrives
  3. Multiple rapid transactions coalesce into minimal rebuilds

✅ Verdict

APPROVE - This is a well-implemented performance optimization that addresses a real issue. The code is correct, follows Rust best practices, and doesn't introduce security concerns.

Minor action items:

  1. Verify "macros" feature in Cargo.toml isn't redundant
  2. Consider adding documentation comment (optional)
  3. Consider adding integration test (optional, good to have)

The implementation correctly handles the tricky aspects of Notify semantics and maintains correctness while improving performance.


Automated review by Claude (Anthropic) · custom prompt

@github-actions

github-actions bot commented Feb 10, 2026

Lines of code report

Total lines added: 16
Total lines removed: 0
Total lines changed: 16

Detailed view
+-------------------------------------+-------+------+
| File                                | Lines | Diff |
+-------------------------------------+-------+------+
| ethrex/crates/blockchain/mempool.rs | 387   | +9   |
+-------------------------------------+-------+------+
| ethrex/crates/blockchain/payload.rs | 713   | +7   |
+-------------------------------------+-------+------+

Mempool, polled by the payload build loop with a 100ms sleep interval.
This avoids adding new tokio features (sync, macros) while still
skipping rebuilds when no new transactions have arrived.
@avilagaston9 avilagaston9 changed the title from "perf(l1): only rebuild payload when mempool receives new transactions" to "fix(l1): only rebuild payload when mempool receives new transactions" Feb 10, 2026
@github-actions github-actions bot removed the "performance" (Block execution throughput and performance in general) label Feb 10, 2026
iteration always triggers a rebuild without sleeping, picking up any
transactions that arrived during the initial build_payload call.
@github-actions

Benchmark Block Execution Results Comparison Against Main

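+---------+----------------+---------+---------+-------------+
| Command | Mean [s]       | Min [s] | Max [s] | Relative    |
+---------+----------------+---------+---------+-------------+
| base    | 62.991 ± 0.161 | 62.786  | 63.345  | 1.00        |
+---------+----------------+---------+---------+-------------+
| head    | 63.143 ± 0.258 | 62.736  | 63.487  | 1.00 ± 0.00 |
+---------+----------------+---------+---------+-------------+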

