feat(levm): parallel block execution via BAL by edg-l · Pull Request #6233 · lambdaclass/ethrex

edg-l · 2026-02-20T11:44:50Z

Summary

Implements BAL-based parallel transaction execution for Amsterdam+ blocks using EIP-7928 Block Access Lists.

Approach: BAL State Seeding

Each transaction runs independently on its own database, pre-seeded with BAL-derived intermediate state (same approach as geth). No conflict detection or grouping needed — the BAL provides the complete state dependency graph.

Pipeline (3 concurrent threads):

Warmer — prefetches all account states, storage slots, and contract codes listed in the BAL (parallel batch fetch)
Executor — runs all txs in parallel via rayon, each with its own GeneralizedDatabase seeded from BAL intermediate values
Merkleizer — computes state trie from bal_to_account_updates (BAL → AccountUpdates, no execution needed)

Key functions (crates/vm/backends/levm/mod.rs):

execute_block_parallel — orchestrates the parallel path: sends BAL-derived AccountUpdates to merkleizer, then executes all txs in parallel
bal_to_account_updates(bal, store) — converts BAL final values into Vec<AccountUpdate> for the merkleizer (last entry per field = post-block state)
seed_db_from_bal(db, bal, max_idx) — pre-seeds a per-tx DB with cumulative BAL state through index max_idx (system calls + previous txs)
warm_block_from_bal(bal, store) — 3-phase prefetch: accounts → storage slots → contract codes

BAL indexing: 0 = system calls, 1 = tx 0, 2 = tx 1, ..., N+1 = withdrawals. For tx at index i, seed_db_from_bal applies all changes with block_access_index <= i.

The parallel path is only triggered when header_bal is Some (Amsterdam+ blocks via engine_newPayloadV4). All other callers pass None and use the existing sequential loop unchanged.

EF Test Verification (Two-Pass Parallel Check)

Amsterdam EF tests now exercise the parallel execution path as a correctness check. After the normal sequential run succeeds, a second two-pass run is performed:

Pass 1 (sequential): Re-executes all blocks on a fresh blockchain via add_block_pipeline_returning_bal(block, None), collecting the produced BAL for each block.
Pass 2 (parallel): Re-executes all blocks on another fresh blockchain via add_block_pipeline(block, Some(&bal)), using the BAL from pass 1 to drive the parallel execution path.
Post-state verification: The final state from pass 2 is checked against the expected post-state from the test fixture.

This ensures that the parallel execution path produces identical results to the sequential path across the entire Amsterdam EF test suite. Non-Amsterdam tests are unaffected. All Amsterdam EF tests pass both sequential and parallel execution.

Test plan

6 unit tests for bal_to_account_updates — all pass (cargo test -p ethrex-vm bal_tests)
EF blockchain tests — sequential path (existing CI)
EF blockchain tests — parallel path via two-pass BAL verification (Amsterdam tests)
Hive regression on devnets/bal/2 fixture suite with parallel path enabled
Kurtosis devnet benchmark (see results below)

Notes

No new crate dependencies (rayon and FxHashMap already used)
Per-tx BAL validation: After each transaction executes in parallel, its actual state mutations are validated against the BAL claims. GeneralizedDatabase::compute_tx_diff() computes the diff between initial and final account state for the tx, then BlockAccessList::validate_tx_diff(tx_idx, &diff) checks that balances, nonces, storage values, and code changes match exactly what the BAL declares for that tx index. Blocks with mismatched state mutations are rejected. This matches geth's per-tx validation approach.
BAL recording is disabled in the parallel path — the header-embedded BAL is trusted (validated by validate_block_access_list_hash) and used directly for both state root computation and per-tx state seeding. This mirrors geth's approach: BALStateTransition.IntermediateRoot() computes the state root entirely from BAL diffs (via readAccountDiff + ModifiedAccounts) without re-executing transactions, and BALReader.getStateObject / initObjFromDiff seeds per-tx state from BAL intermediate values. No BAL re-recording happens during validation in either implementation.
Per-tx DBs skip initial_accounts_state tracking (state transitions come from BAL, not from diffing)

Benchmark Results

Benchmarked on a Kurtosis devnet (bal-devnet-2-ethrex.yaml) comparing parallel execution (via BAL) vs sequential (main), both running on identical hardware. Nodes are in consensus (matching block hashes).

Spamoor Setup (mainnet-like workload, ~700 tx/s target)

Scenario	tx/s	Share	Mainnet Analog
`erc20tx`	280	40%	USDT/USDC token transfers — hot storage (balances, allowances)
`uniswap-swaps`	175	25%	DEX swaps — complex contract calls, hot pool state
`eoatx`	105	15%	Simple ETH transfers
`storagespam`	70	10%	Storage-heavy contract interactions
`erc721tx`	35	5%	NFT mints/transfers
`blobs`	35	5%	L2 sequencer-like blob transactions

Blocks average ~231 txs and ~60 Mgas each.

Results (43 non-empty blocks sampled)

	Parallel (`bal-parallel-exec`)	Sequential (`main`)
Avg Ggas/s	2.00	1.20
Avg exec time	31 ms	51 ms
Peak Ggas/s	2.61	1.40

~67% faster with BAL parallel execution on realistic mainnet-like workloads.

github-actions · 2026-02-20T11:47:07Z

Lines of code report

Total lines added: 876
Total lines removed: 0
Total lines changed: 876

Detailed view

+-------------------------------------------------+-------+------+
| File                                            | Lines | Diff |
+-------------------------------------------------+-------+------+
| ethrex/crates/blockchain/blockchain.rs          | 2188  | +24  |
+-------------------------------------------------+-------+------+
| ethrex/crates/common/types/block_access_list.rs | 889   | +82  |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/backends/levm/mod.rs           | 1503  | +674 |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/backends/mod.rs                | 195   | +8   |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/levm/src/db/gen_db.rs          | 545   | +42  |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/levm/src/db/mod.rs             | 148   | +46  |
+-------------------------------------------------+-------+------+

github-actions · 2026-02-20T13:01:59Z

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	2.959 ± 0.019	2.938	2.991	1.08 ± 0.01
`main_levm_BubbleSort`	2.729 ± 0.020	2.708	2.769	1.00
`pr_revm_BubbleSort`	3.081 ± 0.025	3.053	3.138	1.13 ± 0.01
`pr_levm_BubbleSort`	2.759 ± 0.039	2.703	2.849	1.01 ± 0.02

Benchmark Results: ERC20Approval

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Approval`	999.6 ± 10.4	988.0	1018.7	1.00
`main_levm_ERC20Approval`	1043.0 ± 26.8	1026.5	1108.2	1.04 ± 0.03
`pr_revm_ERC20Approval`	1016.1 ± 10.0	999.9	1029.6	1.02 ± 0.01
`pr_levm_ERC20Approval`	1047.9 ± 10.6	1032.7	1067.7	1.05 ± 0.02

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	132.0 ± 1.3	130.5	134.0	1.00
`main_levm_ERC20Mint`	160.2 ± 5.3	156.5	174.4	1.21 ± 0.04
`pr_revm_ERC20Mint`	134.5 ± 1.4	132.6	136.3	1.02 ± 0.01
`pr_levm_ERC20Mint`	161.0 ± 2.7	158.8	168.2	1.22 ± 0.02

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	235.5 ± 2.5	232.7	241.2	1.00
`main_levm_ERC20Transfer`	270.8 ± 3.1	266.8	275.7	1.15 ± 0.02
`pr_revm_ERC20Transfer`	241.0 ± 4.8	236.7	252.3	1.02 ± 0.02
`pr_levm_ERC20Transfer`	271.9 ± 1.9	268.4	275.2	1.15 ± 0.01

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	229.8 ± 1.3	228.4	233.0	1.00
`main_levm_Factorial`	249.5 ± 5.4	245.2	261.6	1.09 ± 0.02
`pr_revm_Factorial`	230.6 ± 1.5	229.3	234.6	1.00 ± 0.01
`pr_levm_Factorial`	250.6 ± 3.6	246.3	256.9	1.09 ± 0.02

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.732 ± 0.034	1.651	1.768	1.00
`main_levm_FactorialRecursive`	9.609 ± 0.026	9.564	9.651	5.55 ± 0.11
`pr_revm_FactorialRecursive`	1.752 ± 0.039	1.706	1.814	1.01 ± 0.03
`pr_levm_FactorialRecursive`	9.631 ± 0.031	9.592	9.699	5.56 ± 0.11

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	206.2 ± 8.1	201.9	229.1	1.00 ± 0.04
`main_levm_Fibonacci`	227.3 ± 8.9	220.1	243.1	1.11 ± 0.04
`pr_revm_Fibonacci`	205.7 ± 1.8	204.3	209.6	1.00
`pr_levm_Fibonacci`	230.9 ± 5.2	226.3	241.0	1.12 ± 0.03

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	904.2 ± 12.6	891.2	924.5	1.30 ± 0.03
`main_levm_FibonacciRecursive`	693.6 ± 11.2	676.4	711.2	1.00
`pr_revm_FibonacciRecursive`	919.1 ± 9.6	909.6	938.1	1.33 ± 0.03
`pr_levm_FibonacciRecursive`	700.0 ± 12.6	684.2	727.8	1.01 ± 0.02

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.6 ± 0.1	8.5	8.8	1.00
`main_levm_ManyHashes`	9.8 ± 0.1	9.7	9.9	1.13 ± 0.01
`pr_revm_ManyHashes`	8.7 ± 0.0	8.6	8.8	1.01 ± 0.01
`pr_levm_ManyHashes`	9.9 ± 0.1	9.8	10.1	1.14 ± 0.02

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	255.9 ± 5.5	252.6	269.9	1.16 ± 0.04
`main_levm_MstoreBench`	222.3 ± 3.8	217.5	229.0	1.01 ± 0.03
`pr_revm_MstoreBench`	254.4 ± 1.5	252.6	257.9	1.15 ± 0.03
`pr_levm_MstoreBench`	220.9 ± 4.9	216.9	232.8	1.00

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	289.4 ± 1.8	287.9	292.8	1.06 ± 0.03
`main_levm_Push`	273.2 ± 7.0	269.9	293.1	1.00
`pr_revm_Push`	290.9 ± 0.9	289.3	292.1	1.06 ± 0.03
`pr_levm_Push`	277.2 ± 14.3	269.3	317.0	1.01 ± 0.06

Benchmark Results: SstoreBench_no_opt

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_SstoreBench_no_opt`	165.4 ± 2.8	163.0	171.0	1.53 ± 0.03
`main_levm_SstoreBench_no_opt`	107.9 ± 0.5	107.1	108.6	1.00 ± 0.01
`pr_revm_SstoreBench_no_opt`	163.3 ± 4.6	157.3	168.9	1.51 ± 0.04
`pr_levm_SstoreBench_no_opt`	107.9 ± 0.5	107.1	108.6	1.00

Implement Phase 1 parallel transaction execution using EIP-7928 Block Access List (BAL) write sets to detect conflicts and assign txs to parallel execution groups. - Add `build_parallel_groups`: builds conflict groups from BAL write sets. Conflicting txs are serialized in the same group; independent txs get separate groups for parallel execution. Same-sender txs are chained into the same group to preserve nonce order. Coinbase is excluded from conflict detection. - Add `execute_block_parallel`: executes groups via rayon, each with its own `GeneralizedDatabase` seeded from post-system-call state. Coinbase fees are accumulated as deltas and applied to the main db after merge. System call updates and merged tx updates are sent to the merkleizer in two batches. - Thread `header_bal: Option<&BlockAccessList>` through `Evm::execute_block_pipeline` and `LEVM::execute_block_pipeline`. When `Some(bal)` is provided (Amsterdam fork, engine API path), the parallel path is taken; otherwise falls back to the existing sequential loop. - Add 10 unit tests for `build_parallel_groups` covering: empty block, single tx, same-sender chains, conflicting/non-conflicting pairs, coinbase exclusion, transitive conflict graphs, and mixed scenarios.

When a group contains multiple transactions, get_state_transitions_tx promotes the coinbase balance to initial_accounts_state after each tx. This means subsequent per-tx coinbase AccountUpdates show an accumulated absolute balance, not an incremental delta. Subtracting coinbase_initial_balance from each per-tx update was double-counting fees from earlier txs in the same group, producing a wrong state root. Fix: read the final coinbase balance from initial_accounts_state once per group (after all txs have been drained), and compute a single delta per group instead of summing per-tx deltas.

…equential Write-only conflict detection misses read-after-write (RAW) hazards: if tx_j reads account X without writing it, and tx_i (i < j) writes X, they end up in separate parallel groups — tx_j reads the pre-block value instead of tx_i's write, producing a wrong state root. Fix: - Add `reads: Option<FxHashSet<Address>>` to GeneralizedDatabase, populated in load_account on first access (initial_accounts_state or store). Only enabled in parallel group dbs (None in all other paths, no overhead). - Add execute_txs_sequential helper that runs all txs in order on the main db and returns receipts + merged AccountUpdates. - After parallel execution, check each group's read set against all other groups' write sets. If any intersection is found (RAW conflict), discard the parallel results and re-run sequentially on the main db, which is already in the correct post-system-call state. This is conservative (falls back on any read-write overlap regardless of tx ordering) but guarantees correctness. False positives only mean unnecessary sequential fallback — never a wrong state root.

Replace the address-level W-W greedy grouping + post-hoc sequential fallback with a correct upfront conflict graph using Union-Find. New algorithm in `build_parallel_groups`: - Resource-level (slot-level) write sets from BAL: Balance, Nonce, Code, Storage(addr, slot) - Per-tx read sets approximated from static metadata: sender balance/nonce, call target code/balance, EIP-2930 access list entries - Union-Find for transitive grouping handles same-sender, W-W, and RAW conflicts - RAW: if tx_j reads resource R and any earlier tx_i writes R, union(i,j) - WAR (reader before writer) is safe and not serialized - Coinbase excluded from all conflict detection Remove the rejected sequential fallback (`execute_txs_sequential`) and the post-hoc RAW check that re-ran the entire block sequentially on conflict. Also remove the `reads: Option<FxHashSet<Address>>` field from `GeneralizedDatabase` (was used only by the removed fallback).

Two correctness fixes from code review: 1. Coinbase delta: replace saturating_sub with explicit signed accounting using separate credit/debit U256 accumulators. Previously, if coinbase was a tx sender spending more ETH than it received in fees (rare but valid), saturating_sub(0) silently discarded the negative delta. 2. EIP-7702 authorization list: add Resource::Code(auth.address) to the read set for EIP-7702 txs. The delegate target's code is loaded at call time via the delegation pointer, so if an earlier tx deploys code to that address, this RAW hazard must be detected upfront. The authority address itself cannot be added (requires ecrecover at runtime); W-W detection via the BAL handles the authority code-write case.

The static read set (sender/to/access_list) misses cases where a called contract internally reads a storage slot written by an earlier tx, without declaring it in the EIP-2930 access list. Fix: after building per-tx write sets from the BAL, build a map of address → all written storage slots. When approximating read sets in Phase 2, for any address that tx_j directly accesses (to or access_list), add all of that address's written storage slots to tx_j's read set. This catches the common RAW pattern where tx_j calls a contract whose storage was modified by tx_i (direct-call case). Multi-hop internal calls through addresses not in tx metadata remain an inherent limitation.

Any CALL transaction may transitively read ANY written storage slot in the block through sub-calls to other contracts. Since we cannot determine the full call graph statically from BAL metadata, we conservatively add ALL block-level written storage slots to the read set of every call transaction. This supersedes the previous per-address approach (adding only written slots of the direct `to` address), which missed multi-hop patterns: tx_i writes Storage(A, s) → tx_j calls B → B calls A → A reads slot s With this fix, tx_j.reads includes Storage(A, s), triggering the RAW union with tx_i and ensuring sequential execution within the group. The conservative approach reduces parallelism: all call txs touching written storage are grouped together. ETH transfers and CREATE txs unaffected by storage writes can still parallelize. Correctness takes priority here; Block-STM or call-graph analysis could recover parallelism later.

… txs The previous test suite only used balance writes and CREATE transactions, so the new conservative multi-hop RAW detection was completely untested. New tests cover: - CALL tx to the same address as a storage writer → same group (W-W + CALL) - CALL tx to a different address than the storage writer → same group (multi-hop RAW) - Three txs with two storage writers and one unrelated CALL → all one group - CREATE txs with disjoint storage writes → still parallel (no CALL branch triggered) - WAR ordering (reader before writer) → no spurious serialization

… sequential fallback The BAL (EIP-7928) only records writes, not reads. Read sets for parallel grouping must be approximated statically, which previously only included written storage slots for CALL txs. This missed RAW conflicts when a contract reads an account balance (BALANCE opcode) or code (EXTCODESIZE/DELEGATECALL) modified by an earlier tx not in the same group. - Extend conservative read set to include written Code and non-sender Balance resources (sender balances excluded to avoid mass serialization since every tx writes Balance(sender) via gas fees) - Add sequential execution fallback in add_block_pipeline: if parallel produces a gas/receipts/state mismatch, retry with a fresh VM without BAL to guarantee correctness for any remaining edge cases

Instead of deep-cloning the post-system-call CacheDB into each parallel group, wrap it in Arc and add a shared_base field to GeneralizedDatabase. Accounts are lazily cloned into initial_accounts_state on first access, making get_state_transitions_tx transparent to the change.

- Binary search (partition_point) in seed_db_from_bal instead of reverse linear scan - Batch prefetch_accounts/prefetch_storage on CachingDatabase with parallel inner fetch + single write-lock - mem::take for system_seed to avoid cloning initial_accounts_state - Cache chain_config in CachingDatabase via OnceLock - Add rayon to ethrex-levm for parallel batch prefetch - Add bal-devnet-2-light and bal-devnet-2-ethrex kurtosis fixtures - Update ethereum-package revision

Skip initial_accounts_state cloning in parallel per-tx DBs (never diffed), consolidate HashMap lookups in seed_db_from_bal, batch prefetch in bal_to_account_updates, eliminate intermediate Vec allocations, and streamline warm_block_from_bal code prefetch.

Validate each parallel tx's execution results against the header BAL claims, rejecting blocks with mismatched state mutations. Matches geth's validation approach.

Amsterdam EF tests now exercise the parallel execution path as a correctness check. After the normal sequential run succeeds, a second two-pass run is performed: pass 1 re-executes sequentially to collect the produced BAL, pass 2 re-executes on a fresh blockchain using that BAL to drive the parallel code path, then verifies the post-state matches. Also threads the produced BAL through BlockExecutionPipelineResult and adds add_block_pipeline_returning_bal to Blockchain.

- Extract add_block_pipeline_inner to deduplicate add_block_pipeline and add_block_pipeline_bal - Use binary search (partition_point) for storage slot lookup in validate_tx_execution - Rename _db to db in execute_block_parallel - Add clarifying comments for stack_pool capacity, any_storage heuristic, and has_storage safety

- Remove blanket #![allow(dead_code)] from block_access_list.rs - Downgrade [PARALLEL] log from info! to debug! to avoid flooding production logs - Add doc comment explaining nested Result semantics in add_block_pipeline_inner

github-actions · 2026-02-25T14:14:43Z

🤖 Kimi Code Review

⚠️ Review failed: Kimi API request failed with status 429

Automated review by Kimi (Moonshot AI)

greptile-apps · 2026-02-25T14:17:21Z

Greptile Summary

This PR implements BAL-based parallel transaction execution for Amsterdam+ blocks, achieving a 67% performance improvement (2.0 Ggas/s vs 1.2 Ggas/s) on realistic workloads.

Key Changes

Parallel execution pipeline (execute_block_parallel): Each tx runs independently on its own database seeded with BAL-derived intermediate state, matching geth's approach
Three-phase BAL warming (warm_block_from_bal): Prefetches accounts → storage slots → contract codes in parallel batches
Per-tx validation (validate_tx_execution): Post-execution state is verified against BAL claims using a pre-built index for O(1) lookups
State root computation (bal_to_account_updates): Merkleizer receives state directly from BAL without re-executing transactions
Two-pass EF test verification: Amsterdam tests run both sequential and parallel paths to ensure identical results

Implementation Quality

The implementation is well-architected with:

Clean separation between sequential and parallel paths (no BAL = sequential, Some(BAL) = parallel)
Proper error handling with non-fatal warming failures logged but not blocking
Efficient data structures (FxHashMap, pre-sized allocations, binary search on sorted BAL entries)
Comprehensive unit tests (6 tests for bal_to_account_updates)
Production validation via Hive regression and Kurtosis devnet benchmarks

Architecture Highlights

The parallel path uses embarrassingly parallel execution via rayon with no conflict detection needed - the BAL provides complete state dependencies. Each transaction gets its own GeneralizedDatabase instance with:

Shared read-only base state (post-system-call snapshot)
BAL-seeded intermediate values for previous txs
Skip of initial_accounts_state tracking (not needed since state comes from BAL)

All Amsterdam EF tests pass with both sequential and parallel execution producing identical post-state.

Confidence Score: 4/5

Safe to merge with high confidence - comprehensive testing and production validation demonstrate correctness
Score of 4 reflects solid implementation with extensive testing (EF tests, Hive, Kurtosis benchmarks) and clean architecture following geth's proven approach. Minor score reduction due to complexity of parallel execution and BAL validation logic, though validation is thorough. The two-pass test verification for Amsterdam blocks provides strong correctness guarantees.
No files require special attention - implementation is well-structured with appropriate error handling and validation throughout

Important Files Changed

Filename	Overview
crates/vm/backends/levm/mod.rs	Implements BAL-based parallel execution pipeline with state seeding, validation, and warming. Core logic appears sound with proper tx-level validation.
crates/common/types/block_access_list.rs	Adds validation index and binary search helpers for efficient BAL lookups. Clean implementation with no issues found.
crates/vm/levm/src/db/gen_db.rs	Extends GeneralizedDatabase with shared base state support and skip_initial_tracking flag for parallel execution. Well-designed for the parallel use case.
crates/blockchain/blockchain.rs	Adds BAL parameter threading and warming failure logging. Clean integration with existing pipeline.
tooling/ef_tests/blockchain/test_runner.rs	Implements two-pass parallel testing for Amsterdam blocks. Excellent correctness verification approach.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Block arrives with BAL] --> B{Amsterdam+ block?}
    B -->|No| C[Sequential execution path]
    B -->|Yes| D[Parallel execution path]
    
    D --> E[Thread 1: Warmer<br/>warm_block_from_bal]
    D --> F[Thread 2: Executor<br/>execute_block_parallel]
    D --> G[Thread 3: Merkleizer]
    
    E --> E1[Phase 1: Prefetch accounts]
    E1 --> E2[Phase 2: Prefetch storage slots]
    E2 --> E3[Phase 3: Prefetch contract codes]
    
    F --> F1[bal_to_account_updates<br/>Convert BAL to state updates]
    F1 --> F2[Send updates to merkleizer]
    F2 --> F3[Execute txs in parallel via rayon]
    
    F3 --> F4[For each tx:<br/>1. seed_db_from_bal<br/>2. execute_tx_in_block<br/>3. validate_tx_execution]
    
    F4 --> F5{Validation passed?}
    F5 -->|No| H[Reject block]
    F5 -->|Yes| I[Build receipts]
    
    G --> G1[Compute state root from updates]
    
    I --> J[Process withdrawals]
    J --> K[Extract requests]
    K --> L[Return BlockExecutionResult]

_{Last reviewed commit: 81de430}

github-actions · 2026-02-25T14:18:25Z

🤖 Codex Code Review

Findings

Consensus validation gap: BAL hash not checked when header BAL is provided
In execute_block_pipeline, BAL hash validation only runs when a produced_bal exists, but the parallel path returns None, so the header-provided BAL is never validated against the header hash. This allows a peer-supplied BAL to drive execution/merkleization without proving it matches block_access_list_hash.
File: crates/blockchain/blockchain.rs:443-461
Suggestion: validate header_bal (when present) against the header hash in the execution thread, or validate before spawning execution.
Fork-guard missing: BAL parallel path can run pre-Amsterdam
The parallel BAL path is selected purely by header_bal.is_some(); it does not check is_amsterdam. If a caller mistakenly supplies a BAL for a pre-Amsterdam block, execution and merkleization will be driven by BAL data, which is a consensus violation.
File: crates/vm/backends/levm/mod.rs:191-248
Suggestion: gate the BAL path with is_amsterdam, and either ignore header_bal pre-fork or return an error.
Potential spec gap: accounts missing from BAL are silently skipped
In BAL validation, any modified account not present in the BAL is skipped, with the comment that the state root will catch discrepancies. This is true only if final state differs; if a tx touches an account but leaves it unchanged, the BAL might still be invalid under EIP-7928 requirements, yet would pass validation.
File: crates/vm/backends/levm/mod.rs:861-872
Suggestion: consider stricter handling for “modified but not in BAL” (e.g., verify against store/shared_base or treat as error if account.status indicates mutation beyond a read).

Performance note

seed_db_from_bal iterates all BAL accounts for every tx (O(N_accounts * N_txs)). This is likely a hotspot for large blocks.
File: crates/vm/backends/levm/mod.rs:501-632
Suggestion: consider precomputing per-account prefix values or per-tx slices to reduce repeated scans.

If you address Items 1–2, the rest looks structurally solid. The parallel BAL execution path is a thoughtful optimization and the validation index design is a good direction.

Automated review by OpenAI Codex · custom prompt

github-actions · 2026-02-25T15:39:13Z

🤖 Claude Code Review

Now I have enough context to write a comprehensive review. Let me compile the findings.

Code Review: PR #6233 — feat(levm): parallel block execution via BAL

This is a well-structured and significant feature that implements EIP-7928 BAL-based parallel execution. The overall design mirrors geth's approach correctly. Below are findings ordered by severity.

Critical / High

1. rayon added unconditionally to ethrex-levm, breaking ZK guest environments

crates/vm/levm/Cargo.toml adds rayon as an unconditional dependency, and this is reflected in the Cargo.lock changes for risc0, sp1, openvm, and zisk guest programs. Rayon spawns OS threads, which are unavailable in ZK prover environments (they're typically no-std or single-threaded). This will either fail to compile or silently fall back incorrectly depending on each backend's threading model.

The parallel execution code should be feature-gated (e.g., behind a parallel or std feature), and the rayon dependency should be conditional on that feature. If these guest programs already have a mechanism for disabling threading (e.g., a zk feature), this should use that.

2. Silent validation gap in validate_tx_execution Part B when seeded_pos == 0

In crates/vm/backends/levm/mod.rs, Part B validates that execution didn't modify accounts beyond what BAL claims. However, when an account was loaded from the store/shared_base (not from a BAL entry), and execution modifies it, the check is explicitly skipped with a comment:

// If seeded_pos == 0, balance was never seeded (loaded from store/shared_base).
// We can't cheaply verify without store access. Skip.

This means a tx that incorrectly mutates an account whose initial state came from the store (e.g., an account with no prior BAL changes in the block) would pass per-tx validation. The state root would catch a true global discrepancy, but that check happens AFTER the BAL-derived account_updates have already been sent to the merkleizer — so by the time the root mismatch is detected, the block is validated against the wrong state.

Consider snapshotting the pre-seed account state into the per-tx DB so the comparison can be done without a store lookup, or validate the balance/nonce against the store value when seeded_pos == 0.

Medium

3. tx_idx as u16 truncation with no bounds check

In execute_block_parallel and execute_block_pipeline:

#[allow(clippy::cast_possible_truncation)]
Self::seed_db_from_bal(&mut tx_db, bal, tx_idx as u16)?;

This silently wraps for blocks with more than 65,535 transactions, producing an incorrect max_idx that would cause the wrong BAL entries to be applied. EIP-7928 should define a maximum, but a defensive assert!(tx_idx <= u16::MAX as usize) or an explicit error return would prevent subtle corruption if a malformed BAL is accepted.

4. Code::from_bytecode(seeded_code.clone()).hash in per-tx validation hot path

In validate_tx_execution Part B, validating whether execution changed an account's code recomputes the keccak256 hash of the entire bytecode:

let seeded_hash = if seeded_code.is_empty() {
    *EMPTY_KECCACK_HASH
} else {
    Code::from_bytecode(seeded_code.clone()).hash
};

This runs in the parallel validation loop for every modified account in every tx that doesn't have a code change at bal_idx. For large contracts (several KB of bytecode), this is a non-trivial allocation + hash per tx. The hash should be derivable from the BAL's CodeChange without recomputing it if Code caches the hash, or the comparison should be hash-only (comparing account.info.code_hash against the precomputed hash from the BAL entry, which can be computed once in build_validation_index).

5. has_storage: false for newly seeded accounts may silently skip storage trie reads

In seed_db_from_bal, when all account info fields are covered by the BAL (has_all_info == true), a new LevmAccount is inserted with has_storage: false:

let acc = db.current_accounts_state.entry(addr).or_insert_with(|| LevmAccount {
    info: AccountInfo::default(),
    storage: FxHashMap::default(),
    has_storage: false,    // <-- even if the account has on-chain storage
    status: AccountStatus::Modified,
});

The comment warns against reuse but doesn't address whether has_storage is consulted within the parallel execution itself. If any EVM opcode handler (e.g., SLOAD for a slot not yet seeded by BAL) checks has_storage to decide whether to fall back to the storage trie, it would incorrectly conclude "no storage" and return zero. This is distinct from the BAL-seeded slots (which are in acc.storage); it applies to slots accessed at runtime that weren't in the BAL's read or write list. If such a slot exists, the EVM would see 0 instead of the on-chain value.

6. add_block_pipeline_inner returns a nested Result

fn add_block_pipeline_inner(
    &self,
    block: Block,
    bal: Option<&BlockAccessList>,
) -> Result<(Option<BlockAccessList>, Result<(), ChainError>), ChainError>

The outer Ok can wrap an inner Err. This is unusual Rust — callers must double-unwrap, and it's easy to accidentally check only the outer Result. The motivation (returning the BAL even on storage failure) is understandable, but consider restructuring so the BAL is extracted before the storage step, or use a dedicated struct rather than nested Results.

Low / Nits

7. Duplicate has_writes computation in bal_to_account_updates

The has_writes predicate is evaluated twice per account: once as a .filter() over write_addrs, and again inside the for acct_changes in bal.accounts() loop. The filter's result (the set of accounts with writes) could be reused for the loop to avoid redundancy:

let write_addrs: Vec<Address> = bal.accounts().iter()
    .filter(|ac| { /* has_writes logic */ })
    .map(|ac| ac.address)
    .collect();
// ... prefetch ...
for acct_changes in bal.accounts().iter().filter(|ac| { /* duplicated */ }) { ... }

8. capacity / 4 for code map pre-allocation truncates to 0 for small inputs

In new_with_shared_base_and_capacity:

codes: FxHashMap::with_capacity_and_hasher(capacity / 4, Default::default()),

Integer division truncates: for capacity < 4, this allocates zero slots. Not a correctness issue, but a no-op pre-allocation.

9. Silent code prefetch errors in warm_block_from_bal Phase 3

code_hashes.par_iter().for_each(|&h| {
    let _ = store.get_account_code(h);
});

Errors are silently dropped. Other warming phases use map_err(|e| EvmError::...)?. Consider at least a debug! log on error here, consistent with the pattern established in blockchain.rs for warming failures.

Positive Observations

The BAL indexing semantics (0 = system calls, 1 = tx_0, ...) are consistently applied throughout seed_db_from_bal and validate_tx_execution, and are well-documented.
The build_validation_index() precomputation (one FxHashMap per block, shared read-only) is a good design choice — avoids per-tx linear scans of the BAL.
The two-pass EF test harness (run_two_pass_parallel) is a solid correctness check that exercises the parallel path against the sequential path's output without modifying existing test infrastructure.
prefetch_accounts/prefetch_storage defaulting to sequential fallback in the Database trait ensures backward compatibility for non-caching backends.
The 67% throughput improvement on realistic workloads is a strong result that justifies the complexity.

Automated review by Claude (Anthropic) · custom prompt

…l path After merging main (which added the disable_balance_check param in #6259), the BAL parallel execution call site was missing the argument. Normal execution uses false (balance checks enabled).

github-actions bot assigned edg-l Feb 20, 2026

github-actions bot added the levm Lambda EVM implementation label Feb 20, 2026

github-project-automation bot added this to ethrex_l1 Feb 20, 2026

edg-l changed the base branch from main to bal-optimizations February 20, 2026 11:57

edg-l moved this to In Progress in ethrex_l1 Feb 20, 2026

lambdaclass deleted a comment from github-actions bot Feb 20, 2026

edg-l force-pushed the bal-parallel-exec branch from d1597f5 to 5633b93 Compare February 23, 2026 09:12

Base automatically changed from bal-optimizations to main February 23, 2026 16:01

edg-l force-pushed the bal-parallel-exec branch from a87b72e to d2f646e Compare February 24, 2026 07:50

edg-l changed the title ~~feat(levm): parallel block execution via BAL dependency graph~~ feat(levm): parallel block execution via BAL Feb 24, 2026

edg-l force-pushed the bal-parallel-exec branch from 2f8ab0b to 7a4cd2d Compare February 24, 2026 15:17

edg-l added 15 commits February 24, 2026 17:21

bal-devnet-3: BAL-based batched state reads for block prewarming

ccd058d

fixes

4f5773d

revert weight=0 for removed accounts

b06e6ac

fmt

81aa442

log

d3a1d7e

edg-l added 4 commits February 24, 2026 17:21

status

d00b8fc

spamoor

d4808c2

edg-l force-pushed the bal-parallel-exec branch from 7a4cd2d to d4808c2 Compare February 24, 2026 16:21

edg-l added 14 commits February 24, 2026 17:23

format

d57ff0d

lockfile

cd9f640

fixes

aa79076

feat(vm): per-tx BAL validation for parallel execution

b0134e9

Validate each parallel tx's execution results against the header BAL claims, rejecting blocks with mismatched state mutations. Matches geth's validation approach.

Merge branch 'main' into bal-parallel-exec

3824205

remove md

a609e1a

lock

6376e22

lock

8bef3fd

optimizations

297a775

improve

4920893

refactor: review cleanup for parallel execution PR

1755f49

- Remove blanket #![allow(dead_code)] from block_access_list.rs - Downgrade [PARALLEL] log from info! to debug! to avoid flooding production logs - Add doc comment explaining nested Result semantics in add_block_pipeline_inner

refactor: log warming failures instead of silently discarding

81de430

edg-l marked this pull request as ready for review February 25, 2026 14:14

edg-l requested review from a team, ManuelBilbao, avilagaston9 and ilitteri as code owners February 25, 2026 14:14

edg-l added 2 commits February 26, 2026 11:44

Merge remote-tracking branch 'origin/main' into bal-parallel-exec

4b23cbe

fix: pass disable_balance_check to execute_tx_in_block in BAL paralle…

69dadae

…l path After merging main (which added the disable_balance_check param in #6259), the BAL parallel execution call site was missing the argument. Normal execution uses false (balance checks enabled).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(levm): parallel block execution via BAL #6233

feat(levm): parallel block execution via BAL #6233
edg-l wants to merge 41 commits intomainfrom
bal-parallel-exec

edg-l commented Feb 20, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 20, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 20, 2026 •

edited

Loading

Benchmark Results: BubbleSort

Benchmark Results: ERC20Approval

Benchmark Results: ERC20Mint

Benchmark Results: ERC20Transfer

Benchmark Results: Factorial

Benchmark Results: FactorialRecursive

Benchmark Results: Fibonacci

Benchmark Results: FibonacciRecursive

Benchmark Results: ManyHashes

Benchmark Results: MstoreBench

Benchmark Results: Push

Benchmark Results: SstoreBench_no_opt

Uh oh!

github-actions bot commented Feb 25, 2026

Uh oh!

greptile-apps bot commented Feb 25, 2026

Important Files Changed

Uh oh!

github-actions bot commented Feb 25, 2026

Uh oh!

github-actions bot commented Feb 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

edg-l commented Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Approach: BAL State Seeding

EF Test Verification (Two-Pass Parallel Check)

Test plan

Notes

Benchmark Results

Spamoor Setup (mainnet-like workload, ~700 tx/s target)

Results (43 non-empty blocks sampled)

Uh oh!

github-actions bot commented Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Lines of code report

Uh oh!

github-actions bot commented Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results Comparison

Benchmark Results: BubbleSort

Benchmark Results: ERC20Approval

Benchmark Results: ERC20Mint

Benchmark Results: ERC20Transfer

Benchmark Results: Factorial

Benchmark Results: FactorialRecursive

Benchmark Results: Fibonacci

Benchmark Results: FibonacciRecursive

Benchmark Results: ManyHashes

Benchmark Results: MstoreBench

Benchmark Results: Push

Benchmark Results: SstoreBench_no_opt

Uh oh!

github-actions bot commented Feb 25, 2026

🤖 Kimi Code Review

Uh oh!

greptile-apps bot commented Feb 25, 2026

Greptile Summary

Key Changes

Implementation Quality

Architecture Highlights

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

github-actions bot commented Feb 25, 2026

🤖 Codex Code Review

Uh oh!

github-actions bot commented Feb 25, 2026

🤖 Claude Code Review

Code Review: PR #6233 — feat(levm): parallel block execution via BAL

Critical / High

Medium

Low / Nits

Positive Observations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

edg-l commented Feb 20, 2026 •

edited

Loading

github-actions bot commented Feb 20, 2026 •

edited

Loading

github-actions bot commented Feb 20, 2026 •

edited

Loading