feat(levm): proven execution — Hive, Sync, JIT phases 0-8B #6260

Closed
cd4761 wants to merge 48 commits into lambdaclass:main from tokamak-network:feat/tokamak-proven-execution

Conversation

@cd4761 cd4761 commented Feb 25, 2026

Summary

  • Hive test integration: 6 suites (RPC Compat, Devp2p, Engine Auth/Cancun/Paris/Withdrawals) in pr-tokamak.yaml
  • Testnet sync verification: tokamak-sync.yaml (Hoodi/Sepolia, workflow_dispatch)
  • Feature flag safety: Quality Gate checks tokamak, tokamak-jit, tokamak-debugger, tokamak-l2
  • JIT phases 2-8B complete: revmc/LLVM backend, dual-execution validation, benchmarking
  • Roadmap + status tracking documents added

Test plan

  • Quality Gate passes (cargo check/clippy/test with tokamak features)
  • Docker build succeeds with --features tokamak-jit
  • Hive 6 suites pass (RPC Compat, Devp2p, Engine Auth, Cancun, Paris, Withdrawals)
  • JIT Backend build+test with revmc (LLVM 21)
  • Manual: trigger tokamak-sync.yaml for Hoodi sync verification

Complete Phase 0 analysis: evaluate ethrex, Reth, from-scratch, and
revm-only options via weighted decision matrix. ethrex fork selected
(score 4.85/5) for its custom LEVM, ZK-native architecture, Hook
system, and manageable 133K-line codebase.

Includes vision, competitive landscape, feature specs, team discussion
summaries, Volkov review history, and branch strategy.
- Rebalance decision matrix to ethrex vs Reth binary comparison;
  move "from scratch" and "revm only" to appendix
- Adjust Reth scores: ZK 1→2 (Zeth exists), manageability 2→3
  (modular arch acknowledged), sync 5→4 for ethrex (less battle-tested)
- Add EXIT criteria with 4 elements: metric, deadline, action, owner
- Add Tier S PoC section: perf_opcode_timings build verification
  and code path analysis
- Add JIT technical barriers (dynamic jumps, revmc reference)
- Fix weighted sum arithmetic (Reth 2.85→2.80)
Record completed work: DECISION.md creation, Volkov R6 review (6.5/10),
three mandatory fixes (matrix rebalance, EXIT criteria, Tier S PoC),
Reth/Zeth/ExEx research findings, and next steps for Phase 1.1.
- DECISION.md: DRAFT → FINAL
- Replace human staffing model with AI Agent development model
- Add bus factor policy (Kevin as interim decision-maker)
- Replace staffing risks with agent-specific risks
- Remove the "two senior Rust engineers" EXIT criterion
- Add 11 custom commands (.claude/commands/):
  - Development: /rust, /evm, /jit, /debugger, /l2
  - Verification: /quality-gate, /safety-review, /diff-test
  - Operations: /rebase-upstream, /phase, /bench
- Volkov R8: 7.5/10 PROCEED achieved
Architecture analysis documents:
- OVERVIEW.md: 25+2 crate dependency graph, node startup flow, CI inventory
- LEVM.md: VM struct, execution flow, dual-dispatch loop, hook system
- MODIFICATION-POINTS.md: 5 modification points, hybrid isolation strategy
- PHASE-1-1.md: Phase 1.1 execution plan with success criteria

Phase 1.1 infrastructure:
- Skeleton crates: tokamak-jit, tokamak-bench, tokamak-debugger
- Feature flag: `tokamak` propagation chain (cmd → vm → levm)
- Workspace registration for 3 new crates
- Fix OpcodeTimings: remove false min/max claim, document 4 actual fields
- Fix CallFrame: caller→msg_sender, Bytes→Code, return_data→output/sub_return_data
- Fix opcode table: describe const fn chaining pattern accurately
- Label all pseudocode snippets consistently (JIT, debugger, L2 hook)
- Plan feature flag split: tokamak → tokamak-jit/debugger/l2
- Add JIT-VM interface complexity analysis (5 challenges)
- Add failure scenarios & mitigations table (5 scenarios)
- Record build results: 5m53s clean, 718 tests passed
- Fix line count ~133K → ~103K (verified via wc -l)
- Add tokamak feature to OVERVIEW.md feature tables
Split monolithic `tokamak` feature into 3 independent features
(tokamak-jit, tokamak-debugger, tokamak-l2) with umbrella re-export.
Add pr-tokamak.yaml CI workflow for quality-gate and format checks.
Update snapsync action default image to tokamak-network/ethrex.
Document sync architecture, Hive test matrix, and success criteria.
Add structured benchmark infrastructure to tokamak-bench crate:
- timings.rs: reset(), raw_totals(), raw_counts() accessors
- tokamak-bench: types, runner, report, regression modules + CLI binary
- CI workflow: pr-tokamak-bench.yaml (bench PR vs base, post comparison)
- 11 unit tests covering regression detection, JSON roundtrip, reporting
Feature unification causes these modules to be compiled during L2
workspace clippy. Add targeted allows for arithmetic_side_effects,
as_conversions, expect_used, and unsafe_code lints.
Add the core JIT tiered compilation modules that were missing from
the branch: execution counter, code cache dispatch, types, and
module declaration. These provide the lightweight in-process
infrastructure gated behind the tokamak-jit feature flag.
- tokamak-jit: compiler, backend, adapter, validation, error modules
- JIT backend CI job with LLVM 18 in pr-tokamak.yaml
- jit_bench module in tokamak-bench for interpreter vs JIT comparison
- Phase 2 architecture documentation
- Updated HANDOFF with current status
Add Phase 3 JIT execution wiring so JIT-compiled bytecode actually
runs through the VM dispatch instead of only being compiled.

Key changes:
- JitBackend trait in dispatch.rs for dependency inversion (LEVM
  defines interface, tokamak-jit implements)
- LevmHost: revm Host v14.0 implementation backed by LEVM state
  (GeneralizedDatabase, Substate, Environment)
- Execution bridge: builds revm Interpreter, wraps state in LevmHost,
  transmutes CompiledCode to EvmCompilerFn, maps result to JitOutcome
- vm.rs wiring: try_jit_dispatch() && execute_jit() before interpreter
  loop, with fallback on failure
- register_jit_backend() for startup registration
- E2E tests: fibonacci JIT execution + JIT vs interpreter validation
  (behind revmc-backend feature, requires LLVM 21)
Close 7 gaps preventing production use of the JIT system:

- 4A: Propagate is_static from CallFrame to revm Interpreter
- 4B: Sync gas refunds after JIT execution, pass storage_original_values
  through JIT chain for correct SSTORE original vs present value
- 4C: Add LRU eviction to CodeCache (VecDeque + max_entries)
- 4D: Auto-compile when execution counter hits threshold, add compile()
  to JitBackend trait and backend() accessor to JitState
- 4E: Detect CALL/CREATE/DELEGATECALL/STATICCALL opcodes in analyzer,
  skip JIT compilation for contracts with external calls
- 4F: Skip JIT when tracer is active, add JitMetrics with atomic
  counters, log fallback events via eprintln
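
The external-call detection in 4E can be sketched as a linear scan over the bytecode. This is a minimal illustration, not the code in analyzer.rs: the function name and the exact opcode set are assumptions (CALLCODE, CREATE2 and SELFDESTRUCT are included here for completeness). The one detail that matters is skipping PUSH immediates, so that data bytes are not mistaken for opcodes.

```rust
// Opcode values from the EVM instruction set.
const CREATE: u8 = 0xF0;
const CALL: u8 = 0xF1;
const CALLCODE: u8 = 0xF2;
const DELEGATECALL: u8 = 0xF4;
const CREATE2: u8 = 0xF5;
const STATICCALL: u8 = 0xFA;
const SELFDESTRUCT: u8 = 0xFF;

/// Hypothetical helper: returns true if `code` contains an opcode that
/// leaves the current frame, ignoring bytes that are PUSH immediates.
pub fn has_external_calls(code: &[u8]) -> bool {
    let mut pc = 0usize;
    while pc < code.len() {
        let op = code[pc];
        match op {
            CREATE | CALL | CALLCODE | DELEGATECALL | CREATE2 | STATICCALL | SELFDESTRUCT => {
                return true;
            }
            // PUSH1..=PUSH32 (0x60..=0x7F) carry 1..=32 immediate bytes; skip them.
            0x60..=0x7F => pc += usize::from(op - 0x5F),
            _ => {}
        }
        pc += 1;
    }
    false
}
```

With this shape, `PUSH1 0xF1` is not reported as a CALL, while a bare `0xF1` byte in opcode position is.
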
…compilation, and validation

Phase 5 addresses three remaining JIT gaps:

5A — Multi-fork support: Cache key changed from H256 to (H256, Fork) so the
same bytecode compiled at different forks gets separate cache entries.
fork_to_spec_id() adapter added. Hardcoded SpecId::CANCUN removed from
compiler, execution, and host — all now use the environment's fork.

5B — Background async compilation: New CompilerThread with std::sync::mpsc
channel and a single background thread. On threshold hit, vm.rs tries
request_compilation() first (non-blocking); falls back to synchronous
compile if no thread is registered. register_jit_backend() now also
starts the background compiler thread.
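
A rough sketch of the 5B pattern follows: one worker thread fed through a `std::sync::mpsc` channel. The request type, the `start` constructor and the closure-based compile callback are assumptions made for illustration; only `CompilerThread` and `request_compilation` are names taken from the description above. The `Drop` impl shows one way to shut the worker down cleanly by closing the channel and joining.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical request payload: (bytecode hash, bytecode).
pub type CompileRequest = ([u8; 32], Vec<u8>);

pub struct CompilerThread {
    sender: Option<Sender<CompileRequest>>,
    handle: Option<JoinHandle<()>>,
}

impl CompilerThread {
    /// Spawns a single worker that calls `compile` for every queued request.
    pub fn start<F>(compile: F) -> Self
    where
        F: Fn(CompileRequest) + Send + 'static,
    {
        let (sender, receiver) = mpsc::channel::<CompileRequest>();
        let handle = thread::spawn(move || {
            // `recv` returns Err once every sender is dropped, ending the loop.
            while let Ok(request) = receiver.recv() {
                compile(request);
            }
        });
        Self { sender: Some(sender), handle: Some(handle) }
    }

    /// Queues a request without blocking; returns false if the worker is gone.
    pub fn request_compilation(&self, request: CompileRequest) -> bool {
        match &self.sender {
            Some(sender) => sender.send(request).is_ok(),
            None => false,
        }
    }
}

impl Drop for CompilerThread {
    fn drop(&mut self) {
        // Dropping the sender closes the channel, which stops the worker loop.
        drop(self.sender.take());
        if let Some(handle) = self.handle.take() {
            if handle.join().is_err() {
                eprintln!("[JIT] compiler thread panicked");
            }
        }
    }
}
```

A caller would try `request_compilation` first and fall back to a synchronous compile when it returns false.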

5C — Validation mode wiring: JitConfig.max_validation_runs (default 3)
gates logging to first N executions per (hash, fork). JitState tracks
validation_counts and logs [JIT-VALIDATE] with gas_used and output_len
for offline comparison. Full dual-execution deferred to Phase 6.
M1: CompilerThread now implements Drop — drops sender to signal
    shutdown, then joins the background thread. Panics are caught
    and logged (no silent swallowing). Fields changed to Option
    for take-on-drop pattern.

M2: SELFDESTRUCT (0xFF) added to has_external_calls detection in
    analyzer.rs. Bytecodes containing SELFDESTRUCT are now skipped
    by the JIT compiler, preventing the incomplete Host::selfdestruct
    (missing balance transfer) from being exercised.

M3: Negative gas refund cast fixed in execution.rs. Previously
    `refunded as u64` would wrap negative i64 (EIP-3529) to a huge
    u64. Now uses `u64::try_from(refunded)` — negative values are
    silently ignored (already reflected in gas remaining).

M4: Documented fork assumption in counter.rs and vm.rs. Counter is
    keyed by bytecode hash only (not fork). Safe because forks don't
    change during a node's runtime; cache miss on new fork falls back
    to interpreter.
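
The M3 fix comes down to the difference between an `as` cast and a checked conversion. A minimal sketch, with a hypothetical function name:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts a signed refund counter into the unsigned
/// amount to credit. Negative values yield 0 instead of wrapping.
pub fn refund_to_credit(refunded: i64) -> u64 {
    // `refunded as u64` would turn -1 into u64::MAX; `try_from` rejects it.
    u64::try_from(refunded).unwrap_or(0)
}
```

Here a negative refund is treated as "nothing to credit", which matches the description above that negative values are ignored.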
…ment

Phase 6A — CALL/CREATE Resume:
- Add JitResumeState, SubCallResult, JitSubCall types for suspend/resume
- Add JitOutcome::Suspended variant for mid-execution suspension
- Extend JitBackend trait with execute_resume for resume-after-subcall
- Rewrite execution.rs: single-step execute, translate_frame_input,
  apply_subcall_result, handle_interpreter_action
- Add resume loop in vm.rs JIT dispatch block
- Add handle_jit_subcall() to execute sub-calls via LEVM interpreter
- Add run_subcall() with depth-bounded interpreter loop
- Remove has_external_calls compilation gate in backend.rs

Phase 6B — LLVM Memory Management:
- Add func_id: Option<u32> to CompiledCode for lifecycle tracking
- Return evicted func_id from CodeCache::insert() on eviction
- Add CompilerRequest enum (Compile/Free) to compiler_thread
- Add send_free() method for cache eviction notifications
- Wire Free request handling in register_jit_backend()
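
The eviction hand-off in 6B can be illustrated with a small bounded cache whose `insert` returns the `func_id` of whatever entry it displaced, so the caller can forward it as a Free request. This is a sketch under assumptions: the key type, the fork represented as a `u8`, and the insertion-order eviction policy are all illustrative, and the real `CodeCache` may differ.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical cache key: (bytecode hash, fork id).
pub type CacheKey = ([u8; 32], u8);

#[derive(Debug, Clone, PartialEq)]
pub struct CompiledCode {
    pub func_id: Option<u32>,
}

pub struct CodeCache {
    entries: HashMap<CacheKey, CompiledCode>,
    order: VecDeque<CacheKey>, // oldest insertion at the front
    max_entries: usize,
}

impl CodeCache {
    pub fn new(max_entries: usize) -> Self {
        Self { entries: HashMap::new(), order: VecDeque::new(), max_entries }
    }

    /// Inserts `code` and returns the func_id of any entry that was
    /// replaced or evicted, so the caller can release its JIT memory.
    pub fn insert(&mut self, key: CacheKey, code: CompiledCode) -> Option<u32> {
        if let Some(replaced) = self.entries.insert(key, code) {
            return replaced.func_id; // same key: order queue is unchanged
        }
        self.order.push_back(key);
        if self.entries.len() <= self.max_entries {
            return None;
        }
        let oldest = self.order.pop_front()?;
        self.entries.remove(&oldest).and_then(|evicted| evicted.func_id)
    }

    pub fn get(&self, key: &CacheKey) -> Option<&CompiledCode> {
        self.entries.get(key)
    }
}
```
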
M1: Credit unused child gas back to revm interpreter via erase_cost()
M2: Write CALL output to interpreter memory at return_memory_offset
M3: Complete CREATE semantics (EIP-3860 initcode limit, nonce increment,
    EIP-170 code size check, deploy code storage)
M4: Extract shared interpreter_loop(stop_depth) to eliminate opcode
    dispatch table duplication between run_execution and run_subcall
M5: Add 7 tests for CALL/CREATE resume path (subcall.rs)
M6: Add balance validation before transfer in handle_jit_subcall
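
The size and nonce checks listed under M3 can be written down compactly. The limits are the ones defined by EIP-170 and EIP-3860; the function and error names are hypothetical and only sketch where each check sits in a CREATE.

```rust
/// EIP-170: maximum deployed code size (0x6000 bytes).
pub const MAX_CODE_SIZE: usize = 24_576;
/// EIP-3860: maximum initcode size, twice the EIP-170 limit.
pub const MAX_INITCODE_SIZE: usize = 2 * MAX_CODE_SIZE;

#[derive(Debug, PartialEq)]
pub enum CreateError {
    InitcodeTooLarge,
    NonceOverflow,
    CodeTooLarge,
}

/// Checked before the init code runs.
pub fn check_initcode(initcode: &[u8]) -> Result<(), CreateError> {
    if initcode.len() > MAX_INITCODE_SIZE {
        return Err(CreateError::InitcodeTooLarge);
    }
    Ok(())
}

/// The creator's nonce is incremented before the init code runs;
/// a nonce already at u64::MAX cannot be incremented.
pub fn bump_nonce(nonce: u64) -> Result<u64, CreateError> {
    nonce.checked_add(1).ok_or(CreateError::NonceOverflow)
}

/// Checked on the code returned by the init code, before it is stored.
pub fn check_deployed_code(code: &[u8]) -> Result<(), CreateError> {
    if code.len() > MAX_CODE_SIZE {
        return Err(CreateError::CodeTooLarge);
    }
    Ok(())
}
```
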
R13 fixes (3.0 → 6.0):
- M1: JIT CREATE tests exercising handle_jit_subcall CREATE arm
- M2: EIP-7702 delegation gap documented with TODO comment
- M3: Use from_bytecode_unchecked for CREATE init code
- R1: Precompile value transfer test with identity precompile
- R2: Non-precompile transfer guard aligned with generic_call
- R3: Comment reference format unified (no line numbers)

R14 fixes:
- M1: JitState::reset_for_testing() with clear() on CodeCache,
  ExecutionCounter, JitMetrics for test isolation across #[serial] tests
- M2: Differential JIT vs interpreter comparison in CREATE tests with
  jit_executions metrics assertion proving JIT path execution
- M3: Remaining line number reference removed from vm.rs
- R1: Precompile test strengthened with interpreter baseline comparison
- R2: CREATE collision JIT test with pre-seeded address verification

handle_jit_subcall CALL path: balance check, precompile BAL recording,
value transfer with EIP-7708 log, non-precompile BAL checkpoint.
handle_jit_subcall CREATE path: max nonce check, add_accessed_address,
BAL recording, collision check, deploy nonce, EIP-7708 log.
Gate test-only methods (reset_for_testing, clear, reset) behind
#[cfg(any(test, feature = "test-utils"))] to prevent production
exposure. Add missing reset_for_testing() calls to remaining serial
tests, gas_used differential assertions, and unit tests for new methods.
Add #[serial] + reset_for_testing() to storage test whose interpreter
path mutates global JIT_STATE. Add gas_used differential comparison to
fibonacci and staticcall JIT vs interpreter tests.
Replace the logging-only JIT validation with real dual execution:
when validation mode is active, the VM snapshots state before JIT,
executes JIT, then replays via interpreter on the original state
using mem::swap, and compares results. Mismatches trigger cache
invalidation and fallback to the interpreter result.

- Add Memory::deep_clone(), CallFrame::snapshot(), Substate::snapshot()
- Add validation.rs with DualExecutionResult comparison logic and tests
- Add validation_successes/validation_mismatches metrics to JitMetrics
- Add bytecode_has_external_calls() to skip validation for CALL/CREATE
  contracts where the state-swap mechanism cannot replay subcalls
- All new code feature-gated behind tokamak-jit
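
The snapshot, swap and compare sequence can be sketched with a toy state type. Everything here is a simplification under assumptions: `State` and `Outcome` stand in for the VM's db, call frame and substate, and the two executors are passed as closures. Only the ordering of the `mem::swap` calls is the point.

```rust
use std::collections::BTreeMap;
use std::mem;

/// Toy stand-in for the VM state that validation snapshots.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct State {
    pub storage: BTreeMap<u64, u64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Outcome {
    pub gas_used: u64,
    pub output: Vec<u8>,
}

/// Runs `jit` and `interp` from the same starting state. Returns the outcome
/// to use and whether both executions agreed on outcome and final state.
pub fn validate_dual_execution<J, I>(state: &mut State, jit: J, interp: I) -> (Outcome, bool)
where
    J: FnOnce(&mut State) -> Outcome,
    I: FnOnce(&mut State) -> Outcome,
{
    let mut other = state.clone(); // snapshot of the original state
    let jit_outcome = jit(state); // `state` now holds the JIT result
    mem::swap(state, &mut other); // `state` = original, `other` = JIT result
    let interp_outcome = interp(state); // `state` now holds the interpreter result
    if jit_outcome == interp_outcome && *state == other {
        mem::swap(state, &mut other); // match: restore the JIT state
        (jit_outcome, true)
    } else {
        // Mismatch: keep the interpreter state and result. The caller would
        // also invalidate the cached JIT code at this point.
        (interp_outcome, false)
    }
}
```
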
Four mandatory fixes from Comrade Volkov's review of Phase 7:

1. DB state comparison: validate_dual_execution now compares account
   balances, nonces, and storage slots for all modified accounts between
   JIT and interpreter execution snapshots.

2. Integration tests: two new tests in dual_execution.rs exercising the
   full VM dispatch path — one verifying Match via real revmc backend,
   one verifying mismatch triggers cache invalidation via mock backend.

3. interpreter_loop Err handling: validation replay failure now swaps
   back to JIT state instead of propagating with corrupted VM state.

4. Cache has_external_calls: CompiledCode now stores has_external_calls
   from AnalyzedBytecode at compile time, avoiding bytecode re-scan on
   every dispatch.
- Extract 4-way mem::swap into swap_validation_state() helper with
  ValidationSnapshot type alias, eliminating DRY violation (3x repeat)
- Add AccountStatus comparison in compare_account_states() to catch
  Destroyed vs Modified divergence, with unit test
- Add integration test for interpreter Err recovery path using
  FailingDatabase + SuccessBackend mock to verify swap-back restores
  JIT state when interpreter_loop returns InternalError
Add code_hash comparison to compare_account_states() in dual-execution
validation. CREATE/CREATE2 may deploy different code, so JIT vs
interpreter code_hash divergence must be detected.
Add JIT vs interpreter performance comparison to tokamak-bench,
feature-gated behind `jit-bench` so existing interpreter-only CI
works without LLVM 21.

- Add `tokamak-jit` optional dep and `jit-bench` feature gate
- Add `JitBenchResult` and `JitBenchSuite` types
- Implement full JIT benchmark runner (init, compile, prime counter,
  measure interpreter baseline then JIT execution, compute speedup)
- Add `jit-bench` CLI subcommand with --scenarios, --runs, --markdown
- Add JIT suite JSON/markdown report generation
- Make runner helpers pub(crate) for reuse by jit_bench module
- Update HANDOFF with Phase 8 completion
- Fix double mutable borrow in host.rs load_account_info_skip_cold_load
- Fix must_use warning on gas.record_cost in adapter.rs
- Add mem::forget(compiler) to prevent LLVM execution engine from freeing
  JIT code memory (dangling function pointer fix)
- Add jit_get_storage_value/jit_update_account_storage helpers in host.rs
  for proper storage access patterns
- Fix contracts_bin_dir() path resolution in runner.rs
- Handle JIT compilation failures gracefully in jit_bench.rs (skip
  scenarios with bytecode > 24KB revmc limit)
- Add [profile.jit-bench] cargo profile (no LTO) to avoid release-mode
  LLVM initialization segfault
- Remove debug eprintln statements from compiler.rs and execution.rs

Benchmark results (10 runs, jit-bench profile):
  Fibonacci:  1.21x speedup (3.02ms → 2.49ms)
  Factorial:  1.06x speedup (1.39ms → 1.31ms)
  ManyHashes: 1.38x speedup (3.43ms → 2.47ms)
  BubbleSort: 1.01x speedup (343ms → 338ms)
M1 (CRITICAL): Fix benchmark measuring JIT vs JIT. Interpreter baseline
now runs BEFORE JIT compilation using init_vm_interpreter_only() which
sets tracer.active=true to block JIT dispatch.

M2 (CRITICAL): Add storage rollback on REVERT. LevmHost now journals
all SSTORE writes and replays them in reverse on Revert. Journal
persists across suspend/resume cycles via JitResumeStateInner.

M3 (HIGH): Fix gas mismatch in apply_jit_outcome. Gas is now computed
from call_frame (gas_limit - gas_remaining) matching execution_handlers,
instead of using JitOutcome's execution-only gas_used.

M4 (MEDIUM): Add EIP-7928 BAL recording gap TODO comments at all four
JIT storage methods in host.rs.

M5 (MEDIUM): Enhance mem::forget SAFETY comment with memory impact
description and production mitigation guidance.
R22-1: Add SSTORE→REVERT rollback tests. Two JIT E2E tests verify that
storage journal correctly restores original values on REVERT:
- Single SSTORE→REVERT: slot 0 = 5 → write 0x42 → REVERT → slot 0 = 5
- Multi SSTORE→REVERT: slot 0 = 5 → write 10,20,30 → REVERT → slot 0 = 5
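
The rollback behaviour these tests exercise can be modelled with a write journal that records the previous value of each slot and replays it in reverse. This is a simplified sketch, not LevmHost itself: the type, its methods and the `u64` slot and value types are assumptions chosen to keep the example short.

```rust
use std::collections::HashMap;

pub type Slot = u64;
pub type Value = u64;

/// Hypothetical journaled storage: every write remembers what it overwrote.
#[derive(Default)]
pub struct JournaledStorage {
    slots: HashMap<Slot, Value>,
    journal: Vec<(Slot, Value)>, // (slot, value before the write)
}

impl JournaledStorage {
    pub fn sload(&self, slot: Slot) -> Value {
        self.slots.get(&slot).copied().unwrap_or(0)
    }

    pub fn sstore(&mut self, slot: Slot, value: Value) {
        let previous = self.sload(slot);
        self.journal.push((slot, previous));
        self.slots.insert(slot, value);
    }

    /// Undoes all journaled writes, newest first, so the oldest
    /// recorded value for each slot is the one that ends up restored.
    pub fn revert(&mut self) {
        while let Some((slot, previous)) = self.journal.pop() {
            self.slots.insert(slot, previous);
        }
    }

    /// Makes the current values permanent by discarding the journal.
    pub fn commit(&mut self) {
        self.journal.clear();
    }
}
```

Replaying in reverse matters for the multi-write case: after writing 10, 20 and 30 to the same slot, the last entry popped is the one holding the original 5.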

R22-2: Add JIT gas comparison test. Verifies apply_jit_outcome formula
(gas_limit - max(gas_remaining, 0)) produces correct gas_used. Fix
pre-existing test that incorrectly compared JitOutcome::gas_used
(execution-only) against interpreter gas_used (includes intrinsic).

R22-3: Re-run benchmarks with corrected M1 measurement:
- Fibonacci: 1.21x → 2.53x
- Factorial: 1.67x
- ManyHashes: 1.46x
- BubbleSort: 1.01x → 2.24x

R22-4: Add gas_remaining negative defense. Clamp gas_remaining to
max(0) before i64→u64 conversion in apply_jit_outcome to prevent
wrap-around on negative values.
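
The clamp in R22-4 and the formula in R22-2 fit in a few lines. The function name and signature are assumptions; the arithmetic is the `gas_limit - max(gas_remaining, 0)` rule stated above.

```rust
/// Hypothetical helper: gas_used = gas_limit - max(gas_remaining, 0).
pub fn gas_used(gas_limit: u64, gas_remaining: i64) -> u64 {
    // After clamping to zero the value is non-negative, so the cast is lossless.
    let remaining = gas_remaining.max(0) as u64;
    gas_limit.saturating_sub(remaining)
}
```

Without the clamp, a `gas_remaining` of -5 would be cast to a value near `u64::MAX` and the subtraction would be meaningless.
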
M1: Fix tautological gas test — test_jit_gas_matches_interpreter now
cross-checks JIT execution gas + intrinsic (21000) against interpreter
gas_used from stateless_execute(), instead of comparing two values
derived from the same revm Gas struct.

M2: Add negative gas_remaining unit tests — two tests in vm::jit_tests
verify that apply_jit_outcome's max(0) clamp prevents i64→u64
wraparound when gas_remaining is negative (Success and Revert arms).

M3: Add different-slot REVERT test — test_two_slot_sstore_revert_rollback
writes to slot 0 AND slot 1 before REVERT, verifying the journal
correctly restores both distinct storage locations.
…-jit

Add Docker build with --features tokamak-jit and 6-suite Hive test matrix
to pr-tokamak.yaml, gated behind quality-gate. Create tokamak-sync.yaml
for manual Hoodi/Sepolia sync verification. Add build_flags input to
snapsync-run composite action for feature-flagged builds.
Phase A infra built (Hive CI, Sync workflow, feature flag checks).
CI verification pending — these docs track remaining work accurately.
@cd4761 cd4761 requested a review from a team as a code owner February 25, 2026 00:51
@cd4761 cd4761 changed the title feat(tokamak): proven execution — Hive, Sync, JIT phases 0-8B feat(levm): proven execution — Hive, Sync, JIT phases 0-8B Feb 25, 2026
- Pass github.head_ref through env var to avoid script injection
- Apply cargo fmt to tokamak-bench crate

greptile-apps bot commented Feb 25, 2026

Greptile Summary

This PR implements a JIT-compiled EVM backend using revmc/LLVM (Phases 2-8B), along with comprehensive CI infrastructure for Hive testing and testnet sync verification.

Key Changes

JIT Compiler Implementation (tokamak-jit crate)

  • revmc/LLVM backend integration with tiered compilation (execution counter → auto-compile at threshold)
  • Multi-fork support with (bytecode_hash, fork) cache keys
  • Suspend/resume mechanism for CALL/CREATE opcodes via revm's InterpreterAction::NewFrame
  • LevmHost implementing revm's Host trait to bridge LEVM state to JIT-compiled code
  • Storage journal for rollback on REVERT

Dual-Execution Validation (levm/jit module)

  • State-swap validation: snapshot before JIT → execute JIT → swap → execute interpreter → compare
  • Comprehensive comparison: status, gas, output, refunded gas, logs, and full DB state (balance, nonce, storage)
  • Cache invalidation on mismatch, automatic fallback to interpreter result
  • Validation mode active for first 3 runs per (hash, fork) pair

CI/CD Infrastructure

  • 6 Hive test suites (RPC Compat, Devp2p, Engine Auth/Cancun/Paris/Withdrawals) in pr-tokamak.yaml
  • Testnet sync verification workflow (tokamak-sync.yaml) for Hoodi/Sepolia with manual dispatch
  • Quality Gate checks for 4 feature flags: tokamak, tokamak-jit, tokamak-debugger, tokamak-l2
  • Docker build with --features tokamak-jit
  • JIT backend build job with LLVM 21 (continue-on-error due to LLVM availability variance)

Benchmarking Infrastructure

  • tokamak-bench crate with 12 scenarios (Fibonacci, BubbleSort, ERC20, etc.)
  • Dual-mode execution (interpreter vs JIT) with speedup measurement
  • Regression detection framework (not yet integrated into PR CI)
  • Initial results: 1.46-2.53x speedup on pure-computation benchmarks

Documentation

  • 4,688 lines of documentation covering architecture, roadmap, status tracking, and Volkov review responses
  • Honest progress reporting: Phases 2-8B complete (100%), Phase A infra built but verification pending

Known Limitations (Documented)

  • Memory leak: std::mem::forget(compiler) in compiler.rs:80 intentionally leaks LLVM context (1-5MB per contract) for PoC. Production needs persistent context with LRU eviction.
  • EIP-7928 BAL recording: Not implemented for JIT path (4 TODO comments in host.rs). May cause compliance issues when EIP-7928 activates.
  • Fork upgrade assumption: Execution counter keyed by hash only, not (hash, fork). Assumes forks don't change during runtime. Live fork transitions may require manual cache invalidation.
  • revmc bytecode size limit: 24KB limit means some contracts (large PUSH sequences, MSTORE/SSTORE benchmarks) cannot be JIT-compiled.
  • Recursive CALL performance: Suspend/resume overhead makes recursive contracts slower than interpreter.

Test Coverage

  • 39 LEVM JIT tests + 19 tokamak-jit tests passing
  • 3,917 lines of test code in tokamak-jit/src/tests
  • Integration tests for dual-execution validation (Match and Mismatch paths)
  • Serial tests with JIT_STATE.reset_for_testing() for isolation

Verification Status (Per ROADMAP-REMAINING.md)

Phase A: Infrastructure Built, Verification Pending

  • CI workflows created and merged ✅
  • Hive 6 suites configured ✅
  • Hoodi/Sepolia sync workflow ready ✅
  • NOT YET EXECUTED: Hive tests, testnet sync, feature flag safety comparison

Confidence Score: 3/5

  • This PR is safe to merge for continued development but NOT production-ready. The JIT implementation is well-architected with comprehensive validation, but has documented PoC limitations (memory leaks, missing EIP-7928 support) and lacks real-world verification (Hive/sync tests not yet executed).
  • Score of 3 reflects solid engineering quality (clean architecture, extensive tests, dual-execution validation) balanced against unverified production readiness. The CI infrastructure is built but the critical verification step (Hive pass rate, testnet sync) has not been run. Memory leak in compiler.rs is intentional for PoC but would be a blocker for production. Feature flags provide isolation, reducing risk to main branch.
  • Pay close attention to crates/vm/tokamak-jit/src/compiler.rs (memory leak), crates/vm/levm/src/vm.rs (complex state-swap validation logic), and verify that Hive/sync tests pass before considering this production-ready.

Important Files Changed

| Filename | Overview |
|---|---|
| .github/workflows/pr-tokamak.yaml | Adds comprehensive CI: 6 Hive suites, quality gate for 4 feature flags, JIT backend build with LLVM 21, Docker build verification. Uses deprecated apt-key. |
| crates/vm/tokamak-jit/src/compiler.rs | revmc/LLVM compilation wrapper. Intentionally leaks compiler context for PoC (1-5MB per contract). Needs persistent LLVM context for production. |
| crates/vm/tokamak-jit/src/execution.rs | JIT execution bridge: builds revm Interpreter/Host, executes compiled code, handles suspend/resume for CALL/CREATE. Storage journal rollback on revert implemented correctly. |
| crates/vm/tokamak-jit/src/host.rs | LevmHost implements revm Host trait, bridges LEVM state to JIT-compiled code. EIP-7928 BAL recording not implemented (4 TODO comments). SLOAD/SSTORE logic looks correct. |
| crates/vm/levm/src/vm.rs | Massive JIT integration (~800 lines added): counter increment, auto-compile, dispatch, suspend/resume loop, dual-execution validation with state-swap. Complex but well-documented. |
| crates/vm/levm/src/jit/dispatch.rs | JIT state management: cache, counter, config, backend registration, validation counts. Clean dependency inversion via JitBackend trait. |
| crates/vm/levm/src/jit/validation.rs | Dual-execution validation: compares JIT vs interpreter on status, gas, output, refunded gas, logs, and full DB state (balance, nonce, storage). Comprehensive coverage. |
| crates/vm/levm/src/jit/cache.rs | Thread-safe code cache with FIFO eviction. Function pointers stored type-erased. Safe for concurrent reads. Note: LLVM memory not freed on eviction (revmc limitation). |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[VM.run_execution] --> B{JIT feature enabled?}
    B -->|No| Z[Interpreter Loop]
    B -->|Yes| C[Increment execution counter]
    C --> D{Count == threshold?}
    D -->|Yes| E[Compile via revmc/LLVM]
    E --> F[Insert into JIT_STATE.cache]
    D -->|No| G{Bytecode in cache?}
    F --> G
    G -->|No| Z
    G -->|Yes| H{Validation mode?}
    H -->|No| I[Execute JIT]
    H -->|Yes| J[Snapshot state: db, call_frame, substate]
    J --> K[Execute JIT]
    K --> L{Suspended for CALL/CREATE?}
    L -->|Yes| M[Execute subcall via interpreter]
    M --> N[Resume JIT with subcall result]
    N --> L
    L -->|No| O[Swap to snapshot state]
    O --> P[Execute Interpreter]
    P --> Q[Compare: gas, output, logs, DB state]
    Q --> R{Match?}
    R -->|Yes| S[Swap back to JIT state]
    S --> T[Return JIT result]
    R -->|No| U[Cache.invalidate]
    U --> V[Return interpreter result]
    I --> W{Suspended?}
    W -->|Yes| M
    W -->|No| X[Return JIT result]

Last reviewed commit: cbef92f

@greptile-apps greptile-apps bot left a comment

84 files reviewed, 4 comments

Comment on lines +73 to +74
run: |
wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -

Using apt-key add is deprecated. Consider using signed-by in the repository definition instead:

Suggested change
- run: |
- wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
+ wget -qO /tmp/llvm.key https://apt.llvm.org/llvm-snapshot.gpg.key
+ echo "deb [signed-by=/tmp/llvm.key] http://apt.llvm.org/jammy/ llvm-toolchain-jammy-21 main" | sudo tee /etc/apt/sources.list.d/llvm-21.list

Comment on lines +70 to +80
// SAFETY: The compiled function pointer is owned by the LLVM execution engine
// inside the compiler/backend. Dropping the compiler would free the JIT code
// memory, invalidating the pointer. We intentionally leak the compiler so the
// JIT code lives for the entire process lifetime.
//
// MEMORY IMPACT: Each compilation leaks one EvmCompiler + EvmLlvmBackend
// (~1-5 MB LLVM module/machine code per contract). In a long-running node,
// this grows proportionally to the number of unique contracts compiled.
// Acceptable for PoC; production should use a persistent LLVM context with
// explicit lifetime management or a bounded LRU eviction policy.
std::mem::forget(compiler);

Memory leak: std::mem::forget(compiler) intentionally leaks LLVM context. Each compilation permanently allocates 1-5 MB that's never freed. In long-running production nodes, this grows unbounded proportional to unique contracts encountered. Consider implementing a persistent LLVM context with LRU eviction or reference-counted lifetime management.

Comment on lines +8 to +17
//!
//! The counter is keyed by bytecode hash only (not `(hash, fork)`).
//! This means the compilation threshold fires once per bytecode regardless
//! of fork. This is correct under the assumption that **forks do not change
//! during a node's runtime** — a node runs at a single fork for any given
//! block height. If this assumption is violated (e.g., fork upgrade during
//! live operation), bytecodes compiled for the old fork would not be
//! recompiled for the new fork via the threshold mechanism. The cache
//! lookup (`try_jit_dispatch`) would return `None` for the new fork key,
//! causing a safe fallback to the interpreter.

Fork assumption may break during runtime fork upgrades. The counter is keyed by bytecode hash only, not (hash, fork). When a fork upgrade occurs (e.g. Cancun → Prague), bytecode compiled for Cancun won't trigger recompilation for Prague until manually invalidated. Consider documenting mitigation strategy for live fork transitions or adding fork-aware counter keys.
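
One way to read the "fork-aware counter keys" suggestion is sketched below. It is only an illustration of the idea, not the existing counter.rs: the `Fork` enum, the `record` method and the threshold handling are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical fork identifier; the real enum has many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Fork {
    Cancun,
    Prague,
}

/// Execution counter keyed by (bytecode hash, fork) instead of hash only.
pub struct ExecutionCounter {
    counts: HashMap<([u8; 32], Fork), u64>,
    threshold: u64,
}

impl ExecutionCounter {
    pub fn new(threshold: u64) -> Self {
        Self { counts: HashMap::new(), threshold }
    }

    /// Records one execution and returns true exactly when the count for
    /// this (hash, fork) pair reaches the compilation threshold.
    pub fn record(&mut self, hash: [u8; 32], fork: Fork) -> bool {
        let count = self.counts.entry((hash, fork)).or_insert(0);
        *count = count.saturating_add(1);
        *count == self.threshold
    }
}
```

With this keying, the same bytecode crosses the threshold once per fork, so a fork transition would trigger a fresh compilation rather than relying on the interpreter fallback. The cost is one extra counter entry per fork per contract.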

Comment on lines +210 to +212
// TODO(JIT): EIP-7928 BAL recording not implemented for JIT execution path.
// LEVM's get_storage_value records BAL entries via bal_recorder. The JIT path
// bypasses this. Add BAL recording when JIT moves beyond PoC phase.

EIP-7928 BAL (Block-Level Access List) recording is not implemented for the JIT execution path. LEVM's get_storage_value records BAL entries via bal_recorder, but the JIT path (jit_get_storage_value) bypasses this. This may cause compliance issues once EIP-7928 is activated.

Fixes format differences between local and CI environments.
Upstream added PrecompileCache parameter to execute_precompile.
Updated JIT dispatch call sites to pass precompile_cache().
B-2 (Volkov R24): Extract shared test helpers (make_test_db, make_test_env,
make_test_tx) to eliminate 15+ duplicate DB setup patterns across tokamak-jit
test files. Replace magic number 21_000u64 with INTRINSIC_GAS constant.

C-3: Add per-run timing collection, warmup run discarding, and statistical
analysis (mean, stddev, 95% CI) to tokamak-bench. New stats module with
BenchStats struct, --warmup CLI parameter, and suite_stats_to_markdown().
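
The statistics named in C-3 can be sketched as follows. The field layout of `BenchStats`, the function name, and the use of a normal approximation (z = 1.96) for the 95% interval are assumptions; the actual stats module may compute the interval differently.

```rust
/// Hypothetical layout; only the struct name comes from the commit message.
#[derive(Debug, Clone, PartialEq)]
pub struct BenchStats {
    pub mean: f64,
    pub stddev: f64,
    pub ci95: (f64, f64),
}

/// Discards the first `warmup` samples, then computes the mean, the sample
/// standard deviation and a normal-approximation 95% confidence interval.
/// Returns None when fewer than two samples remain.
pub fn bench_stats(samples: &[f64], warmup: usize) -> Option<BenchStats> {
    let kept = samples.get(warmup..)?;
    if kept.len() < 2 {
        return None;
    }
    let n = kept.len() as f64;
    let mean = kept.iter().sum::<f64>() / n;
    // Sample variance (divides by n - 1).
    let variance = kept.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let stddev = variance.sqrt();
    let half_width = 1.96 * stddev / n.sqrt();
    Some(BenchStats { mean, stddev, ci95: (mean - half_width, mean + half_width) })
}
```

For small run counts a Student's t quantile would give a wider, more honest interval than the fixed 1.96 used here.
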
Apply cargo fmt to tokamak-bench and tokamak-jit after merging
upstream changes including stats.rs and precompile cache updates.
A-1 Hive 6/6 PASS, A-3 Feature Flag verified, A-4 criteria 1-8 PASS.
Only A-2 Snapsync remains (manual workflow_dispatch needed).