-
Notifications
You must be signed in to change notification settings - Fork 30
feat: coverage-guided fuzzing for Tempo precompiles #245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: tempo
Are you sure you want to change the base?
Conversation
d0c244b to
f0193d1
Compare
Add SanitizerCoverage instrumentation support to collect edge coverage from native Rust precompile execution during Foundry invariant/fuzz tests. This makes the fuzzer coverage-guided for Tempo precompile code paths, which were previously invisible to the EVM-level EdgeCovInspector. Changes: - New crate: foundry-tempo-coverage (crates/evm/tempo-coverage/) Provides thread-local coverage map, SanitizerCoverage callbacks (__sanitizer_cov_trace_pc_guard, __sanitizer_cov_trace_pc_guard_init), and RAII guard for map lifecycle management. - New config field: tempo_precompile_coverage (bool, default false) Added to FuzzCorpusConfig, controls whether Tempo precompile coverage collection is active. When enabled, implicitly enables edge coverage. - Executor wiring (TempoCoverageGuard in tempo_cov.rs) Wraps EVM transact calls with a coverage map guard that collects sancov hits into a scratch buffer, then merges them into the RawCallResult edge_coverage for the CorpusManager to consume. - New build profile: fuzz (inherits release, no LTO/strip) Preserves sancov instrumentation through linking. - Build script: scripts/build-fuzz.sh Convenience script to build forge with sancov RUSTFLAGS. No changes to the tempo repository are required. SanitizerCoverage is a compiler pass applied via RUSTFLAGS at build time; the callback symbols are resolved by the linker from foundry-tempo-coverage. Co-authored-by: Amp <amp@ampcode.com> Amp-Thread-ID: https://ampcode.com/threads/T-019c2f47-ec1b-76d3-ac35-429e06bf2460
f0193d1 to
7c8e778
Compare
Replace global sancov RUSTFLAGS with a RUSTC_WRAPPER that only injects sancov flags when compiling the tempo-precompiles crate. This eliminates coverage noise from all other crates in the binary (6,810 guards vs previously instrumenting every crate). Since only tempo-precompiles carries sancov instrumentation, thin LTO can safely be re-enabled for the fuzz profile — the linker won't strip the callbacks from the precompile code, while all other crates benefit from LTO optimizations. Changes: - Add scripts/sancov-rustc-wrapper.sh (RUSTC_WRAPPER) - Update scripts/build-fuzz.sh to use wrapper instead of global flags - Simplify tempo-coverage lib.rs (no runtime filtering needed) - Remove object crate dependency from tempo-coverage - Set lto="thin" in [profile.fuzz] Amp-Thread-ID: https://ampcode.com/threads/T-019c2fb4-7c50-77eb-876f-1edf442f48bc Co-authored-by: Amp <amp@ampcode.com>
Add trace-cmp sanitizer coverage for tempo_precompiles to capture comparison operands and inject them into the fuzz dictionary, improving the fuzzer's ability to solve comparison guards in precompile code. Changes: - sancov-rustc-wrapper.sh: add -sanitizer-coverage-trace-compares flag - tempo-coverage/src/lib.rs: implement 9 __sanitizer_cov_trace_cmp callbacks + drain_cmp_operands() to capture comparison operands - executors/tempo_cov.rs: drain operands into RawCallResult.tempo_cmp_values - executors/mod.rs: add tempo_cmp_values field to RawCallResult - executors/invariant/mod.rs: inject captured values as persistent dictionary entries during collect_data phase Feedback loop improvements: - Persistent trace-cmp values: comparison operands now survive dictionary revert() across invariant runs, compounding over the campaign instead of being discarded every run - Havoc arg mutation post-pass: after structural corpus mutations (Splice/Repeat/Interleave/Prefix/Suffix), ~30% of calls get their args mutated using dictionary values, ensuring trace-cmp operands reach function arguments - Fresh random lane: 10% of corpus-guided runs generate pure-random sequences, recovering paths that corpus mutation alone might miss Co-authored-by: Amp <amp@ampcode.com> Amp-Thread-ID: https://ampcode.com/threads/T-019c323f-63e7-77db-9a97-3cd859f812a4
Weighted corpus scheduling: - Replace uniform random corpus selection with productivity-weighted sampling: weight = eps + (finds + 1) / (mutations + 10) - Unseen entries (0 mutations) are always tried first - 10% uniform random exploration for anti-starvation - Applied to both invariant (new_inputs) and stateless (new_input) Typed trace-cmp values: - Add CmpSample struct with width field (8/16/32/64 bits) from sancov trace-cmp callbacks instead of raw [u8; 32] - Insert trace-cmp operands into typed sample_values buckets keyed by DynSolType::Uint(N) matching the comparison width - Promote <=64-bit values into uint128/uint256/int256 buckets so limb comparisons help solve wider integer guards - Cap per-type bucket at 1024 entries Benchmark (30 runs, depth 100): StablecoinDEX: corpus-tempo-cov 2842 vs no-corpus 2837 (+5 unique, 0 missed) FeeAMM: corpus-tempo-cov 2073 vs no-corpus 2070 (+3 unique, 0 missed) Amp-Thread-ID: https://ampcode.com/threads/T-019c3339-7526-700c-84aa-922ff62a74ae Co-authored-by: Amp <amp@ampcode.com>
Benchmark: Weighted Corpus Scheduling + Typed Trace-Cmp (commit
|
| Test | Mode | Lines Hit | Unique to this mode | Edges | Corpus | Favored |
|---|---|---|---|---|---|---|
| StablecoinDEX | no-corpus | 2837/10701 | 0 | — | — | — |
| corpus-tempo-cov | 2842/10701 | +5 | 47 | 13 | 7 | |
| FeeAMM | no-corpus | 2070/10701 | 0 | — | — | — |
| corpus-tempo-cov | 2073/10701 | +3 | 82 | 16 | 10 |
corpus-tempo-cov is a strict superset — every line no-corpus finds, tempo-cov also finds, plus additional lines.
Lines unique to tempo-cov (seed 99999)
StablecoinDEX — 5 lines in stablecoin_dex/mod.rs:
- L890-892:
checked_sub(order.remaining())underflow guard +amount_in = 0fallback — orderbook fill-remainder path when a taker partially fills an order - L916-917:
insufficient_liquidityerror return + loop break — the exhausted-liquidity exit path when the orderbook runs out of matching orders
FeeAMM — 3 lines across fee manager:
tip_fee_manager/amm.rs:166—RebalanceSwapevent emission (successful rebalance transfer)tip_fee_manager/amm.rs:378—Burnevent emission pathtip_fee_manager/mod.rs:208—FeesDistributedevent emission
Seed 42424
| Test | Mode | Lines Hit | Unique to this mode | Edges | Corpus | Favored |
|---|---|---|---|---|---|---|
| StablecoinDEX | no-corpus | 2837/10701 | 0 | — | — | — |
| corpus-tempo-cov | 2837/10701 | 0 | 44 | 15 | 10 | |
| FeeAMM | no-corpus | 2073/10701 | 1 | — | — | — |
| corpus-tempo-cov | 2073/10701 | 1 | 76 | 16 | 8 |
StablecoinDEX: exact parity. FeeAMM: 1 line each unique — tempo-cov finds amm.rs:378 (Burn event), no-corpus finds mod.rs:208 (FeesDistributed event).
Observations
- No regression: corpus-tempo-cov never misses more than 1 line that no-corpus finds across both seeds.
- Consistent edge discovery: 44–47 edges for StablecoinDEX, 76–82 for FeeAMM — the coverage guidance is stable across seeds.
- Weighted scheduling active: 7–10 of 13–16 corpus entries are favored, meaning the productivity-based weights successfully differentiate productive seeds.
- Improvement over previous trace-cmp-only version (commit
2373cd574): StablecoinDEX went from parity (0 unique) to +5 unique lines. FeeAMM went from 1-unique/1-missed to 3-unique/0-missed (seed 99999).
What changed in this commit
- Weighted corpus scheduling: Replaced uniform random corpus selection with roulette-wheel sampling using smoothed productivity ratio
weight = 0.01 + (finds + 1) / (mutations + 10). Unseen seeds prioritized, 10% uniform exploration for anti-starvation. - Typed trace-cmp values:
CmpSamplestruct carries width (8/16/32/64) from sancov callbacks. Values inserted intosample_valuesbuckets byDynSolType::Uint(N)with promotion to uint128/uint256/int256, so limb comparisons help solve wider integer guards.
…trace_cmp flags Split the single boolean into two independent configuration flags for granular control: - : controls edge coverage collection and merging into the EVM hitcount map - : controls comparison operand capture and dictionary injection This enables four scenarios: 1. Both false: no precompile coverage 2. Edges only: coverage-guided fuzzing without trace-cmp overhead 3. Trace-cmp only: dictionary enrichment without edge merging 4. Both true: full coverage + trace-cmp (previous behavior) Amp-Thread-ID: https://ampcode.com/threads/T-019c383c-a333-72bc-931c-bb52f31873bf Co-authored-by: Amp <amp@ampcode.com>
Benchmark: Split Config Flags (
|
| Config | tempo_precompile_edges |
tempo_precompile_trace_cmp |
Description |
|---|---|---|---|
| None | false |
false |
Baseline — no Tempo precompile coverage feedback |
| Edges only | true |
false |
Edge coverage from Rust precompile code is merged into the EVM hitcount map, guiding the fuzzer toward inputs that exercise new branches |
| Trace-cmp only | false |
true |
Comparison operands from Rust precompile if/match arms are captured and injected into the fuzz dictionary, helping generate values that pass guards |
| Both | true |
true |
Combined: edge-guided exploration + dictionary enrichment from trace-cmp |
Metrics:
| Metric | Meaning |
|---|---|
| Precompile LOC Hit | Unique lines of Rust precompile source code executed (via LLVM instrument-coverage + sancov) |
| Cumulative Edges | Unique control-flow edges (branch transitions) seen across all runs. Higher = more branch combinations explored in precompile code |
| Cumulative Features | New hitcount buckets for previously seen edges (e.g., an edge hit 1× vs 8× vs 128× are different features). Higher = deeper exploration of loop/recursion behavior |
| Time | Wall-clock execution time for the full invariant test suite |
Parameters
- Seed: 123
- Runs: 2
- Depth: 2000
- Binary: instrumented forge with sancov (
-Cinstrument-coverage+sanitizer-coverage-trace-compares)
StablecoinDEX Results
| Config | Precompile LOC Hit | Edges | Δ Edges | Features | Δ Features | Time | Δ Time |
|---|---|---|---|---|---|---|---|
| None | 2,296 / 5,370 | 44 | — | 37 | — | 659.8s | — |
| Edges only | 2,294 / 5,370 | 65 | +48% | 61 | +65% | 643.9s | -2% |
| Trace-cmp only | 2,293 / 5,370 | 45 | +2% | 36 | -3% | 551.5s | -16% |
| Both | 2,298 / 5,370 | 70 | +59% | 54 | +46% | 371.6s | -44% |
FeeAMM Results
| Config | Precompile LOC Hit | Edges | Δ Edges | Features | Δ Features | Time | Δ Time |
|---|---|---|---|---|---|---|---|
| None | 1,533 / 5,370 | 89 | — | 18 | — | 141.0s | — |
| Edges only | 1,533 / 5,370 | 93 | +4% | 17 | -6% | 147.4s | +5% |
| Trace-cmp only | 1,533 / 5,370 | 89 | 0% | 13 | -28% | 165.7s | +18% |
| Both | 1,533 / 5,370 | 93 | +4% | 20 | +11% | 157.2s | +11% |
Conclusions
-
"Both" is the clear winner on StablecoinDEX. It ran 44% faster than baseline (371s vs 660s) while achieving the highest edge count (+59%) and the only net-positive LOC delta (+2 unique lines). The trace-cmp dictionary helps the fuzzer find valid inputs faster, reducing wasted reverts, while edge feedback steers toward new branches.
-
Synergy between edges and trace-cmp. Neither mode alone matches the combined result. Edges provide direction (which branches to explore), trace-cmp provides values (dictionary entries that pass comparison guards). Together they converge faster and deeper.
-
Edge coverage scales with run length. At 2×2000 depth, edges-only sees 65 cumulative edges (+48%) and 61 features (+65%) vs baseline. The fuzzer accumulates coverage intelligence across runs.
-
Trace-cmp alone doesn't improve edge/feature counts — it enriches the dictionary but without edge feedback the fuzzer can't prioritize inputs that explore new branches.
-
FeeAMM is LOC-saturated at 1,533 lines across all configs. The AMM's simpler control flow means handlers already cover all reachable paths. "Both" still achieves the best feature count (+11%).
-
Recommended defaults:
tempo_precompile_edges = true+tempo_precompile_trace_cmp = truefor production fuzzing. For quick smoke tests,trace_cmpalone is cheapest.
How It Works: Precompile Coverage Feedback LoopFlow
Diagramflowchart TB
subgraph FUZZER["Invariant Fuzzer Loop"]
direction TB
GEN["1. Generate inputs<br/>new_inputs() from corpus<br/>or fresh proptest tx"]
EXEC["2. Execute tx<br/>executor.transact_with_env()"]
COLLECT["5. Collect feedback"]
DECIDE["6. Corpus decision<br/>new coverage? → add sequence<br/>to in_memory_corpus"]
MUTATE["7. Next iteration<br/>mutate corpus entry<br/>or generate fresh tx"]
GEN --> EXEC
COLLECT --> DECIDE
DECIDE --> MUTATE
MUTATE --> GEN
end
subgraph GUARD["TempoCoverageGuard (per tx)"]
direction TB
GINIT["Activate guard<br/>set_coverage_map(buf)<br/>clear_cmp_operands()"]
GDROP["Drop guard<br/>clear_coverage_map()"]
end
subgraph EVM["EVM Execution"]
direction TB
SOL["Solidity handler<br/>e.g. placeOrder(), swap()"]
PRECOMP["Tempo Precompile<br/>(Rust, sancov-instrumented)"]
SOL -->|"CALL to<br/>precompile address"| PRECOMP
end
subgraph SANCOV["SanitizerCoverage Callbacks (compiler-injected)"]
direction LR
PCGUARD["__sanitizer_cov_trace_pc_guard<br/>→ record_hit(guard_id)<br/>writes to coverage_map buf"]
TRACECMP["__sanitizer_cov_trace_cmp 1,2,4,8<br/>→ record_cmp(width, arg1, arg2)<br/>pushes to CMP_OPERANDS"]
end
subgraph MERGE["3. Merge results into RawCallResult"]
direction TB
MEDGE["merge_edges_into()<br/>Tempo hits + EVM EdgeCovInspector<br/>→ result.edge_coverage"]
MCMP["drain_cmp_into()<br/>→ result.tempo_cmp_values"]
end
subgraph FEEDBACK["4. Feedback paths"]
direction TB
FEDGE["Edge coverage path<br/>merge_edge_coverage()<br/>update history_map<br/>detect new edges/features"]
FCMP["Trace-cmp path<br/>collect_typed_cmp_values()<br/>inject into fuzz dictionary<br/>typed buckets: u8..u256"]
end
EXEC --> GINIT
GINIT --> SOL
PRECOMP -->|"every branch"| PCGUARD
PRECOMP -->|"every if/match<br/>comparison"| TRACECMP
SOL --> GDROP
GDROP --> MERGE
MEDGE --> FEDGE
MCMP --> FCMP
MERGE --> COLLECT
FEDGE -->|"new_coverage=true<br/>→ save to corpus"| DECIDE
FCMP -->|"values like<br/>balance thresholds,<br/>price ticks"| GEN
Key files
|
Amp-Thread-ID: https://ampcode.com/threads/T-019c4850-fe1a-74f8-8b66-bfcf76748cd0 Co-authored-by: Amp <amp@ampcode.com>
Summary
Adds SanitizerCoverage instrumentation support so that Foundry's invariant/fuzz test fuzzer gets coverage feedback from native Rust precompile execution. Previously, the
EdgeCovInspectoronly tracked EVM-levelJUMP/JUMPIopcodes, making the fuzzer effectively black-box for all Tempo precompile logic.Problem
When a Solidity test calls a Tempo precompile (TIP20, StablecoinDEX, AccountKeychain, etc.), Foundry executes the
DynPrecompileRust closure directly — no EVM opcodes run, no interpreter loop fires. TheEdgeCovInspectorproduces zero coverage signal, so theCorpusManagercannot prioritize inputs that explore new precompile code paths. The fuzzer degenerates to random input generation for the most critical code under test.Solution
Use LLVM SanitizerCoverage (
trace-pc-guard+trace-compares) to instrumenttempo-precompilesat compile time and feed the coverage into the same 65536-byte hitcount map thatEdgeCovInspectoruses. No changes to thetemporepository are required — sancov is a compiler pass applied via RUSTFLAGS, and the callback symbols are resolved by the linker.Architecture
Changes
New crate:
foundry-tempo-coverage(crates/evm/tempo-coverage/)set_coverage_map/clear_coverage_map)__sanitizer_cov_trace_pc_guardand__sanitizer_cov_trace_pc_guard_initcallback implementationsAtomicU32__sanitizer_cov_trace_cmp{1,2,4,8}callbacks to capture comparison operandsdrain_cmp_operands()to collect captured operands into the fuzz dictionaryConfig:
tempo_precompile_coveragefieldFuzzCorpusConfig(applies to both[fuzz]and[invariant]sections)falseExecutor wiring:
TempoCoverageGuard(tempo_cov.rs)call_with_envandtransact_with_envRawCallResult.edge_coverageRawCallResult.tempo_cmp_valuesCorpusManager.merge_edge_coverage()pipeline works unchangedtrace-cmp dictionary injection
FuzzDictionaryas persistent values (surviverevert()across invariant runs)Feedback loop improvements
Build profile:
fuzzrelease(opt-level 3, codegen-units 16)debug = "line-tables-only"for coverage report source mappingBuild script:
scripts/build-fuzz.shConvenience wrapper:
Benchmark Results
30 runs / depth 100 / seed 99999 —
no-corpus(plain random) vscorpus-tempo-cov(sancov edge coverage + trace-cmp + dictionary injection):StablecoinDEX
FeeAMM
tip_fee_manager/mod.rs:208tip_fee_manager/amm.rs:166Conclusions
Usage