feat(edge_cov): collision-free dense edge IDs by gakonst · Pull Request #404 · paradigmxyz/revm-inspectors

gakonst · 2026-02-12T16:44:45Z

Summary

Replace the hash-modulo scheme (hash(addr,pc,dest) % 65536) with a HashMap-based dense ID assignment that eliminates edge collisions in coverage-guided fuzzing.

Motivation

The current EdgeCovInspector hashes each (address, pc, jump_dest) tuple and truncates to a 65536-entry buffer. With large contracts or instrumented native code (e.g. sancov-instrumented precompiles), distinct edges frequently alias to the same hitcount slot, corrupting the coverage signal. The fuzzer can't distinguish "new edge A" from "more hits on existing edge B", degrading guidance quality.

Changes

EdgeCovInspector now holds a HashMap<(Address, usize, U256), usize> mapping each unique edge to a dense monotonic ID
New edges are assigned IDs on the cold path (first encounter); known edges hit the Occupied fast path — effectively the same cost as the previous hash, since both require hashing the same key
Buffer pre-allocated to 65536 entries (configurable via with_capacity()); edges beyond capacity are silently dropped
get_hitcount() returns only the used portion [0..edge_count()] instead of the full buffer
reset() clears counters but preserves ID assignments for reuse across iterations
Hitcount uses saturating_add instead of checked_add().unwrap_or()

New public API (backward-compatible additions):

with_capacity(n) — size the buffer for workloads with more edges
edge_count() — number of unique edges discovered
into_hitcount_with_size() — returns (buffer, used_size) so consumers only process meaningful entries
hitcount_mut() — mutable access for external coverage writers (e.g. sancov)
into_hitcount() — preserved for backward compatibility

Perf

Hot path (known edges): HashMap::get is ~same cost as SipHash + modulo since both hash the same 3 fields
Cold path (new edges): HashMap::insert — ~50ns, happens once per unique edge, amortized over millions of iterations
Merge step (downstream): iterating [0..used] instead of full 65536 is a 2-10x speedup for typical edge counts
Memory: HashMap overhead is ~500KB-2MB for 10-50K edges — negligible for fuzzing workloads

Testing

8 new unit tests covering: collision-free IDs, same-edge increment, saturation at 255, capacity exhaustion, reset preserving IDs, into_hitcount_with_size, cross-address edge distinction, debug format
Existing integration test (test_edge_coverage) passes unchanged

Replace the hash-modulo scheme (hash(addr,pc,dest) % 65536) with a HashMap that assigns each unique (address, pc, jump_dest) edge a monotonically-increasing dense ID into a pre-allocated hitcount buffer. This eliminates coverage map collisions where two unrelated edges share the same counter, corrupting the feedback signal for coverage-guided fuzzers. Key changes: - EdgeCovInspector now holds a HashMap<(Address, usize, U256), usize> for edge-to-ID mapping and a next_id counter - New edges get a dense ID on first encounter (cold path); known edges hit the HashMap Occupied path (hot, O(1) amortized) - Buffer pre-allocated to 65536 (configurable via with_capacity()); edges beyond capacity are silently dropped - get_hitcount() returns only the used portion [0..edge_count()] - New API: with_capacity(), edge_count(), into_hitcount_with_size(), hitcount_mut() for downstream integration - into_hitcount() preserved for backward compatibility - reset() clears counters but preserves ID assignments across iterations - Hitcount uses saturating_add (no overflow past 255) Amp-Thread-ID: https://ampcode.com/threads/T-019c528b-16c7-7700-8700-02529160df29 Co-authored-by: Amp <amp@ampcode.com>

gakonst · 2026-02-12T16:45:03Z

cc @grandizzy for review

Amp-Thread-ID: https://ampcode.com/threads/T-019c528b-16c7-7700-8700-02529160df29 Co-authored-by: Amp <amp@ampcode.com>

0xalpharush · 2026-02-12T17:35:55Z

src/edge_cov.rs

+///
+/// The hitcount buffer is fixed at construction time; if more unique edges
+/// are discovered than the buffer can hold the extras are silently
+/// dropped. Use [`EdgeCovInspector::with_capacity`] to size the buffer


Instead of dropping, the map should grow

pls see 7b35651

0xalpharush · 2026-02-12T17:38:19Z

src/edge_cov.rs

+            }
+        };
+        if let Some(slot) = self.hitcount.get_mut(id) {
+            *slot = slot.saturating_add(1);


This was purposeful self.hitcount[edge_id].checked_add(1).unwrap_or(1);

From https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.llvm.md#8-neverzero-counters

NeverZero prevents this behavior. If a counter wraps, it jumps over the value 0 directly to a 1. This improves path discovery (by a very small amount) at a very low cost (one instruction per edge).
(The alternative of saturated counters has been tested also and proved to be inferior in terms of path discovery.)

pls see 7b35651

0xalpharush · 2026-02-12T17:40:59Z

src/edge_cov.rs

-        // so it must be modulo the maximum edge count.
-        let edge_id = (hasher.finish() % MAX_EDGE_COUNT as u64) as usize;
-        self.hitcount[edge_id] = self.hitcount[edge_id].checked_add(1).unwrap_or(1);
+        let id = match self.edge_ids.entry((address, pc, jump_dest)) {


Unrelated but while we are here, it may be nice to incorporate the call depth (xref crytic/echidna#624). This will distinguish a top-level call from a nested call.

Ofc sometimes more precision in distinguishing executions can sometimes blow up the corpus. Given Echidna does it, it's probably worth doing (I wasn't aware of this when I implemented it fwiw).

if OK would follow up this with a different PR as I'd like to go more through foundry integration and implications / how to efficiently apply the depth

- Hitcount buffer now doubles when capacity is exceeded instead of silently dropping new edges. - Restore AFL++ NeverZero semantics: wrapping_add(1).max(1) so a counter that wraps past 255 lands on 1, not 0. This preserves the 'edge was hit' signal. Saturated counters were shown to be inferior for path discovery (see AFL++ docs). Amp-Thread-ID: https://ampcode.com/threads/T-019c528b-16c7-7700-8700-02529160df29 Co-authored-by: Amp <amp@ampcode.com>

gakonst and others added 2 commits February 12, 2026 16:54

fix(edge_cov): remove doc link to private constant

65a1086

Amp-Thread-ID: https://ampcode.com/threads/T-019c528b-16c7-7700-8700-02529160df29 Co-authored-by: Amp <amp@ampcode.com>

fix(edge_cov): restore #[cold] on do_step

2445c1e

Amp-Thread-ID: https://ampcode.com/threads/T-019c528b-16c7-7700-8700-02529160df29 Co-authored-by: Amp <amp@ampcode.com>

0xalpharush reviewed Feb 12, 2026

View reviewed changes

gakonst mentioned this pull request Feb 12, 2026

feat: collision-free sancov coverage for Tempo precompiles tempoxyz/tempo-foundry#263

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

feat(edge_cov): collision-free dense edge IDs#404

feat(edge_cov): collision-free dense edge IDs#404
gakonst wants to merge 4 commits intomainfrom
alpharush/collision-free-edge-cov

gakonst commented Feb 12, 2026 •

edited

Loading

Uh oh!

gakonst commented Feb 12, 2026

Uh oh!

0xalpharush Feb 12, 2026

Uh oh!

grandizzy Feb 12, 2026

Uh oh!

0xalpharush Feb 12, 2026

Uh oh!

grandizzy Feb 12, 2026

Uh oh!

0xalpharush Feb 12, 2026 •

edited

Loading

Uh oh!

grandizzy Feb 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

gakonst commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Changes

Perf

Testing

Uh oh!

gakonst commented Feb 12, 2026

Uh oh!

0xalpharush Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

grandizzy Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

0xalpharush Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

grandizzy Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

0xalpharush Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

grandizzy Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gakonst commented Feb 12, 2026 •

edited

Loading

0xalpharush Feb 12, 2026 •

edited

Loading