
Smaller epsilon towards determinism #125

Draft
cds-amal wants to merge 3 commits into runtimeverification:master from cds-rs:feat/determinism


Collaborator

@cds-amal cds-amal commented Feb 22, 2026

This PR replaces all index-based sort keys with content-derived ones, making the ordering of output vectors deterministic across runs. The one remaining source of cross-run non-determinism is adt_def (see below for why that can't be fixed here).

Context

For those unfamiliar: rustc stores types, allocations, and other values in central tables and refers to them by integer index. These indices are assigned in the order the compiler happens to intern items, which is not guaranteed to be the same between invocations. There are two independent reasons for this:

  1. Hash-map iteration order: Rustc uses hash maps heavily (often FxHashMap for speed rather than the standard library's RandomState-backed HashMap). Hash-map iteration order is not specified and depends on hash/table layout and insertion history. If the order of insertions varies (because upstream work completed in a different order, or because of platform or compiler version differences), then iterating the map and interning items in iteration order produces different interned indices.
  2. Parallel query evaluation: Rustc can use rayon-based parallelism during compilation. Query evaluation, monomorphization collection, and codegen unit partitioning can all run work items concurrently. The order in which parallel tasks complete depends on OS thread scheduling, which can affect insertion order into those hash maps. So if type A gets interned before type B in one run because its query finished first, their indices may swap in the next run.

Either source alone is sufficient to produce different indices across runs.
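To make the failure mode concrete, here is a minimal sketch with a toy interner that hands out indices in first-seen order. The `Interner` type and both helper functions are hypothetical stand-ins, not rustc's actual tables; the point is only that an index-based sort reproduces the insertion order, while a content-based sort does not depend on it.

```rust
use std::collections::HashMap;

/// Toy interner (hypothetical, not rustc's): indices are assigned in
/// first-seen order, so they depend on the order items arrive in.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, usize>,
}

impl Interner {
    fn intern(&mut self, name: &str) -> usize {
        let next = self.ids.len();
        *self.ids.entry(name.to_string()).or_insert(next)
    }
}

/// Intern `names` in the given order, then return them sorted by index.
/// The result simply mirrors the arrival order.
fn order_by_index(names: &[&str]) -> Vec<String> {
    let mut interner = Interner::default();
    let mut items: Vec<(usize, String)> = names
        .iter()
        .map(|n| (interner.intern(n), n.to_string()))
        .collect();
    items.sort_by_key(|(idx, _)| *idx);
    items.into_iter().map(|(_, n)| n).collect()
}

/// Sort the same names by content; arrival order no longer matters.
fn order_by_content(names: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = names.iter().map(|n| n.to_string()).collect();
    out.sort();
    out
}
```

Calling `order_by_index` with `["u8", "bool"]` and with `["bool", "u8"]` gives two different orderings, whereas `order_by_content` gives the same ordering for both inputs.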

The output vectors in collect_smir() were sorted like this:

allocs.sort_by(|a, b| a.alloc_id.to_index().cmp(&b.alloc_id.to_index()));
functions.sort_by(|a, b| a.0 .0.to_index().cmp(&b.0 .0.to_index()));
types.sort_by(|a, b| a.0.to_index().cmp(&b.0.to_index()));
spans.sort();

These .to_index() calls return those interned IDs. The original comment said "stabilise output (a bit)", which is honest about the fact that this was only partially working. And uneval_consts (coming from HashMap::into_iter()) wasn't sorted at all.

The integration test harness worked around all of this with a jq normalization filter (normalise-filter.jq) that strips unstable IDs and re-sorts by content before diffing. But the raw JSON output itself was not reproducible.

What changed

All changes are in src/printer.rs, in the sorting section of collect_smir(). Here's the new sort strategy for each vector:

| Vector | Old sort key | New sort key | Tiebreaker |
|---|---|---|---|
| functions | Ty.to_index() | Ty display string (via ty_pretty) | interned index |
| types | Ty.to_index() | Ty display string (via ty_pretty) | interned index |
| allocs | AllocId.to_index() | content-derived string (see below) | none needed |
| spans | span index (opaque) | location tuple: (filename, lo_line, lo_col, hi_line, hi_col) | none needed |
| uneval_consts | unsorted | item name string | none needed |
| items | unchanged | already uses a content-based Ord impl | N/A |
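As an illustration of the spans row, the following sketch sorts by a location tuple instead of the opaque span index. `SpanEntry`, `span`, and `sort_spans` are hypothetical names chosen for this example; the real entries in src/printer.rs may be shaped differently.

```rust
/// Hypothetical stand-in for a span entry: an opaque interned ID plus
/// the resolved source location.
#[derive(Debug, Clone, PartialEq)]
struct SpanEntry {
    id: usize, // interned index, not stable across runs
    filename: String,
    lo_line: usize,
    lo_col: usize,
    hi_line: usize,
    hi_col: usize,
}

/// Convenience constructor for the example.
fn span(
    id: usize,
    filename: &str,
    lo_line: usize,
    lo_col: usize,
    hi_line: usize,
    hi_col: usize,
) -> SpanEntry {
    SpanEntry { id, filename: filename.to_string(), lo_line, lo_col, hi_line, hi_col }
}

/// Sort by (filename, lo_line, lo_col, hi_line, hi_col). Tuples compare
/// field by field, and the usize fields compare numerically.
fn sort_spans(spans: &mut [SpanEntry]) {
    spans.sort_by(|a, b| {
        (a.filename.as_str(), a.lo_line, a.lo_col, a.hi_line, a.hi_col)
            .cmp(&(b.filename.as_str(), b.lo_line, b.lo_col, b.hi_line, b.hi_col))
    });
}
```

With this key, a span on line 9 of `a.rs` sorts before one on line 10 of the same file, whatever IDs the two happened to receive.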

Allocation sort keys deserve some detail. The new alloc_sort_key() helper produces a content-derived string from each AllocInfo by matching on the GlobalAlloc variant:

| Variant | Sort key format | Example |
|---|---|---|
| Memory | 0_Memory_ + zero-padded byte length | "0_Memory_00000000000000000032" |
| Static | 1_Static_ + def name | "1_Static_MY_STATIC" |
| VTable | 2_VTable_ + type string | "2_VTable_dyn Trait" |
| Function | 3_Function_ + instance name | "3_Function_foo::bar" |

The numeric prefix groups entries by variant kind. Byte length is zero-padded to 20 digits so that 32 sorts before 128 lexicographically. (Span locations are usize values that compare numerically, so no padding is needed there.)
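A minimal sketch of such a key function follows. The `AllocKind` enum is a simplified, hypothetical mirror of the four variants in the table (the real helper matches on GlobalAlloc inside an AllocInfo), so treat the field names as assumptions.

```rust
/// Hypothetical, simplified mirror of the four allocation kinds.
enum AllocKind {
    Memory { byte_len: usize },
    Static { def_name: String },
    VTable { ty: String },
    Function { instance_name: String },
}

/// Build a content-derived sort key. The numeric prefix groups entries
/// by variant; the byte length is zero-padded to 20 digits so that
/// string comparison agrees with numeric comparison.
fn alloc_sort_key(kind: &AllocKind) -> String {
    match kind {
        AllocKind::Memory { byte_len } => format!("0_Memory_{:020}", byte_len),
        AllocKind::Static { def_name } => format!("1_Static_{}", def_name),
        AllocKind::VTable { ty } => format!("2_VTable_{}", ty),
        AllocKind::Function { instance_name } => format!("3_Function_{}", instance_name),
    }
}
```

Because the key is an owned String, the function can be passed straight to `sort_by_key`, as in `allocs.sort_by_key(alloc_sort_key)`. Without the padding, the plain strings "32" and "128" would compare the wrong way round, since '3' sorts after '1'.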

Three golden test files were regenerated because their normalized output changed due to the new ordering. The jq normalization filter doesn't fully sort TupleType and FunType entries, so those were sensitive to input order.

Remaining non-determinism: adt_def

We looked into stabilizing the adt_def field on EnumType/StructType/UnionType as well; it's another interned index that the jq filter strips for test normalization. It turns out that it can't be dropped or replaced.

The reason: downstream consumers need adt_def as a cross-reference key to match AggregateKind::Adt(adt_def, ...) in MIR bodies with the corresponding type metadata entry. AggregateKind serialization comes from stable_mir (we don't control it), so both sides of the join have to use the same key format. The index is consistent within a single JSON file; it's just not stable across runs.

So adt_def remains the one known source of cross-run non-determinism in the output, and the jq filter still needs to strip it for golden test comparison. A comment was added on the field explaining this constraint. If we ever get the ability to customize AggregateKind serialization upstream, we could replace these indices with names, but that's a stable_mir change, not something we can do on our end. (See PR #64 for the full discussion.)
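The cross-reference constraint can be sketched as a lookup keyed on adt_def. `TypeMeta` and `Aggregate` below are hypothetical, heavily simplified shapes; the real JSON entries carry more fields. The sketch only shows that the join works as long as both sides carry the same index, which is why the type-metadata side cannot switch to a name on its own.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified type-metadata entry.
struct TypeMeta {
    adt_def: usize, // interned index, consistent within one output file
    name: String,
}

/// Hypothetical, simplified view of an AggregateKind::Adt occurrence in a MIR body.
struct Aggregate {
    adt_def: usize,
}

/// Index the type metadata by adt_def, as a downstream consumer might.
fn index_by_adt_def(types: &[TypeMeta]) -> HashMap<usize, &TypeMeta> {
    types.iter().map(|t| (t.adt_def, t)).collect()
}

/// Resolve an aggregate to the name of its type, if the key matches.
fn resolve<'a>(index: &HashMap<usize, &'a TypeMeta>, agg: &Aggregate) -> Option<&'a str> {
    index.get(&agg.adt_def).copied().map(|t| t.name.as_str())
}
```

If the metadata side were rewritten to use a different key format while the aggregate side kept the index, `resolve` would return `None` for every lookup.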

On the Ty display string tiebreaker

Ty's display impl goes through ty_pretty(), which calls to_string() on rustc's internal Ty. For monomorphized types (all generics resolved, no lifetime parameters), this should be injective: distinct types produce distinct strings. But there's a theoretical concern: could two distinct Ty values display identically? Perhaps some obscure lifetime or where-clause difference that gets elided in the display output.

We weren't confident enough to rule this out entirely, so the interned index is used as a tiebreaker: content-based ordering for the common case, with the index preserving a consistent (within-run) ordering for any hypothetical ties. If a tie does occur, the ordering at the tie point would be non-deterministic across runs, but we haven't observed this in practice.
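The comparison described here can be sketched with `Ordering::then_with`. `TyEntry` is a hypothetical stand-in holding the display string and the interned index; the actual code in collect_smir() may extract these differently.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a type entry.
struct TyEntry {
    index: usize,    // interned index, only meaningful within one run
    display: String, // content-derived display string
}

/// Compare by display string first; fall back to the interned index
/// only when two display strings are identical.
fn cmp_ty(a: &TyEntry, b: &TyEntry) -> Ordering {
    a.display
        .cmp(&b.display)
        .then_with(|| a.index.cmp(&b.index))
}

fn sort_types(types: &mut [TyEntry]) {
    types.sort_by(cmp_ty);
}
```

Entries with distinct display strings are ordered purely by content, so the index only influences the result in the hypothetical tie case.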

What this means in practice

If you run stable-mir-json twice on the same input file and diff the raw JSON (no jq filter), the only differences will be the interned index values themselves (adt_def, alloc_id, Ty keys, etc.). The structural ordering of every output vector is now identical across runs. Before this change, even the order of types, functions, allocs, and spans could shuffle around.

The jq normalization filter is still needed for golden tests (to strip those interned IDs), but the sort operations in it are now redundant; they just confirm what the source already guarantees.

Performance

The format!("{}", ty) calls for sorting functions and types allocate strings. This is fine; the vectors are small (one entry per unique type or function in the program), and this code runs once at the end after all collection is done.
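Should the repeated formatting ever show up in a profile, one option (not something this PR does) is the standard library's `sort_by_cached_key`, which evaluates the key function at most once per element. The sketch below uses a hypothetical `Ty` with a trivial Display impl and counts key evaluations to illustrate that.

```rust
use std::fmt;

/// Hypothetical stand-in for a type with an interned index and a display form.
struct Ty {
    index: usize,
    name: &'static str,
}

impl fmt::Display for Ty {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.name)
    }
}

/// Sort by (display string, index), formatting each element once.
/// Returns how many times the key closure ran.
fn sort_types_cached(types: &mut [Ty]) -> usize {
    let mut key_calls = 0;
    types.sort_by_cached_key(|ty| {
        key_calls += 1;
        (format!("{}", ty), ty.index)
    });
    key_calls
}
```

The trade-off is an extra temporary allocation for the cached keys, which is unlikely to matter for vectors of this size.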

Test plan

  • cargo build compiles cleanly
  • make integration-test passes (all 28 tests)
  • Ran stable-mir-json twice on assert_eq.rs and diffed the raw JSON: identical modulo interned index values (adt_def only); all vector ordering matched exactly

…tput

Output vectors (functions, types, allocs, spans) were sorted by rustc's
interned indices (Ty.to_index(), AllocId.to_index()), which change between
compiler runs. Replace these with content-based sort keys: type display
strings for functions/types, a variant+content key for allocs, and
location tuples for spans. Also sort uneval_consts (previously unsorted).

For functions and types, the interned index is kept as a tiebreaker in
case two distinct types produce the same display string.

Regenerate golden files for the 3 tests affected by the new ordering.

The Makefile ran `cargo clippy` without flags, so clippy warnings
that CI treats as errors (via `-Dwarnings`) passed locally but
failed in CI.

Replace sort_by with sort_by_key(alloc_sort_key), which clippy
further simplifies by passing the function directly without a
closure wrapper.
@cds-amal cds-amal mentioned this pull request Feb 28, 2026
4 tasks