Isolate rustc internals and make toolchain bumps less hurt by cds-amal · Pull Request #123 · runtimeverification/stable-mir-json

cds-amal · 2026-02-21T20:25:37Z

This PR builds on #121 (the pipeline refactor) with two goals: make it structurally easy to absorb rustc API changes, and remove the manual bookkeeping around which rustc commit our tests need.

This also addresses Issue #90 and supersedes PR #91 (see "Why this supersedes #91" below).

The compat layer (src/compat/). All usage of rustc internal APIs (rustc_middle, rustc_smir, rustc_span, etc.) is now routed through a single module. When rustc changes its internals (which it does regularly), fixes concentrate in compat/ rather than scattering across the codebase. We stress-tested this against two toolchain bumps (6-month and 13-month jumps); the decision, boundary, and validation results are documented in ADR-003.

This also fixes a gap in mk_graph/, which was using extern crate stable_mir directly (bypassing the abstraction). Those imports now go through compat like everything else.

Single source of truth for the rustc commit. This PR adds a [metadata] section to rust-toolchain.toml with the rustc-commit that backs our pinned nightly:

[metadata]
rustc-commit = "a2545fd6fc66b4323f555223a860c451885d1d2b"

But that created a gap: now the commit hash lives in rust-toolchain.toml and is still hardcoded in .github/workflows/test.yml (the actions/checkout ref for rust-lang/rust). Two sources of truth, one of which will inevitably drift. The fix for this is to make the TOML the single source of truth and have everything else read from it.

(rustup silently ignores unknown keys, so the [metadata] section is safe; it's purely for our scripts and CI.)

What changes

CI (test.yml)

The ui-tests job now installs yq (via mikefarah/yq-action), reads the commit hash from rust-toolchain.toml, and passes it to actions/checkout for rust-lang/rust. No more hardcoded hash in the workflow. If the metadata key is missing or empty, the step fails loudly with a non-zero exit.

Local test scripts (run_ui_tests.sh, remake_ui_tests.sh)

Both scripts now source a shared helper (ensure_rustc_commit.sh) that:

Reads metadata.rustc-commit from rust-toolchain.toml using yq
Detects whether RUST_DIR_ROOT points to a regular clone or a bare repo
For a regular clone: runs git checkout <commit> (or skips if already there)
For a bare repo: creates a detached worktree at $RUST_DIR/<short-hash>/ (or reuses it if one already exists at the right commit)

This means you no longer have to manually check out the right commit before running make test-ui or make remake-ui-tests. Just point RUST_DIR_ROOT at your rust checkout and the scripts handle the rest.

yq (the Go version, mikefarah/yq) is now required for UI tests. If it's missing, the script tells you how to install it and exits.

A note on worktrees for the rust checkout

Rust's repo is enormous. When you inevitably have to switch between commits (toolchain bumps, bisecting, etc.), git checkout in a regular clone rewrites hundreds of thousands of files, blows away build caches, and generally makes you wait while incremental builds start over from scratch. Not great.

A bare repo with worktrees avoids all of this. Each worktree shares the same object store but keeps its own working tree and build artifacts. Switching between commits is just a cd; no files get rewritten, no caches get invalidated. If you've already built at one commit, you can hop to another worktree and come back without losing anything.

To convert an existing rust clone to bare+worktree:

cd ..
mv rust rust-tmp
git clone --bare git@github.com:rust-lang/rust.git rust
cd rust

# Create a worktree for the commit our tests need
git worktree add ./a2545fd6fc66 a2545fd6fc66b4323f555223a860c451885d1d2b --detach

Then point the test scripts at the repo root:

RUST_DIR_ROOT=~/oss/rust make test-ui

The scripts detect the bare repo and create (or reuse) the worktree automatically. When the toolchain bumps to a new nightly, a new worktree gets created at the new commit; the old one sticks around and you can prune it whenever you feel like it.

Why this supersedes PR #91

PR #91 and this PR both address Issue #90 (upgrading the rustc dependency to a recent nightly), but they take fundamentally different approaches.

The core difference: this PR creates an adapter API (src/compat/) that gates all rustc access through a single module, so that API churn is absorbed in one place. PR #91 patches every call site across the codebase directly.

When rustc changes its internal APIs, the cost of adapting depends on how many files touch those APIs. In the codebase as it existed when #91 was written, rustc internals were scattered everywhere: printer.rs, mk_graph/, driver.rs, and various helpers all had their own extern crate declarations and direct imports. Every file that directly calls a rustc API is a file that breaks when that API changes.

PR #91 updates the toolchain from nightly-2024-11-29 to nightly-2025-07-20 and fixes every call site that broke. The result works, but the changes scatter across printer.rs, mk_graph.rs, driver.rs, Cargo.toml, the jq filter, and every golden test file. The next toolchain bump would require the same scattershot patching, because nothing has changed about how the codebase interfaces with rustc.

This PR invests in the adapter layer first. To make the difference concrete, here's what the stable_mir to rustc_public crate rename looks like in each approach:

PR #91's approach: update every file that declares extern crate stable_mir or imports from it: printer.rs, all 5 mk_graph/ files, and driver.rs.

This PR's approach: change one line in compat/mod.rs:

// before
pub extern crate stable_mir;
// after
pub extern crate rustc_public as stable_mir;

Every consumer imports via use crate::compat::stable_mir, so nothing else changes.

This PR doesn't bump the toolchain yet. It lays the groundwork so that the bump, when it lands, is a contained change in compat/ and driver.rs rather than a codebase-wide patch. The spike branches (spike/toolchain-2025-06 and spike/toolchain-2026-01) demonstrate this on real toolchain jumps. PR #91 should be closed once this merges; the upgrade itself can follow as a focused PR that only touches the compat layer.

The existing integration tests normalize away alloc_ids (via the jq filter), so they cannot catch the class of bug fixed in f903f21: where the analysis phase walks a different body than what gets stored in the output, producing AllocIds that don't correspond to any collected allocation. There is no red test for that fix. AllocMap replaces the bare HashMap<AllocId, ...> type alias with a newtype that, under #[cfg(debug_assertions)], tracks every insertion and flags duplicates. After the collect/analyze phase completes, verify_coherence walks every stored Item body with an AllocIdCollector visitor and asserts that each referenced AllocId exists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time, without any cost in release builds. collect_alloc is refactored from the entry() API to contains_key() + insert() so all mutations flow through AllocMap::insert and get tracked. This commit is designed to be rebased before the fix so the assertion fires on the broken code and goes green after each subsequent commit.

Before ------ The pipeline had three separate passes over MIR bodies, each with access to the full TyCtxt (and therefore the ability to call side-effecting rustc queries like inst.body()): 1. mk_item: calls inst.body(), stores body in Item 2. collect_unevaluated_constant_items: walks stored bodies to find unevaluated consts (separate UnevaluatedConstCollector visitor) 3. collect_interned_values: walks stored bodies again to find allocs, types, spans, and calls (separate InternedValueCollector visitor) Every phase could freely call inst.body() or other allocating rustc queries, and nothing in the types prevented it. The alloc-id bug fixed in the prior commit was a direct consequence: collect_interned_values called inst.body() a second time, causing rustc to mint fresh AllocIds that didn't match the ones already embedded in the stored Body. The same pattern existed in two other places: get_item_details (debug mode) called inst.body() again, and the MonoItem::Static arm in collect_interned_values called inst.body() on statics that had no pre-collected body. collect_smir itself was a flat script that threaded mutable state between these phases, including a full HashMap clone (items_clone) just to remember which items were originally collected. After ----- Side-effecting rustc queries (inst.body(), eval_initializer()) are now confined to a single function: mk_item. Everything downstream operates on pre-collected data, and the phase boundary is enforced by the type signatures: collect_and_analyze_items(&[Item]) -> (CollectedCrate, DerivedInfo) assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson CollectedCrate holds the items and unevaluated consts (output of rustc interaction). DerivedInfo holds calls, allocs, types, and spans (output of body analysis). assemble_smir takes both by value and does pure data transformation; it structurally cannot call inst.body() because it has no MonoItem or Instance to call it on. The two body-walking visitors (InternedValueCollector and UnevaluatedConstCollector) are merged into a single BodyAnalyzer that walks each body exactly once. The fixpoint loop for transitive unevaluated constant discovery is integrated: when BodyAnalyzer finds an unevaluated const, it records it as a UnevalConstInfo; the outer loop creates the new Item (the one place inst.body() is called) and enqueues it. Static items now store their body in MonoItemKind::MonoItemStatic (collected once in mk_item), so the analysis phase never needs to go back to rustc for static bodies either. The full HashMap clone (items_clone) is replaced by a HashSet of original item names, which is all the static fixup actually needs. get_item_details now takes the pre-collected body as a parameter instead of calling inst.body() independently. Composability: the previous pipeline was a single monolithic function where each phase's output was wired directly into the next via local variables. The new structure exposes CollectedCrate and DerivedInfo as independent values that can be inspected, tested, or transformed separately. Deleted: InternedValueCollector, UnevaluatedConstCollector, collect_interned_values, collect_unevaluated_constant_items, the InternedValues type alias, and the items_clone. All 28 integration tests produce identical output.

Concentrate all rustc-internal API calls into src/compat/ so that toolchain upgrades touch one module (plus driver.rs) instead of the whole codebase. New module layout: src/compat/mod.rs - centralized extern crate + re-exports src/compat/bridge.rs - OpaqueInstanceKind, internal conversions src/compat/mono_collect.rs - mono item collection and symbol naming src/compat/spans.rs - span-to-source-location resolution src/compat/types.rs - type queries (generics, fn sigs, attrs) src/compat/output.rs - output filename resolution Key change: replace middle::ty::InstanceKind<'tcx> with an owned OpaqueInstanceKind { debug_repr, is_reify_shim }, eliminating the 'tcx lifetime parameter from SmirJson, LinkMapKey, FnSymInfo, LinkMap, DerivedInfo, and SmirJsonDebugInfo. After this, printer.rs has zero extern crate rustc_* declarations and zero direct tcx.query() calls.

Replace the hardcoded commit hash in CI and manual checkout steps with automated reads from rust-toolchain.toml's [metadata] section. Both local test scripts and CI now use yq to parse the commit, and the scripts handle bare repos (via worktrees) and regular clones. Also converts lessons-learned.md into ADR-001 (compat layer decision).

cds-amal · 2026-02-22T15:45:39Z

@jberthold , would love your feedback if you have the time.

jberthold

This is a quite substantial refactoring, and definitely worth merging.
While possibly the updates to new rustc versions won't happen too frequently, the proposed code has much cleaner structure.
I did not go through the many details of the changes and it is still in draft but this LGTM (very good).

jberthold · 2026-02-23T11:26:54Z

src/printer.rs

 fn mono_item_name(tcx: TyCtxt<'_>, item: &MonoItem) -> String {
-    if let MonoItem::GlobalAsm(data) = item {
-        hash(data).to_string()
-    } else {
-        mono_item_name_int(tcx, &rustc_internal::internal(tcx, item))
-    }
-}
-
-fn mono_item_name_int<'a>(tcx: TyCtxt<'a>, item: &rustc_middle::mir::mono::MonoItem<'a>) -> String {
-    item.symbol_name(tcx).name.into()
+    crate::compat::mono_collect::mono_item_name(tcx, item)


I read a few of these changes that replace a previous helper function with a call into a compat module, like fn blah(args..) -> .. { crate::compat::...::blah(args..) (sometimes with additional struct wrapper like the GenericData above).
Maybe these calls could be made directly through to compat::...?

This is a great catch, @jberthold . I was focused on wrapping rustc calls behind a managed API and missed these, which are very much in the spirit of limiting the blast radius. Thanks for pointing it out.

Consider this the "Spirit of Radio" pass, tightening the signal and cutting the noise. 🎸

Jost rightly pointed out that several functions in printer.rs were just forwarding to crate::compat::* with no additional logic: generic_data, has_attr, mono_item_name, def_id_to_inst, and mono_collect. Callers can (and should) go through the compat boundary directly instead. Five wrappers removed; call sites now import from compat or call inline. has_attr moves to a pub re-export from compat::types in lib.rs. Also drops a stale comment referencing a lifetime-carrying GenericData<'a> alternative that the compat refactor made irrelevant.

Expand the module-level docs in compat/mod.rs to spell out the rule: nothing outside compat/ (and driver.rs) should touch rustc internals directly. Add a submodule reference table, explain the re-exports, and document each pub use item so cargo doc --open tells a coherent story. Also add a doc comment to SourceData in spans.rs.

dkcumming · 2026-02-26T08:26:34Z

After #121 this will need an update, but I think this is ready to go when that has happened. Looks like a great improvement

cds-amal and others added 9 commits February 19, 2026 18:45

Avoid inst.body() duplicate call by carrying collected items around

1b5e4b0

Add lessons-learned from toolchain bump stress test

e10af81

Route mk_graph stable_mir imports through compat module

307dcb8

Add rustc commit hash to rust-toolchain.toml for UI tests

70e1c4f

Rename ADR

ea11d5f

cds-amal force-pushed the spike/hex-rustc branch from 8cbf015 to ea11d5f Compare February 21, 2026 20:29

jberthold reviewed Feb 23, 2026

View reviewed changes

cds-amal added 2 commits February 23, 2026 10:28

dkcumming self-assigned this Feb 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Isolate rustc internals and make toolchain bumps less hurt#123

Isolate rustc internals and make toolchain bumps less hurt#123
cds-amal wants to merge 11 commits intoruntimeverification:masterfrom
cds-rs:spike/hex-rustc

cds-amal commented Feb 21, 2026 •

edited

Loading

Uh oh!

cds-amal commented Feb 22, 2026

Uh oh!

jberthold left a comment

Uh oh!

jberthold Feb 23, 2026

Uh oh!

cds-amal Feb 23, 2026

Uh oh!

dkcumming commented Feb 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

cds-amal commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes

A note on worktrees for the rust checkout

Why this supersedes PR #91

Uh oh!

cds-amal commented Feb 22, 2026

Uh oh!

jberthold left a comment

Choose a reason for hiding this comment

Uh oh!

jberthold Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

cds-amal Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

dkcumming commented Feb 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

cds-amal commented Feb 21, 2026 •

edited

Loading