Isolate rustc internals and make toolchain bumps less hurt#123
Isolate rustc internals and make toolchain bumps less hurt#123cds-amal wants to merge 11 commits intoruntimeverification:masterfrom
Conversation
The existing integration tests normalize away alloc_ids (via the jq filter), so they cannot catch the class of bug fixed in f903f21: where the analysis phase walks a different body than what gets stored in the output, producing AllocIds that don't correspond to any collected allocation. There is no red test for that fix. AllocMap replaces the bare HashMap<AllocId, ...> type alias with a newtype that, under #[cfg(debug_assertions)], tracks every insertion and flags duplicates. After the collect/analyze phase completes, verify_coherence walks every stored Item body with an AllocIdCollector visitor and asserts that each referenced AllocId exists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time, without any cost in release builds. collect_alloc is refactored from the entry() API to contains_key() + insert() so all mutations flow through AllocMap::insert and get tracked. This commit is designed to be rebased before the fix so the assertion fires on the broken code and goes green after each subsequent commit.
Before
------
The pipeline had three separate passes over MIR bodies, each with
access to the full TyCtxt (and therefore the ability to call
side-effecting rustc queries like inst.body()):
1. mk_item: calls inst.body(), stores body in Item
2. collect_unevaluated_constant_items: walks stored bodies to find
unevaluated consts (separate UnevaluatedConstCollector visitor)
3. collect_interned_values: walks stored bodies again to find allocs,
types, spans, and calls (separate InternedValueCollector visitor)
Every phase could freely call inst.body() or other allocating rustc
queries, and nothing in the types prevented it. The alloc-id bug fixed
in the prior commit was a direct consequence: collect_interned_values
called inst.body() a second time, causing rustc to mint fresh AllocIds
that didn't match the ones already embedded in the stored Body. The
same pattern existed in two other places: get_item_details (debug mode)
called inst.body() again, and the MonoItem::Static arm in
collect_interned_values called inst.body() on statics that had no
pre-collected body.
collect_smir itself was a flat script that threaded mutable state
between these phases, including a full HashMap clone (items_clone) just
to remember which items were originally collected.
After
-----
Side-effecting rustc queries (inst.body(), eval_initializer()) are now
confined to a single function: mk_item. Everything downstream operates
on pre-collected data, and the phase boundary is enforced by the type
signatures:
collect_and_analyze_items(&[Item]) -> (CollectedCrate, DerivedInfo)
assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson
CollectedCrate holds the items and unevaluated consts (output of rustc
interaction). DerivedInfo holds calls, allocs, types, and spans (output
of body analysis). assemble_smir takes both by value and does pure data
transformation; it structurally cannot call inst.body() because it has
no MonoItem or Instance to call it on.
The two body-walking visitors (InternedValueCollector and
UnevaluatedConstCollector) are merged into a single BodyAnalyzer that
walks each body exactly once. The fixpoint loop for transitive
unevaluated constant discovery is integrated: when BodyAnalyzer finds
an unevaluated const, it records it as a UnevalConstInfo; the outer
loop creates the new Item (the one place inst.body() is called) and
enqueues it.
Static items now store their body in MonoItemKind::MonoItemStatic
(collected once in mk_item), so the analysis phase never needs to go
back to rustc for static bodies either.
The full HashMap clone (items_clone) is replaced by a HashSet of
original item names, which is all the static fixup actually needs.
get_item_details now takes the pre-collected body as a parameter
instead of calling inst.body() independently.
Composability: the previous pipeline was a single monolithic function
where each phase's output was wired directly into the next via local
variables. The new structure exposes CollectedCrate and DerivedInfo as
independent values that can be inspected, tested, or transformed
separately.
Deleted: InternedValueCollector, UnevaluatedConstCollector,
collect_interned_values, collect_unevaluated_constant_items, the
InternedValues type alias, and the items_clone.
All 28 integration tests produce identical output.
Concentrate all rustc-internal API calls into src/compat/ so that
toolchain upgrades touch one module (plus driver.rs) instead of
the whole codebase.
New module layout:
src/compat/mod.rs - centralized extern crate + re-exports
src/compat/bridge.rs - OpaqueInstanceKind, internal conversions
src/compat/mono_collect.rs - mono item collection and symbol naming
src/compat/spans.rs - span-to-source-location resolution
src/compat/types.rs - type queries (generics, fn sigs, attrs)
src/compat/output.rs - output filename resolution
Key change: replace middle::ty::InstanceKind<'tcx> with an owned
OpaqueInstanceKind { debug_repr, is_reify_shim }, eliminating the
'tcx lifetime parameter from SmirJson, LinkMapKey, FnSymInfo,
LinkMap, DerivedInfo, and SmirJsonDebugInfo.
After this, printer.rs has zero extern crate rustc_* declarations
and zero direct tcx.query() calls.
Replace the hardcoded commit hash in CI and manual checkout steps with automated reads from rust-toolchain.toml's [metadata] section. Both local test scripts and CI now use yq to parse the commit, and the scripts handle bare repos (via worktrees) and regular clones. Also converts lessons-learned.md into ADR-001 (compat layer decision).
8cbf015 to
ea11d5f
Compare
|
@jberthold , would love your feedback if you have the time. |
jberthold
left a comment
There was a problem hiding this comment.
This is a quite substantial refactoring, and definitely worth merging.
While possibly the updates to new rustc versions won't happen too frequently, the proposed code has much cleaner structure.
I did not go through the many details of the changes and it is still in draft but this LGTM (very good).
src/printer.rs
Outdated
| fn mono_item_name(tcx: TyCtxt<'_>, item: &MonoItem) -> String { | ||
| if let MonoItem::GlobalAsm(data) = item { | ||
| hash(data).to_string() | ||
| } else { | ||
| mono_item_name_int(tcx, &rustc_internal::internal(tcx, item)) | ||
| } | ||
| } | ||
|
|
||
| fn mono_item_name_int<'a>(tcx: TyCtxt<'a>, item: &rustc_middle::mir::mono::MonoItem<'a>) -> String { | ||
| item.symbol_name(tcx).name.into() | ||
| crate::compat::mono_collect::mono_item_name(tcx, item) |
There was a problem hiding this comment.
I read a few of these changes that replace a previous helper function with a call into a compat module, like fn blah(args..) -> .. { crate::compat::...::blah(args..) (sometimes with additional struct wrapper like the GenericData above).
Maybe these calls could be made directly through to compat::...?
There was a problem hiding this comment.
This is a great catch, @jberthold . I was focused on wrapping rustc calls behind a managed API and missed these, which are very much in the spirit of limiting the blast radius. Thanks for pointing it out.
Consider this the "Spirit of Radio" pass, tightening the signal and cutting the noise. 🎸
Jost rightly pointed out that several functions in printer.rs were just forwarding to crate::compat::* with no additional logic: generic_data, has_attr, mono_item_name, def_id_to_inst, and mono_collect. Callers can (and should) go through the compat boundary directly instead. Five wrappers removed; call sites now import from compat or call inline. has_attr moves to a pub re-export from compat::types in lib.rs. Also drops a stale comment referencing a lifetime-carrying GenericData<'a> alternative that the compat refactor made irrelevant.
Expand the module-level docs in compat/mod.rs to spell out the rule: nothing outside compat/ (and driver.rs) should touch rustc internals directly. Add a submodule reference table, explain the re-exports, and document each pub use item so cargo doc --open tells a coherent story. Also add a doc comment to SourceData in spans.rs.
|
After #121 this will need an update, but I think this is ready to go when that has happened. Looks like a great improvement |
This PR builds on #121 (the pipeline refactor) with two goals: make it structurally easy to absorb rustc API changes, and remove the manual bookkeeping around which rustc commit our tests need.
This also addresses Issue #90 and supersedes PR #91 (see "Why this supersedes #91" below).
The compat layer (
src/compat/). All usage of rustc internal APIs (rustc_middle,rustc_smir,rustc_span, etc.) is now routed through a single module. When rustc changes its internals (which it does regularly), fixes concentrate incompat/rather than scattering across the codebase. We stress-tested this against two toolchain bumps (6-month and 13-month jumps); the decision, boundary, and validation results are documented inADR-003.This also fixes a gap in
mk_graph/, which was usingextern crate stable_mirdirectly (bypassing the abstraction). Those imports now go throughcompatlike everything else.Single source of truth for the rustc commit. This PR adds a
[metadata]section torust-toolchain.tomlwith therustc-committhat backs our pinned nightly:But that created a gap: now the commit hash lives in
rust-toolchain.tomland is still hardcoded in.github/workflows/test.yml(theactions/checkoutref forrust-lang/rust). Two sources of truth, one of which will inevitably drift. The fix for this is to make the TOML the single source of truth and have everything else read from it.(rustup silently ignores unknown keys, so the
[metadata]section is safe; it's purely for our scripts and CI.)What changes
CI (
test.yml)The
ui-testsjob now installsyq(viamikefarah/yq-action), reads the commit hash fromrust-toolchain.toml, and passes it toactions/checkoutforrust-lang/rust. No more hardcoded hash in the workflow. If the metadata key is missing or empty, the step fails loudly with a non-zero exit.Local test scripts (
run_ui_tests.sh,remake_ui_tests.sh)Both scripts now source a shared helper (
ensure_rustc_commit.sh) that:metadata.rustc-commitfromrust-toolchain.tomlusingyqRUST_DIR_ROOTpoints to a regular clone or a bare repogit checkout <commit>(or skips if already there)$RUST_DIR/<short-hash>/(or reuses it if one already exists at the right commit)This means you no longer have to manually check out the right commit before running
make test-uiormake remake-ui-tests. Just pointRUST_DIR_ROOTat your rust checkout and the scripts handle the rest.yq(the Go version, mikefarah/yq) is now required for UI tests. If it's missing, the script tells you how to install it and exits.A note on worktrees for the rust checkout
Rust's repo is enormous. When you inevitably have to switch between commits (toolchain bumps, bisecting, etc.),
git checkoutin a regular clone rewrites hundreds of thousands of files, blows away build caches, and generally makes you wait while incremental builds start over from scratch. Not great.A bare repo with worktrees avoids all of this. Each worktree shares the same object store but keeps its own working tree and build artifacts. Switching between commits is just a
cd; no files get rewritten, no caches get invalidated. If you've already built at one commit, you can hop to another worktree and come back without losing anything.To convert an existing rust clone to bare+worktree:
Then point the test scripts at the repo root:
RUST_DIR_ROOT=~/oss/rust make test-uiThe scripts detect the bare repo and create (or reuse) the worktree automatically. When the toolchain bumps to a new nightly, a new worktree gets created at the new commit; the old one sticks around and you can prune it whenever you feel like it.
Why this supersedes PR #91
PR #91 and this PR both address Issue #90 (upgrading the rustc dependency to a recent nightly), but they take fundamentally different approaches.
The core difference: this PR creates an adapter API (
src/compat/) that gates all rustc access through a single module, so that API churn is absorbed in one place. PR #91 patches every call site across the codebase directly.When rustc changes its internal APIs, the cost of adapting depends on how many files touch those APIs. In the codebase as it existed when #91 was written, rustc internals were scattered everywhere:
printer.rs,mk_graph/,driver.rs, and various helpers all had their ownextern cratedeclarations and direct imports. Every file that directly calls a rustc API is a file that breaks when that API changes.PR #91 updates the toolchain from
nightly-2024-11-29tonightly-2025-07-20and fixes every call site that broke. The result works, but the changes scatter acrossprinter.rs,mk_graph.rs,driver.rs,Cargo.toml, the jq filter, and every golden test file. The next toolchain bump would require the same scattershot patching, because nothing has changed about how the codebase interfaces with rustc.This PR invests in the adapter layer first. To make the difference concrete, here's what the
stable_mirtorustc_publiccrate rename looks like in each approach:PR #91's approach: update every file that declares
extern crate stable_miror imports from it:printer.rs, all 5mk_graph/files, anddriver.rs.This PR's approach: change one line in
compat/mod.rs:Every consumer imports via
use crate::compat::stable_mir, so nothing else changes.This PR doesn't bump the toolchain yet. It lays the groundwork so that the bump, when it lands, is a contained change in
compat/anddriver.rsrather than a codebase-wide patch. The spike branches (spike/toolchain-2025-06andspike/toolchain-2026-01) demonstrate this on real toolchain jumps. PR #91 should be closed once this merges; the upgrade itself can follow as a focused PR that only touches the compat layer.