Add post-mono MIR optimizations #131650

saethlin · 2024-10-13T12:45:35Z

Before this PR, all MIR passes had to operate on polymorphic MIR. Thus any MIR transform may be unable to determine the type of an argument or local (because it's still generic), or it may be unable to determine which function a Call terminator is calling (because it's still generic).
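To make the limitation concrete, here is a small standalone Rust sketch. The trait, types, and function names are made up for illustration and do not come from the PR. Inside the generic body, neither the callee of `shape.area()` nor the value of `size_of::<T>()` is known; both become concrete only once `T` is substituted, which is the information a post-mono pass would have.

```rust
use std::mem::size_of;

/// Hypothetical trait, used only for this illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// In the polymorphic body, `shape.area()` is a call to `<T as Shape>::area`
/// and `size_of::<T>()` has no concrete value, so a pass that only sees this
/// generic body cannot pick the callee or fold the comparison.
fn describe<T: Shape>(shape: &T) -> (f64, bool) {
    let is_small = size_of::<T>() <= 8;
    (shape.area(), is_small)
}

fn main() {
    // Each call instantiates `describe` with a concrete `T`. Only in those
    // instances are the callee and the size comparison fully determined.
    println!("{:?}", describe(&Square(2.0)));
    println!("{:?}", describe(&Rect(2.0, 3.0)));
}
```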

MIR transforms are a highly maintainable solution to a number of compiler problems, but this polymorphic limitation means that they cannot solve some of the problems we'd like them to. The most recent examples that come to mind are #134082, which has extra limitations because of the polymorphic inliner, and #139088, which is explicitly waiting for post-mono MIR passes to happen.

In addition, the lack of post-mono MIR optimizations means that MIR optimizations miss out on profitable optimizations, which are valuable enough that we've added kludges like #121421 (a MIR traversal that you had better only run at mono-time).

Finally, rustc_codegen_ssa is riddled with on-the-fly monomorphization and optimization. In my experience, the logic for these codegen-time tricks is hard to maintain, and I would much rather have it implemented in a MIR transform.

So this PR adds a new query codegen_mir (the MIR for codegen, not that I like the name). I've then replaced some of the kludges in rustc_codegen_ssa with PostMono variants of existing MIR transforms.

I've also un-querified check_mono_item and put it at the end of the post-mono pass list. Those checks should be post-mono passes too, but I've tried to keep this PR to a reviewable size. It's easy to imagine lots of other places to use post-mono MIR opts and I want the usefulness of this to be clear while the diff is also manageable.

This PR has a perf regression. I've hammered on the perf in a number of ways to get it down to what it is. incr-full builds suffer the most because they need to clone, intern, and cache a monomorphized copy of every MIR body. Things are mixed for every other build scenario. In almost all cases, binary sizes improve.

saethlin · 2024-10-13T20:06:42Z

@bors try @rust-timer queue

bors · 2024-10-13T20:07:52Z

⌛ Trying commit a211812 with merge b141564...

Add post-mono MIR passes to make mono-reachable analysis more accurate r? ghost

bors · 2024-10-13T21:57:30Z

☀️ Try build successful - checks-actions
Build commit: b141564 (b1415647cdfcdd1b8dc5ed5f9a5aba87ade0b225)

rust-timer · 2024-10-13T23:17:40Z

Finished benchmarking commit (b141564): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	12.2%	[0.2%, 93.7%]	163
Regressions ❌ (secondary)	6.9%	[0.2%, 266.3%]	119
Improvements ✅ (primary)	-0.7%	[-3.0%, -0.2%]	6
Improvements ✅ (secondary)	-11.1%	[-33.8%, -0.2%]	12
All ❌✅ (primary)	11.7%	[-3.0%, 93.7%]	169

Max RSS (memory usage)

Results (primary 14.5%, secondary 1.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	14.5%	[0.7%, 56.9%]	108
Regressions ❌ (secondary)	4.5%	[0.6%, 12.8%]	34
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-22.2%	[-24.2%, -19.2%]	4
All ❌✅ (primary)	14.5%	[0.7%, 56.9%]	108

Cycles

Results (primary 22.8%, secondary 13.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	23.0%	[0.8%, 108.5%]	111
Regressions ❌ (secondary)	19.4%	[1.0%, 223.4%]	42
Improvements ✅ (primary)	-3.0%	[-3.0%, -3.0%]	1
Improvements ✅ (secondary)	-33.2%	[-42.8%, -1.3%]	5
All ❌✅ (primary)	22.8%	[-3.0%, 108.5%]	112

Binary size

Results (primary -0.3%, secondary -2.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.8%	[0.0%, 2.3%]	7
Regressions ❌ (secondary)	0.1%	[0.1%, 0.1%]	1
Improvements ✅ (primary)	-0.4%	[-1.7%, -0.0%]	76
Improvements ✅ (secondary)	-2.4%	[-25.8%, -0.0%]	65
All ❌✅ (primary)	-0.3%	[-1.7%, 2.3%]	83

Bootstrap: 781.427s -> 807.023s (3.28%)
Artifact size: 331.96 MiB -> 332.21 MiB (0.08%)

saethlin · 2024-10-14T06:14:00Z

@bors try @rust-timer queue

bors · 2024-10-14T06:15:11Z

⌛ Trying commit 6f6737a with merge 9233d9f...

Add post-mono MIR passes to make mono-reachable analysis more accurate r? ghost

bors · 2024-10-14T08:05:01Z

☀️ Try build successful - checks-actions
Build commit: 9233d9f (9233d9f83ca672be3b2cfa697806fdb7c8970490)

rust-timer · 2024-10-14T09:23:57Z

Finished benchmarking commit (9233d9f): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	7.6%	[0.1%, 59.9%]	151
Regressions ❌ (secondary)	2.9%	[0.2%, 18.7%]	107
Improvements ✅ (primary)	-3.0%	[-3.0%, -3.0%]	1
Improvements ✅ (secondary)	-6.6%	[-64.0%, -0.3%]	11
All ❌✅ (primary)	7.5%	[-3.0%, 59.9%]	152

Max RSS (memory usage)

Results (primary 11.3%, secondary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	12.9%	[1.3%, 52.1%]	93
Regressions ❌ (secondary)	3.6%	[2.2%, 5.9%]	10
Improvements ✅ (primary)	-2.7%	[-4.3%, -0.8%]	10
Improvements ✅ (secondary)	-3.4%	[-3.5%, -3.4%]	2
All ❌✅ (primary)	11.3%	[-4.3%, 52.1%]	103

Cycles

Results (primary 10.6%, secondary 3.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	10.7%	[1.0%, 50.1%]	94
Regressions ❌ (secondary)	5.4%	[1.7%, 18.4%]	37
Improvements ✅ (primary)	-3.1%	[-3.1%, -3.1%]	1
Improvements ✅ (secondary)	-17.2%	[-62.3%, -1.6%]	4
All ❌✅ (primary)	10.6%	[-3.1%, 50.1%]	95

Binary size

Results (primary -0.1%, secondary -0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.0%, 2.4%]	9
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-0.8%, -0.0%]	69
Improvements ✅ (secondary)	-0.3%	[-0.8%, -0.0%]	51
All ❌✅ (primary)	-0.1%	[-0.8%, 2.4%]	78

Bootstrap: 782.104s -> 806.252s (3.09%)
Artifact size: 332.57 MiB -> 332.81 MiB (0.07%)

saethlin · 2024-10-24T02:45:21Z

@bors try @rust-timer queue

Add post-mono MIR passes to make mono-reachable analysis more accurate

As of rust-lang#131650 (comment) I believe most of the incr overhead comes from re-computing, re-encoding, and loading a lot more MIR when all we're actually doing is traversing through it. I think that can be addressed by caching a query that looks up the mentioned/used items for an Instance. I think the full-build regressions are pretty much just the expense of cloning, then monomorphizing, then caching the MIR.

bors · 2024-10-24T02:46:32Z

⌛ Trying commit 4ae3542 with merge 174810c...

rust-timer · 2025-06-03T00:43:15Z

Finished benchmarking commit (5893003): comparison URL.

Overall result: ❌ regressions - please read the text below

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.2%, 4.9%]	53
Regressions ❌ (secondary)	1.2%	[0.1%, 18.4%]	30
Improvements ✅ (primary)	-0.8%	[-1.6%, -0.2%]	3
Improvements ✅ (secondary)	-1.1%	[-1.5%, -0.8%]	3
All ❌✅ (primary)	0.7%	[-1.6%, 4.9%]	56

Max RSS (memory usage)

Results (primary 7.0%, secondary 1.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	7.0%	[1.1%, 31.1%]	36
Regressions ❌ (secondary)	1.7%	[0.4%, 4.2%]	16
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-1.1%	[-2.6%, -0.4%]	5
All ❌✅ (primary)	7.0%	[1.1%, 31.1%]	36

Cycles

Results (primary 1.8%, secondary 3.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.8%	[0.9%, 5.0%]	22
Regressions ❌ (secondary)	3.6%	[0.5%, 17.0%]	15
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.5%, -0.4%]	2
All ❌✅ (primary)	1.8%	[0.9%, 5.0%]	22

Binary size

Results (primary 0.0%, secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 1.5%]	71
Regressions ❌ (secondary)	0.1%	[0.0%, 0.4%]	46
Improvements ✅ (primary)	-0.2%	[-0.6%, -0.0%]	55
Improvements ✅ (secondary)	-0.1%	[-0.5%, -0.0%]	59
All ❌✅ (primary)	0.0%	[-0.6%, 1.5%]	126

Bootstrap: 743.192s -> 754.228s (1.48%)
Artifact size: 372.27 MiB -> 371.85 MiB (-0.11%)

saethlin · 2025-06-03T02:20:20Z

Ugh, that did basically nothing for memory usage. Conceptually, we want to cache the codegen_mir for functions which are instantiated multiple times, but using the query system we can only do that based on the instantiation mode, so we cache all LocalCopy functions. The reason is that if we figure out how many times a function was assigned to a CGU inside codegen, the query system thinks every CGU depends on every other CGU, ruining incremental.
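The gap between the two caching policies can be sketched with a toy model. Everything below is a hypothetical stand-in (the `Mode` enum, the `Func` struct, and the `cgu_uses` count are not rustc's real types or data); it only shows why a policy keyed on instantiation mode caches more bodies than one keyed on the actual number of uses.

```rust
// Toy stand-ins for illustration only; these are not rustc's real types.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Mode {
    GloballyShared,
    LocalCopy,
}

struct Func {
    name: &'static str,
    mode: Mode,
    /// How many CGUs end up containing a copy of this function.
    /// In this sketch we assume it is only known after partitioning.
    cgu_uses: usize,
}

/// The policy we would like: cache only bodies that are used more than once.
fn ideal_policy(f: &Func) -> bool {
    f.cgu_uses > 1
}

/// The policy that depends only on per-function information: cache every
/// function whose mode means it *may* be copied into several CGUs.
fn mode_policy(f: &Func) -> bool {
    f.mode == Mode::LocalCopy
}

fn cached_names(funcs: &[Func], policy: fn(&Func) -> bool) -> Vec<&'static str> {
    funcs.iter().filter(|f| policy(*f)).map(|f| f.name).collect()
}

fn example_funcs() -> Vec<Func> {
    vec![
        Func { name: "shared_fn", mode: Mode::GloballyShared, cgu_uses: 1 },
        Func { name: "inline_used_once", mode: Mode::LocalCopy, cgu_uses: 1 },
        Func { name: "inline_used_thrice", mode: Mode::LocalCopy, cgu_uses: 3 },
    ]
}

fn main() {
    let funcs = example_funcs();
    println!("ideal:      {:?}", cached_names(&funcs, ideal_policy));
    println!("mode-based: {:?}", cached_names(&funcs, mode_policy));
}
```

In this model the mode-based policy also caches `inline_used_once`, which is never reused, so its cached copy is pure memory overhead.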

tmiasko · 2025-06-03T07:04:55Z

Please also review the interaction with parameter attributes deduction (i.e., deduced_param_attrs). The current implementation is unsound, since those attributes no longer describe MIR that is used for code generation (attributes might be invalidated by extra transforms).

zachs18 · 2025-07-09T07:44:25Z

compiler/rustc_monomorphize/src/collector.rs

Should the let body = tcx.instance_mir(instance.def); on line 1237 of collect_items_of_instance be let body = tcx.codegen_mir(instance);? It seems like without that change, post-mono MIR passes cannot introduce new things to be codegenned.

Specifically, I rebased on this PR to add a post-mono MIR pass that lowers a new intrinsic to a different call depending on the instantiation (which is why it can't be done pre-mono in the existing LowerIntrinsics pass). It mostly works, but when I add new Call terminators with different callees, those callees don't get codegenned, leading to linker errors (unless I add something like #[used] static A: fn() = the_callee; elsewhere in the source). Changing line 1237 to use codegen_mir instead of instance_mir fixes the linker errors. I'm not sure whether this is intended to be a supported thing to do with post-mono MIR passes, though. (I changed to using a new InstanceKind MIR shim (similar to CloneShim) instead of a new intrinsic, which does not have this issue or require post-mono MIR passes.)

This change does seem to add some duplicate errors in tests/ui/consts/mono-reachable-invalid-const.stderr, tests/ui/inline-const/const-expr-generic-err.stderr, and tests/ui/structs/default-field-values/post-mono.indirect.stderr, though.
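The missing-callee problem described above can be reproduced in a toy model. This is only a sketch under simplifying assumptions: a "body" is just a list of callee names, and `collect` and `lower_placeholder` are hypothetical helpers, not the real collector or a real MIR pass. It shows that a collector reading the pre-pass bodies never sees a callee that a later pass introduces.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Toy model: a "body" is just the list of functions it calls.
type Bodies = BTreeMap<&'static str, Vec<&'static str>>;

/// Walk the call graph from `root`, reading bodies from `bodies`.
/// Stands in for collecting the items that need codegen.
fn collect(bodies: &Bodies, root: &'static str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(item) = stack.pop() {
        if seen.insert(item) {
            if let Some(callees) = bodies.get(item) {
                stack.extend(callees.iter().copied());
            }
        }
    }
    seen
}

/// Hypothetical post-mono pass: rewrite calls to a placeholder into calls
/// to a concrete function, producing the bodies that codegen would see.
fn lower_placeholder(bodies: &Bodies) -> Bodies {
    bodies
        .iter()
        .map(|(name, callees)| {
            let lowered: Vec<&'static str> = callees
                .iter()
                .map(|c| if *c == "placeholder_intrinsic" { "concrete_impl" } else { *c })
                .collect();
            (*name, lowered)
        })
        .collect()
}

fn example_bodies() -> Bodies {
    let mut bodies = Bodies::new();
    bodies.insert("main", vec!["generic_fn"]);
    bodies.insert("generic_fn", vec!["placeholder_intrinsic"]);
    bodies
}

fn main() {
    let pre_pass = example_bodies();
    let post_pass = lower_placeholder(&pre_pass);

    // Collecting from the pre-pass bodies misses `concrete_impl`, even
    // though the post-pass bodies call it.
    println!("from pre-pass bodies:  {:?}", collect(&pre_pass, "main"));
    println!("from post-pass bodies: {:?}", collect(&post_pass, "main"));
}
```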

> Should the let body = tcx.instance_mir(instance.def); on line 1237 of collect_items_of_instance be let body = tcx.codegen_mir(instance);?

No. The code as written is intentional because it lets us do the "instance MIR is codegen MIR" trick, which I was hoping would reduce peak memory, but it didn't.

> This change does seem to add some duplicate errors

This is not a concern. UI tests have diagnostic deduplication turned off, and many of them have duplicate diagnostics.

WaffleLapkin · 2025-08-03T12:53:46Z

@saethlin What is the status of this? Will it be implemented in the near future?

For the tail call work I'm doing, it would be very convenient to have a post-mono MIR pass.

saethlin · 2025-08-03T13:20:50Z

My current understanding is that the compile time memory overhead is unfixable, so I think there is a decision point here about accepting the max-RSS hit.

Otherwise, this works. The PR just needs a bit of cleanup.

saethlin · 2025-08-05T00:19:29Z

The CI failure is due to a pre-existing bug in GVN. The bug fires often in the ecosystem, but the MIR lint for it is disabled by default and only enabled in the mir-opt test suite.

Rollup merge of #145030 - cjgillot:gvn-no-flatten-index, r=saethlin GVN: Do not flatten derefs with ProjectionElem::Index. r? `@saethlin` This should fix the bug you found with #131650

rust-log-analyzer · 2025-08-11T23:50:44Z

The job aarch64-gnu-llvm-19-1 failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)

[RUSTC-TIMING] build_script_build test:false 0.136
[RUSTC-TIMING] cc test:false 0.765
   Compiling compiler_builtins v0.1.160 (/checkout/library/compiler-builtins/compiler-builtins)
[RUSTC-TIMING] build_script_build test:false 0.335
error: internal compiler error: compiler/rustc_mir_transform/src/shim.rs:220:13: creating shims from intrinsics (Intrinsic(DefId(0:2034 ~ core[469a]::intrinsics::is_val_statically_known))) is unsupported


thread 'rustc' panicked at compiler/rustc_mir_transform/src/shim.rs:220:13:
Box<dyn Any>
stack backtrace:
   0: std::panicking::begin_panic::<rustc_errors::ExplicitBug>
   1: <rustc_errors::diagnostic::BugAbort as rustc_errors::diagnostic::EmissionGuarantee>::emit_producing_guarantee
   2: rustc_middle::util::bug::opt_span_bug_fmt::<rustc_span::span_encoding::Span>::{closure#0}
   3: rustc_middle::ty::context::tls::with_opt::<rustc_middle::util::bug::opt_span_bug_fmt<rustc_span::span_encoding::Span>::{closure#0}, !>::{closure#0}
   4: rustc_middle::ty::context::tls::with_context_opt::<rustc_middle::ty::context::tls::with_opt<rustc_middle::util::bug::opt_span_bug_fmt<rustc_span::span_encoding::Span>::{closure#0}, !>::{closure#0}, !>
   5: rustc_middle::util::bug::bug_fmt
   6: rustc_mir_transform::shim::make_shim
      [... omitted 3 frames ...]
   7: <rustc_middle::ty::context::TyCtxt>::instance_mir
   8: rustc_mir_transform::build_codegen_mir
      [... omitted 3 frames ...]
   9: rustc_mir_transform::deduce_param_attrs::deduced_param_attrs
      [... omitted 3 frames ...]
  10: rustc_ty_utils::abi::fn_abi_new_uncached
  11: rustc_ty_utils::abi::fn_abi_of_instance
      [... omitted 3 frames ...]
  12: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine> as rustc_middle::ty::layout::FnAbiOf>::fn_abi_of_instance
  13: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::eval_callee_and_args
  14: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::eval_terminator
  15: rustc_const_eval::const_eval::eval_queries::eval_to_allocation_raw_provider
      [... omitted 3 frames ...]
  16: rustc_const_eval::const_eval::eval_queries::eval_to_const_value_raw_provider
      [... omitted 3 frames ...]
  17: <rustc_middle::ty::context::TyCtxt>::par_hir_body_owners::<rustc_hir_analysis::check_crate::{closure#2}>::{closure#0}
  18: rustc_data_structures::sync::parallel::par_for_each_in::<&rustc_span::def_id::LocalDefId, &[rustc_span::def_id::LocalDefId], <rustc_middle::ty::context::TyCtxt>::par_hir_body_owners<rustc_hir_analysis::check_crate::{closure#2}>::{closure#0}>
  19: rustc_hir_analysis::check_crate
  20: rustc_interface::passes::analysis
      [... omitted 3 frames ...]
  21: <std::thread::local::LocalKey<core::cell::Cell<*const ()>>>::with::<rustc_middle::ty::context::tls::enter_context<<rustc_middle::ty::context::GlobalCtxt>::enter<rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2}::{closure#0}, core::option::Option<rustc_interface::queries::Linker>>::{closure#1}, core::option::Option<rustc_interface::queries::Linker>>::{closure#0}, core::option::Option<rustc_interface::queries::Linker>>
  22: <rustc_middle::ty::context::TyCtxt>::create_global_ctxt::<core::option::Option<rustc_interface::queries::Linker>, rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2}::{closure#0}>
  23: <rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2} as core::ops::function::FnOnce<(&rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2})>>::call_once::{shim:vtable#0}
  24: <alloc::boxed::Box<dyn for<'a> core::ops::function::FnOnce<(&'a rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &'a std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt<'a>>, &'a rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena<'a>>, &'a rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena<'a>>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}), Output = core::option::Option<rustc_interface::queries::Linker>>> as core::ops::function::FnOnce<(&rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2})>>::call_once
  25: rustc_interface::passes::create_and_enter_global_ctxt::<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>
  26: <scoped_tls::ScopedKey<rustc_span::SessionGlobals>>::set::<rustc_interface::util::run_in_thread_with_globals<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<(), rustc_driver_impl::run_compiler::{closure#0}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}::{closure#0}, ()>
  27: rustc_span::create_session_globals_then::<(), rustc_interface::util::run_in_thread_with_globals<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<(), rustc_driver_impl::run_compiler::{closure#0}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}::{closure#0}>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

---
warning: the ICE couldn't be written to `/checkout/rustc-ice-2025-08-11T23_50_21-12402.txt`: Read-only file system (os error 30)

note: rustc 1.91.0-nightly (4588d9229 2025-08-11) running on aarch64-unknown-linux-gnu

note: compiler flags: --crate-type lib -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debug-assertions=on -C symbol-mangling-version=legacy -Z randomize-layout -Z unstable-options -Z macro-backtrace -C split-debuginfo=off -C prefer-dynamic -C llvm-args=-import-instr-limit=10 -Z inline-mir -Z inline-mir-preserve-debug -Z mir_strip_debuginfo=locals-in-tiny-functions -C link-args=-Wl,-z,origin -C link-args=-Wl,-rpath,$ORIGIN/../lib -C embed-bitcode=yes -Z unstable-options -C force-frame-pointers=non-leaf -Z crate-attr=doc(html_root_url="https://doc.rust-lang.org/nightly/") -Z binary-dep-depinfo -Z force-unstable-if-unmarked

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [mir_shims] generating MIR shim for `intrinsics::is_val_statically_known`, instance=Intrinsic(DefId(0:2034 ~ core[469a]::intrinsics::is_val_statically_known))
#1 [build_codegen_mir] finalizing codegen MIR for `intrinsics::is_val_statically_known::<u32>`
#2 [deduced_param_attrs] deducing parameter attributes for intrinsics::is_val_statically_known::<u32> - intrinsic
#3 [fn_abi_of_instance] computing call ABI of `intrinsics::is_val_statically_known::<u32> - intrinsic`
#4 [eval_to_allocation_raw] const-evaluating + checking `num::int_sqrt::U8_ISQRT_WITH_REMAINDER`
#5 [eval_to_const_value_raw] simplifying constant for the type system `num::int_sqrt::U8_ISQRT_WITH_REMAINDER`
#6 [analysis] running analysis passes on this crate
end of query stack
error: internal compiler error: compiler/rustc_mir_transform/src/shim.rs:220:13: creating shims from intrinsics (Intrinsic(DefId(0:1956 ~ core[469a]::intrinsics::ctpop))) is unsupported


thread 'rustc' panicked at compiler/rustc_mir_transform/src/shim.rs:220:13:
Box<dyn Any>
stack backtrace:
   0: std::panicking::begin_panic::<rustc_errors::ExplicitBug>
   1: <rustc_errors::diagnostic::BugAbort as rustc_errors::diagnostic::EmissionGuarantee>::emit_producing_guarantee
   2: rustc_middle::util::bug::opt_span_bug_fmt::<rustc_span::span_encoding::Span>::{closure#0}
   3: rustc_middle::ty::context::tls::with_opt::<rustc_middle::util::bug::opt_span_bug_fmt<rustc_span::span_encoding::Span>::{closure#0}, !>::{closure#0}
   4: rustc_middle::ty::context::tls::with_context_opt::<rustc_middle::ty::context::tls::with_opt<rustc_middle::util::bug::opt_span_bug_fmt<rustc_span::span_encoding::Span>::{closure#0}, !>::{closure#0}, !>
   5: rustc_middle::util::bug::bug_fmt
   6: rustc_mir_transform::shim::make_shim
      [... omitted 3 frames ...]
   7: <rustc_middle::ty::context::TyCtxt>::instance_mir
   8: rustc_mir_transform::build_codegen_mir
      [... omitted 3 frames ...]
   9: rustc_mir_transform::deduce_param_attrs::deduced_param_attrs
      [... omitted 3 frames ...]
  10: rustc_ty_utils::abi::fn_abi_new_uncached
  11: rustc_ty_utils::abi::fn_abi_of_instance
      [... omitted 3 frames ...]
  12: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine> as rustc_middle::ty::layout::FnAbiOf>::fn_abi_of_instance
  13: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::eval_callee_and_args
  14: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::eval_terminator
  15: rustc_const_eval::const_eval::eval_queries::eval_to_allocation_raw_provider
      [... omitted 3 frames ...]
  16: rustc_const_eval::const_eval::eval_queries::eval_to_const_value_raw_provider
      [... omitted 3 frames ...]
  17: <rustc_middle::ty::context::TyCtxt>::const_eval_global_id
  18: <rustc_middle::ty::context::TyCtxt>::const_eval_resolve
  19: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::push_stack_frame_raw
  20: rustc_const_eval::const_eval::eval_queries::eval_to_allocation_raw_provider
      [... omitted 3 frames ...]
  21: rustc_const_eval::const_eval::eval_queries::eval_to_const_value_raw_provider
      [... omitted 3 frames ...]
  22: <rustc_middle::ty::context::TyCtxt>::par_hir_body_owners::<rustc_hir_analysis::check_crate::{closure#2}>::{closure#0}
  23: rustc_data_structures::sync::parallel::par_for_each_in::<&rustc_span::def_id::LocalDefId, &[rustc_span::def_id::LocalDefId], <rustc_middle::ty::context::TyCtxt>::par_hir_body_owners<rustc_hir_analysis::check_crate::{closure#2}>::{closure#0}>
  24: rustc_hir_analysis::check_crate
  25: rustc_interface::passes::analysis
      [... omitted 3 frames ...]
  26: <std::thread::local::LocalKey<core::cell::Cell<*const ()>>>::with::<rustc_middle::ty::context::tls::enter_context<<rustc_middle::ty::context::GlobalCtxt>::enter<rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2}::{closure#0}, core::option::Option<rustc_interface::queries::Linker>>::{closure#1}, core::option::Option<rustc_interface::queries::Linker>>::{closure#0}, core::option::Option<rustc_interface::queries::Linker>>
  27: <rustc_middle::ty::context::TyCtxt>::create_global_ctxt::<core::option::Option<rustc_interface::queries::Linker>, rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2}::{closure#0}>
  28: <rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2} as core::ops::function::FnOnce<(&rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2})>>::call_once::{shim:vtable#0}
  29: <alloc::boxed::Box<dyn for<'a> core::ops::function::FnOnce<(&'a rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &'a std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt<'a>>, &'a rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena<'a>>, &'a rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena<'a>>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}), Output = core::option::Option<rustc_interface::queries::Linker>>> as core::ops::function::FnOnce<(&rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2})>>::call_once
  30: rustc_interface::passes::create_and_enter_global_ctxt::<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>
  31: <scoped_tls::ScopedKey<rustc_span::SessionGlobals>>::set::<rustc_interface::util::run_in_thread_with_globals<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<(), rustc_driver_impl::run_compiler::{closure#0}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}::{closure#0}, ()>
  32: rustc_span::create_session_globals_then::<(), rustc_interface::util::run_in_thread_with_globals<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<(), rustc_driver_impl::run_compiler::{closure#0}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}::{closure#0}>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

---
warning: the ICE couldn't be written to `/checkout/rustc-ice-2025-08-11T23_50_21-12402.txt`: Read-only file system (os error 30)

note: rustc 1.91.0-nightly (4588d9229 2025-08-11) running on aarch64-unknown-linux-gnu

note: compiler flags: --crate-type lib -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debug-assertions=on -C symbol-mangling-version=legacy -Z randomize-layout -Z unstable-options -Z macro-backtrace -C split-debuginfo=off -C prefer-dynamic -C llvm-args=-import-instr-limit=10 -Z inline-mir -Z inline-mir-preserve-debug -Z mir_strip_debuginfo=locals-in-tiny-functions -C link-args=-Wl,-z,origin -C link-args=-Wl,-rpath,$ORIGIN/../lib -C embed-bitcode=yes -Z unstable-options -C force-frame-pointers=non-leaf -Z crate-attr=doc(html_root_url="https://doc.rust-lang.org/nightly/") -Z binary-dep-depinfo -Z force-unstable-if-unmarked

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [mir_shims] generating MIR shim for `intrinsics::ctpop`, instance=Intrinsic(DefId(0:1956 ~ core[469a]::intrinsics::ctpop))
#1 [build_codegen_mir] finalizing codegen MIR for `intrinsics::ctpop::<u16>`
#2 [deduced_param_attrs] deducing parameter attributes for intrinsics::ctpop::<u16> - intrinsic
#3 [fn_abi_of_instance] computing call ABI of `intrinsics::ctpop::<u16> - intrinsic`  |  = note: this failure-note originates in the macro `uint_impl` (in Nightly builds, run with -Z macro-backtrace for more info)

#4 [eval_to_allocation_raw] const-evaluating + checking `num::<impl at library/core/src/num/mod.rs:1080:1: 1080:9>::BITS`
#5 [eval_to_const_value_raw] simplifying constant for the type system `num::<impl at library/core/src/num/mod.rs:1080:1: 1080:9>::BITS`  |  = note: this failure-note originates in the macro `last_stage` (in Nightly builds, run with -Z macro-backtrace for more info)

#6 [eval_to_allocation_raw] const-evaluating + checking `num::int_sqrt::u16_stages::HALF_BITS`
#7 [eval_to_const_value_raw] simplifying constant for the type system `num::int_sqrt::u16_stages::HALF_BITS`
#8 [analysis] running analysis passes on this crate
end of query stack
error: internal compiler error: compiler/rustc_mir_transform/src/shim.rs:220:13: creating shims from intrinsics (Intrinsic(DefId(0:1958 ~ core[469a]::intrinsics::ctlz))) is unsupported


thread 'rustc' panicked at compiler/rustc_mir_transform/src/shim.rs:220:13:
Box<dyn Any>
stack backtrace:
   0: std::panicking::begin_panic::<rustc_errors::ExplicitBug>
   1: <rustc_errors::diagnostic::BugAbort as rustc_errors::diagnostic::EmissionGuarantee>::emit_producing_guarantee
   2: rustc_middle::util::bug::opt_span_bug_fmt::<rustc_span::span_encoding::Span>::{closure#0}
   3: rustc_middle::ty::context::tls::with_opt::<rustc_middle::util::bug::opt_span_bug_fmt<rustc_span::span_encoding::Span>::{closure#0}, !>::{closure#0}
   4: rustc_middle::ty::context::tls::with_context_opt::<rustc_middle::ty::context::tls::with_opt<rustc_middle::util::bug::opt_span_bug_fmt<rustc_span::span_encoding::Span>::{closure#0}, !>::{closure#0}, !>
   5: rustc_middle::util::bug::bug_fmt
   6: rustc_mir_transform::shim::make_shim
      [... omitted 3 frames ...]
   7: <rustc_middle::ty::context::TyCtxt>::instance_mir
   8: rustc_mir_transform::build_codegen_mir
      [... omitted 3 frames ...]
   9: rustc_mir_transform::deduce_param_attrs::deduced_param_attrs
      [... omitted 3 frames ...]
  10: rustc_ty_utils::abi::fn_abi_new_uncached
  11: rustc_ty_utils::abi::fn_abi_of_instance
      [... omitted 3 frames ...]
  12: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine> as rustc_middle::ty::layout::FnAbiOf>::fn_abi_of_instance
  13: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::eval_callee_and_args
  14: <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeMachine>>::eval_terminator
  15: rustc_const_eval::const_eval::eval_queries::eval_to_allocation_raw_provider
      [... omitted 3 frames ...]
  16: rustc_const_eval::const_eval::eval_queries::eval_to_const_value_raw_provider
      [... omitted 3 frames ...]
  17: <rustc_middle::ty::context::TyCtxt>::par_hir_body_owners::<rustc_hir_analysis::check_crate::{closure#2}>::{closure#0}
  18: rustc_data_structures::sync::parallel::par_for_each_in::<&rustc_span::def_id::LocalDefId, &[rustc_span::def_id::LocalDefId], <rustc_middle::ty::context::TyCtxt>::par_hir_body_owners<rustc_hir_analysis::check_crate::{closure#2}>::{closure#0}>
  19: rustc_hir_analysis::check_crate
  20: rustc_interface::passes::analysis
      [... omitted 3 frames ...]
  21: <std::thread::local::LocalKey<core::cell::Cell<*const ()>>>::with::<rustc_middle::ty::context::tls::enter_context<<rustc_middle::ty::context::GlobalCtxt>::enter<rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2}::{closure#0}, core::option::Option<rustc_interface::queries::Linker>>::{closure#1}, core::option::Option<rustc_interface::queries::Linker>>::{closure#0}, core::option::Option<rustc_interface::queries::Linker>>
  22: <rustc_middle::ty::context::TyCtxt>::create_global_ctxt::<core::option::Option<rustc_interface::queries::Linker>, rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2}::{closure#0}>
  23: <rustc_interface::passes::create_and_enter_global_ctxt<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>::{closure#2} as core::ops::function::FnOnce<(&rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2})>>::call_once::{shim:vtable#0}
  24: <alloc::boxed::Box<dyn for<'a> core::ops::function::FnOnce<(&'a rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &'a std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt<'a>>, &'a rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena<'a>>, &'a rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena<'a>>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}), Output = core::option::Option<rustc_interface::queries::Linker>>> as core::ops::function::FnOnce<(&rustc_session::session::Session, rustc_middle::ty::context::CurrentGcx, alloc::sync::Arc<rustc_data_structures::jobserver::Proxy>, &std::sync::once_lock::OnceLock<rustc_middle::ty::context::GlobalCtxt>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_middle::arena::Arena>, &rustc_data_structures::sync::worker_local::WorkerLocal<rustc_hir::Arena>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2})>>::call_once
  25: rustc_interface::passes::create_and_enter_global_ctxt::<core::option::Option<rustc_interface::queries::Linker>, rustc_driver_impl::run_compiler::{closure#0}::{closure#2}>
  26: <scoped_tls::ScopedKey<rustc_span::SessionGlobals>>::set::<rustc_interface::util::run_in_thread_with_globals<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<(), rustc_driver_impl::run_compiler::{closure#0}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}::{closure#0}, ()>
  27: rustc_span::create_session_globals_then::<(), rustc_interface::util::run_in_thread_with_globals<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<(), rustc_driver_impl::run_compiler::{closure#0}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}::{closure#0}>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

---
warning: the ICE couldn't be written to `/checkout/rustc-ice-2025-08-11T23_50_21-12402.txt`: Read-only file system (os error 30)

note: rustc 1.91.0-nightly (4588d9229 2025-08-11) running on aarch64-unknown-linux-gnu

note: compiler flags: --crate-type lib -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debug-assertions=on -C symbol-mangling-version=legacy -Z randomize-layout -Z unstable-options -Z macro-backtrace -C split-debuginfo=off -C prefer-dynamic -C llvm-args=-import-instr-limit=10 -Z inline-mir -Z inline-mir-preserve-debug -Z mir_strip_debuginfo=locals-in-tiny-functions -C link-args=-Wl,-z,origin -C link-args=-Wl,-rpath,$ORIGIN/../lib -C embed-bitcode=yes -Z unstable-options -C force-frame-pointers=non-leaf -Z crate-attr=doc(html_root_url="https://doc.rust-lang.org/nightly/") -Z binary-dep-depinfo -Z force-unstable-if-unmarked

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [mir_shims] generating MIR shim for `intrinsics::ctlz`, instance=Intrinsic(DefId(0:1958 ~ core[469a]::intrinsics::ctlz))
#1 [build_codegen_mir] finalizing codegen MIR for `intrinsics::ctlz::<u16>`
#2 [deduced_param_attrs] deducing parameter attributes for intrinsics::ctlz::<u16> - intrinsic
#3 [fn_abi_of_instance] computing call ABI of `intrinsics::ctlz::<u16> - intrinsic`  |  = note: this failure-note originates in the macro `int_impl` (in Nightly builds, run with -Z macro-backtrace for more info)

#4 [eval_to_allocation_raw] const-evaluating + checking `num::<impl at library/core/src/num/mod.rs:270:1: 270:9>::checked_isqrt::MAX_RESULT`
#5 [eval_to_const_value_raw] simplifying constant for the type system `num::<impl at library/core/src/num/mod.rs:270:1: 270:9>::checked_isqrt::MAX_RESULT`
#6 [analysis] running analysis passes on this crate
end of query stack
[RUSTC-TIMING] core test:false 13.333
error: could not compile `core` (lib)

Caused by:
  process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc /checkout/obj/build/bootstrap/debug/rustc --crate-name core --edition=2024 library/core/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 --warn=unexpected_cfgs --check-cfg 'cfg(no_fp_fmt_parse)' --check-cfg 'cfg(feature, values(any()))' --check-cfg 'cfg(target_has_reliable_f16)' --check-cfg 'cfg(target_has_reliable_f16_math)' --check-cfg 'cfg(target_has_reliable_f128)' --check-cfg 'cfg(target_has_reliable_f128_math)' -C debug-assertions=on --check-cfg 'cfg(docsrs,test)' --check-cfg 'cfg(feature, values("debug_refcell", "optimize_for_size", "panic_immediate_abort"))' -C metadata=b25c20fe3ed30f4a -C extra-filename=-b49d49c95d9ada77 --out-dir /checkout/obj/build/aarch64-unknown-linux-gnu/stage1-std/aarch64-unknown-linux-gnu/release/deps --target aarch64-unknown-linux-gnu -L dependency=/checkout/obj/build/aarch64-unknown-linux-gnu/stage1-std/aarch64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/aarch64-unknown-linux-gnu/stage1-std/release/deps -Csymbol-mangling-version=legacy -Zrandomize-layout '--check-cfg=cfg(feature,values(any()))' -Zunstable-options -Zmacro-backtrace -Csplit-debuginfo=off -Cprefer-dynamic -Cllvm-args=-import-instr-limit=10 --cfg=randomized_layouts -Zinline-mir -Zinline-mir-preserve-debug -Zmir_strip_debuginfo=locals-in-tiny-functions -Clink-args=-Wl,-z,origin '-Clink-args=-Wl,-rpath,$ORIGIN/../lib' -Alinker-messages -Cembed-bitcode=yes -Zunstable-options -Cforce-frame-pointers=non-leaf '-Zcrate-attr=doc(html_root_url="https://doc.rust-lang.org/nightly/")' -Z binary-dep-depinfo` (exit status: 101)
Build completed unsuccessfully in 0:07:43
  local time: Mon Aug 11 23:50:34 UTC 2025
  network time: Mon, 11 Aug 2025 23:50:34 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

tmiasko · 2025-08-12T05:35:32Z

compiler/rustc_ty_utils/src/abi.rs

        let deduced_param_attrs =
            if tcx.sess.opts.optimize != OptLevel::No && tcx.sess.opts.incremental.is_none() {
-                fn_def_id.map(|fn_def_id| tcx.deduced_param_attrs(fn_def_id)).unwrap_or_default()
+                instance.map(|instance| tcx.deduced_param_attrs(instance)).unwrap_or_default()


To follow up on my earlier comment in #131650: the MIR used for deduction is potentially inconsistent with the one actually used for code generation, since this could be a cross-crate call where the crates are built with different optimization levels.
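To make the concern concrete, here is a toy model of the hazard, not rustc's actual data structures or deduction logic. The `Body`, `Stmt`, `body_for`, and `deduce_read_only` names are hypothetical, and the assumption that an unoptimized body keeps a store that the optimized body drops is purely illustrative. It only shows why an attribute deduced from one body can be wrong for a different body of the same function.

```rust
// Toy model only: these types are hypothetical and do not mirror rustc internals.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OptLevel {
    No,
    Aggressive,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Stmt {
    ReadArg(usize),
    WriteArg(usize),
}

struct Body {
    stmts: Vec<Stmt>,
}

/// Assumption for illustration: the same function yields different bodies
/// depending on the optimization level of the crate that built it.
fn body_for(opt: OptLevel) -> Body {
    match opt {
        // The unoptimized body still contains a store to argument 0.
        OptLevel::No => Body { stmts: vec![Stmt::ReadArg(0), Stmt::WriteArg(0)] },
        // The optimized body had that store removed.
        OptLevel::Aggressive => Body { stmts: vec![Stmt::ReadArg(0)] },
    }
}

/// An argument is treated as read-only if no statement in the body writes to it.
fn deduce_read_only(body: &Body, arg: usize) -> bool {
    !body.stmts.iter().any(|s| *s == Stmt::WriteArg(arg))
}

/// The deduction is only safe if a "read-only" claim also holds for the body
/// that is actually lowered to machine code.
fn is_consistent(deduced_from: OptLevel, codegen_with: OptLevel) -> bool {
    let claimed = deduce_read_only(&body_for(deduced_from), 0);
    let actual = deduce_read_only(&body_for(codegen_with), 0);
    !claimed || actual
}

fn main() {
    println!("same body: {}", is_consistent(OptLevel::Aggressive, OptLevel::Aggressive));
    println!("mismatched bodies: {}", is_consistent(OptLevel::Aggressive, OptLevel::No));
}
```

In this model the mismatch only matters in one direction: claiming read-only from the optimized body while generating code from the body that still writes.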

Fix tail calls to `#[track_caller]` functions

We want `#[track_caller]` to be semver independent, i.e. it should not be a breaking change to add or remove it. Since it changes the ABI of a function (adding an additional argument), we have to be careful to preserve this property when adding tail calls. The only way to achieve this that I can see is:

- we forbid tail calls in functions which are marked with `#[track_caller]` (already implemented)
- tail-calling a `#[track_caller]` marked function downgrades the tail-call to a normal call (or equivalently tail-calls the shim made by fn def to fn ptr cast) (this PR)

Ideally the downgrade would be performed by a MIR pass, but that requires post-mono MIR opts (cc @saethlin, rust-lang#131650). For now I've changed code in cg_ssa to accommodate this behaviour (+ added a hack to mono collector so that the shim is actually generated).

Additionally I added a lint, although I don't think it's strictly necessary.

Alternative to rust-lang#144762 (and thus closes rust-lang#144762)

Fixes rust-lang#144755
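The shim created by a fn def to fn ptr cast can be observed on stable Rust without tail calls. The sketch below is an illustration rather than code from that PR, and the function names are made up. It compares a direct call to a `#[track_caller]` function with a call through a function pointer. As far as the Rust Reference describes it, the pointer goes through a shim that hides the implicit caller-location argument, so the callee sees its own definition site instead of the call site.

```rust
use std::panic::Location;

/// Returns the line that `Location::caller()` observes.
#[track_caller]
fn observed_line() -> u32 {
    Location::caller().line()
}

/// Direct call: the caller location is passed implicitly, so the reported
/// line is the line of the call expression. Returns (reported, call-site line).
fn direct() -> (u32, u32) {
    (observed_line(), line!())
}

/// Call through a function pointer: the cast produces a shim, and the
/// reported location is no longer the call site. Returns (reported, call-site line).
fn via_pointer() -> (u32, u32) {
    let f: fn() -> u32 = observed_line;
    (f(), line!())
}

fn main() {
    let (reported, call_site) = direct();
    println!("direct: reported line {reported}, call site line {call_site}");

    let (reported, call_site) = via_pointer();
    println!("via fn pointer: reported line {reported}, call site line {call_site}");
}
```

Because the shim has the ordinary ABI of the function pointer type, adding or removing `#[track_caller]` on the callee does not change what such a caller has to pass.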

Rollup merge of #144865 - WaffleLapkin:track-tail, r=lqd: Fix tail calls to `#[track_caller]` functions

bors · 2025-08-15T12:50:18Z

☔ The latest upstream changes (presumably #145423) made this pull request unmergeable. Please resolve the merge conflicts.

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 13, 2024

rustbot added the PG-exploit-mitigations Project group: Exploit mitigations label Oct 13, 2024

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 13, 2024

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 13, 2024

Auto merge of rust-lang#131650 - saethlin:post-mono-mir-opts, r=<try>

b141564

Add post-mono MIR passes to make mono-reachable analysis more accurate r? ghost

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 13, 2024

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 14, 2024

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 14, 2024

Auto merge of rust-lang#131650 - saethlin:post-mono-mir-opts, r=<try>

9233d9f

Add post-mono MIR passes to make mono-reachable analysis more accurate r? ghost

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 14, 2024

saethlin force-pushed the post-mono-mir-opts branch from 6f6737a to 4ae3542 Compare October 24, 2024 02:45

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 24, 2024

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 3, 2025

zachs18 reviewed Jul 9, 2025

WaffleLapkin mentioned this pull request Aug 3, 2025

Fix tail calls to #[track_caller] functions #144865

Merged

saethlin force-pushed the post-mono-mir-opts branch from 1bbdbec to 21ef8e6 Compare August 3, 2025 20:19

rustbot added the T-clippy Relevant to the Clippy team. label Aug 3, 2025

saethlin mentioned this pull request Aug 5, 2025

"use of local which has no storage here" at -Zmir-opt-level=0 #144932

Open

cjgillot mentioned this pull request Aug 7, 2025

GVN: Do not flatten derefs with ProjectionElem::Index. #145030

Merged

rust-timer added a commit that referenced this pull request Aug 8, 2025

Unrolled build for #145030

ae538ba

Rollup merge of #145030 - cjgillot:gvn-no-flatten-index, r=saethlin

GVN: Do not flatten derefs with ProjectionElem::Index.

r? @saethlin

This should fix the bug you found with #131650

saethlin force-pushed the post-mono-mir-opts branch from 21ef8e6 to d56fc51 Compare August 9, 2025 03:20

saethlin added 2 commits August 11, 2025 19:20

Add post-mono passes to make mono-reachable analysis more accurate

7964969

Deduce param attrs from codegen MIR

adeb3f4

saethlin force-pushed the post-mono-mir-opts branch from d56fc51 to adeb3f4 Compare August 11, 2025 23:37

tmiasko reviewed Aug 12, 2025

cjgillot mentioned this pull request Sep 7, 2025

Remove deduce_param_attrs. #146309

Closed

rust-timer commented Oct 13, 2024

Overall result: ❌✅ regressions and improvements - please read the text below

rust-timer commented Oct 14, 2024

Overall result: ❌✅ regressions and improvements - please read the text below

rust-timer commented Jun 3, 2025

Overall result: ❌ regressions - please read the text below
