Deduce captures(none) for a return place and parameters #147890

tmiasko · 2025-10-19T19:21:00Z

Extend attribute deduction to determine whether parameters using
indirect pass mode might have their address captured. Similarly to
the deduction of the `readonly` attribute, this information facilitates
memcpy optimizations.
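
To make the distinction concrete, here is a minimal sketch of two functions that take a large by-value argument. The names `Big`, `N`, `sum`, and `leak_addr` are hypothetical and do not come from this PR or its tests. It assumes a target ABI where a struct of this size is passed indirectly; whether rustc actually emits `captures(none)` for `sum` depends on the optimization level, the LLVM version, and the details of the analysis.

```rust
// Illustrative only: `Big`, `N`, `sum`, and `leak_addr` are hypothetical names.

pub const N: usize = 16;

/// Large enough that common ABIs pass it indirectly (as a pointer to a copy).
#[derive(Clone, Copy)]
pub struct Big {
    pub data: [u64; N],
}

/// Only reads the parameter's fields; its address never leaves the function.
/// This is the kind of parameter the deduction is meant to recognize.
#[inline(never)]
pub fn sum(b: Big) -> u64 {
    let mut total = 0u64;
    let mut i = 0;
    while i < N {
        total += b.data[i];
        i += 1;
    }
    total
}

/// Returns the parameter's address as an integer, so the address escapes
/// the call. A parameter used like this must not be marked `captures(none)`.
#[inline(never)]
pub fn leak_addr(b: Big) -> usize {
    &b as *const Big as usize
}
```

When the callee is known not to capture the address of its indirect argument, the optimizer has more freedom to forward the caller's original storage instead of copying it into a fresh temporary, which is the memcpy optimization the description refers to.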

tmiasko · 2025-10-19T19:25:00Z

@bors try @rust-timer queue

deduced_param_attrs: deduce captures(none)

rust-bors · 2025-10-19T19:33:55Z

💔 Test for abbaf96 failed: CI

tmiasko · 2025-10-19T19:43:09Z

@bors try @rust-timer queue

deduced_param_attrs: deduce captures(none)

bors · 2025-10-19T21:57:27Z

☔ The latest upstream changes (presumably #147884) made this pull request unmergeable. Please resolve the merge conflicts.

rust-bors · 2025-10-19T22:04:36Z

☀️ Try build successful (CI)
Build commit: 470704e (470704e31911d010ae90e062e13a88db578f8f08, parent: c6efb9019b3169fc672248339dbbf13e6a134de3)

rust-timer · 2025-10-19T23:14:38Z

Finished benchmarking commit (470704e): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.3%	[0.1%, 1.1%]	5
Improvements ✅ (primary)	-0.3%	[-0.4%, -0.2%]	8
Improvements ✅ (secondary)	-0.4%	[-1.2%, -0.1%]	27
All ❌✅ (primary)	-0.3%	[-0.4%, -0.2%]	8

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (primary 2.8%, secondary 3.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.8%	[2.8%, 2.8%]	1
Regressions ❌ (secondary)	3.4%	[2.6%, 4.2%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.8%	[2.8%, 2.8%]	1

Binary size

Results (primary 0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.1%]	10
Regressions ❌ (secondary)	0.1%	[0.0%, 0.2%]	10
Improvements ✅ (primary)	-0.0%	[-0.0%, -0.0%]	4
Improvements ✅ (secondary)	-0.0%	[-0.0%, -0.0%]	35
All ❌✅ (primary)	0.0%	[-0.0%, 0.1%]	14

Bootstrap: 472.659s -> 472.043s (-0.13%)
Artifact size: 390.57 MiB -> 390.61 MiB (0.01%)

rustbot · 2025-10-22T17:14:55Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

tmiasko · 2025-10-22T17:17:11Z

r? compiler

compiler/rustc_middle/src/middle/deduced_param_attrs.rs

compiler/rustc_ty_utils/src/abi.rs

cjgillot · 2025-10-23T02:26:11Z

compiler/rustc_mir_transform/src/deduce_param_attrs.rs

-                    // We're only interested in arguments.
-                    && let Some(param_index) = self.as_param(place.local)
                     && !place.is_indirect_first_projection()
+                    && let Some(i) = self.as_param(place.local)


Should we also check whether that argument's type may be passed indirectly? In case of projections in particular.

I would leave further improvements for a separate pull request (although it looks like we have an extra temporary in such a case regardless).
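
The condition in the snippet above can be modeled with a small standalone sketch. This is a toy, not rustc's real types: `Elem`, `Place`, `as_param`, and `tracked_param` are simplified stand-ins, and the parameter numbering (local 0 is the return place, locals `1..=arg_count` are the arguments) is the usual MIR convention assumed here. The real `as_param` in this PR may treat the return place differently.

```rust
// Toy model of the check; names only loosely mirror the reviewed snippet.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Elem {
    Deref,
    Field(usize),
}

pub struct Place {
    pub local: usize,
    pub projection: Vec<Elem>,
}

impl Place {
    /// True when the place starts by dereferencing its base local,
    /// i.e. it refers to memory behind a pointer, not to the local itself.
    pub fn is_indirect_first_projection(&self) -> bool {
        self.projection.first() == Some(&Elem::Deref)
    }
}

/// Assumed numbering: local 0 is the return place, locals 1..=arg_count
/// are the parameters. Returns the zero-based parameter index.
pub fn as_param(local: usize, arg_count: usize) -> Option<usize> {
    if (1..=arg_count).contains(&local) {
        Some(local - 1)
    } else {
        None
    }
}

/// A place is attributed to a parameter only if it is a direct use of that
/// parameter's own storage (no leading dereference).
pub fn tracked_param(place: &Place, arg_count: usize) -> Option<usize> {
    if place.is_indirect_first_projection() {
        return None;
    }
    as_param(place.local, arg_count)
}
```

Note that this model, like the snippet, looks only at the place and not at the parameter's type, which is the gap the review question points at: a field projection such as `_1.0` still maps to parameter 0 whether or not that parameter is passed indirectly.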

madsmtm

I looked through it; the code changes look correct, but I don't feel like I have the required background to confidently review this from a higher level.

r? nikic (since you seem to have touched similar stuff in #145877)

compiler/rustc_codegen_llvm/src/abi.rs

compiler/rustc_middle/src/middle/deduced_param_attrs.rs

compiler/rustc_mir_transform/src/deduce_param_attrs.rs

rustbot · 2025-10-25T20:54:55Z

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

tmiasko · 2025-10-25T21:09:52Z

r? cjgillot

cjgillot · 2025-10-26T19:08:40Z

@bors r+

bors · 2025-10-26T19:08:43Z

📌 Commit 2a03a94 has been approved by cjgillot

It is now in the queue for this repository.

bors · 2025-10-26T20:37:06Z

⌛ Testing commit 2a03a94 with merge f37aa99...

bors · 2025-10-26T23:43:44Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing f37aa99 to master...

github-actions · 2025-10-26T23:47:24Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing f977dfc (parent) -> f37aa99 (this PR)

Test differences

Show 13 test diffs

Stage 1

[codegen] tests/codegen-llvm/deduced-param-attrs.rs: pass -> [missing] (J0)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM20: [missing] -> pass (J0)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM21: [missing] -> ignore (ignored when the LLVM version 20.1.2 is older than 21.0.0) (J0)

Stage 2

[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM20: [missing] -> ignore (ignored when the LLVM version (21.1.3) is newer than majorversion 20) (J1)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM21: [missing] -> pass (J1)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM21: [missing] -> ignore (ignored when the LLVM version 20.1.2 is older than 21.0.0) (J2)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM21: [missing] -> ignore (ignored when the LLVM version 20.1.8 is older than 21.0.0) (J3)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs: pass -> [missing] (J4)
[codegen] tests/codegen-llvm/deduced-param-attrs.rs#LLVM20: [missing] -> pass (J5)

Additionally, 4 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard f37aa9955f03bb1bc6fe08670cb1ecae534b5815 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

x86_64-gnu-gcc: 2776.9s -> 3566.8s (28.4%)
pr-check-1: 1463.9s -> 1736.9s (18.6%)
aarch64-apple: 6895.9s -> 8122.0s (17.8%)
dist-x86_64-apple: 7093.9s -> 5875.0s (-17.2%)
x86_64-gnu-llvm-20-1: 3067.9s -> 3546.5s (15.6%)
i686-gnu-1: 6886.3s -> 7839.3s (13.8%)
armhf-gnu: 4718.5s -> 5340.5s (13.2%)
x86_64-gnu-stable: 6521.7s -> 7372.5s (13.0%)
aarch64-msvc-2: 4593.5s -> 5174.7s (12.7%)
i686-gnu-nopt-1: 7283.8s -> 8196.2s (12.5%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-10-27T02:24:33Z

Finished benchmarking commit (f37aa99): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

- If the regression was expected or you think it can be justified, please write a comment with sufficient written justification, and add @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
- If you think that you know of a way to resolve the regression, try to create a new PR with a fix for the regression.
- If you do not understand the regression or you think that it is just noise, you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.3%]	1
Regressions ❌ (secondary)	0.5%	[0.1%, 1.1%]	3
Improvements ✅ (primary)	-0.2%	[-0.5%, -0.1%]	31
Improvements ✅ (secondary)	-0.3%	[-0.6%, -0.0%]	34
All ❌✅ (primary)	-0.2%	[-0.5%, 0.3%]	32

Max RSS (memory usage)

Results (primary -0.9%, secondary 0.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.3%	[1.3%, 1.3%]	1
Regressions ❌ (secondary)	3.4%	[3.4%, 3.4%]	1
Improvements ✅ (primary)	-3.1%	[-3.1%, -3.1%]	1
Improvements ✅ (secondary)	-3.0%	[-3.0%, -3.0%]	1
All ❌✅ (primary)	-0.9%	[-3.1%, 1.3%]	2

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary 0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.1%]	10
Regressions ❌ (secondary)	0.1%	[0.0%, 0.2%]	10
Improvements ✅ (primary)	-0.0%	[-0.0%, -0.0%]	4
Improvements ✅ (secondary)	-0.0%	[-0.0%, -0.0%]	38
All ❌✅ (primary)	0.0%	[-0.0%, 0.1%]	14

Bootstrap: 473.751s -> 473.876s (0.03%)
Artifact size: 390.43 MiB -> 390.50 MiB (0.02%)

panstromek · 2025-10-27T17:02:07Z

perf triage:

Improvements outweigh regressions.

@rustbot label: +perf-regression-triaged