Labels
A-trait-system, C-bug, E-needs-mcve, I-compiletime, P-high, T-compiler, T-types, regression-from-stable-to-beta
Description
Summary
Compilation time regression from 1.95 nightly onward.
1.94 nightly (nightly-2026-01-15) is not affected.
EDIT: this issue affects both the LLVM and Cranelift backends.
Command used
`cargo build` in a fresh Bevy project (`cargo init`, then `cargo add bevy@0.18.1`). Nothing fancy.
Expected vs Actual behaviour
- 1.94 stable cold compilation time: 48.5 seconds.
- 1.94 nightly cold compilation time: 46.9 seconds.
- 1.95 nightly cold compilation time: 85.8 seconds.
- 1.96 nightly cold compilation time: 86 seconds.
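To put a number on "roughly doubles", the slowdown factor can be computed from the cold build times listed above. This is a small sketch; `slowdown` is a hypothetical helper name, and only `awk` is assumed to be available.

```shell
# slowdown BASE NEW: print NEW / BASE rounded to two decimals.
slowdown() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / base }'
}

slowdown 46.9 85.8   # 1.94 nightly -> 1.95 nightly, prints 1.83
slowdown 48.5 85.8   # 1.94 stable  -> 1.95 nightly, prints 1.77
```
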
Configuration
`[unstable]`, `[build]`, and `[target]` are empty. No environment variables or local crate overrides. Built with `time cargo build --profile x`.
My .cargo/config.toml:
[profile.x]
inherits = "dev"
debug-assertions = true
incremental = false
# default, will change
opt-level = 0
debug = true
split-debuginfo = 'off'
strip = "none"
lto = false
panic = 'unwind'
codegen-units = 256

Afterwards, I toggled these options one by one (+ cranelift + Ctarget-cpu + Zthreads + Zshare-generics + linker + mold + wild) and the results are consistent: compilation time roughly doubles in all scenarios, so it does not seem to be related to any single option.
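For anyone who wants to repeat the measurement, the cold build could be scripted roughly as below. This is a sketch under assumptions, not the exact procedure behind the tables: `DRY_RUN`, `run`, and `cold_build` are hypothetical names, the toolchain dates are the two named in this report, and `cargo fetch` is added so that downloads stay out of the timing.

```shell
#!/usr/bin/env bash
# Sketch only: time a cold build of the current project under each nightly.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
set -eu

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # show what would run
    else
        "$@"
    fi
}

cold_build() {
    toolchain="$1"
    run rustup toolchain install "$toolchain"
    run cargo "+$toolchain" fetch                # download crates before timing
    run cargo "+$toolchain" clean                # drop previous build artifacts
    start=$(date +%s)
    run cargo "+$toolchain" build --profile x    # profile x from the config above
    end=$(date +%s)
    echo "$toolchain: $((end - start)) s"
}

# Toolchain dates are the ones named in this report.
for t in nightly-2026-01-15 nightly-2026-02-25; do
    cold_build "$t"
done
```

Run it with `DRY_RUN=0` inside the test project to actually install the toolchains and build. Timing has one-second resolution here, which is coarse but enough to show a gap of tens of seconds.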
Operating system
EndeavourOS (up to date)
HEAD
(version 1.94 nightly)
rustup toolchain install nightly-2026-01-15
rustup override set nightly-2026-01-15

(version 1.95 nightly)
rustup toolchain install nightly-2026-02-25
rustup override set nightly-2026-02-25

Additional context
Build comparison table
Stable 1.94
| stable | time (s) | bin size (MB) | disk deps (GB) | disk deps (GB) w/ loop | bin size (MB) w/ loop | exec speed (ms) w/ loop | Notes |
|---|---|---|---|---|---|---|---|
| REFERENCE | 48.5 | 3.8 | 5.6 | 6.9 | 1100 | 158 | The reference for all tests |
| codegen-units=1 | 53.3 | 3.8 | 4 | 4.7 | 784 | 158 | |
| codegen-units=16 | 46.7 | 3.8 | 4.4 | 5.5 | 887 | 158 | Sweet spot. |
| codegen-units=32 | 50 | 3.8 | 4 | 6.7 | 939 | 158 | |
| Ctarget-cpu=native | 48.7 | 3.8 | 6.9 | 6.9 | 1100 | 160 | |
| debug-assertions=false | 47.7 | - | - | 6.8 | 1100 | 141.6 | |
| debug-assertions=false + debug=false | 42.5 | - | - | 2.3 | 87 | 130 | |
| debug=false | 42.5 | 0.43 | 1.8 | 2.3 | 91 | 152 | |
| debug="line-tables-only" | 43.6 | 0.43 | 1.9 | 3 | 271 | 152.4 | exec speed anomaly |
| split-debuginfo="unpacked" | 48.25 | - | - | 5.8 | 221 | 158 | |
| opt-level=1 | 83.7 | 3.8 | 5.1 | 7.7 | 1400 | 11.9 | |
| opt-level=2 | 95.8 | 3.8 | 5.7 | 8.7 | 1500 | 11.2 | |
| opt-level=3 | 96.4 | 3.8 | 5.8 | 8.9 | 1600 | 10.8 | |
| opt-level=3 + codegen-units=1 | 148 | - | - | 5.1 | 945 | 10.8 | Affects compilation time |
| opt-level=3 + codegen-units=16 | 92 | - | - | 7.3 | 1300 | 10.8 | |
| opt-level="s" | 78.9 | 3.8 | 4.4 | 6.5 | 1100 | 14.2 | |
| lto="fat" | (46 / ) 897 | 4.4 | 4.4 | 9.9 | 791 | 158 | Takes forever to build local crate |
| lto="thin" | (50 / ) 62.4 | 2.3 | 5.3 | 5 | 312 | 158 | Big local crates make compilation very long. |
| lto="thin" + codegen-units=16 | 60 | - | - | 4.5 | 255 | 158 | Big local crates make compilation very long. |
| panic="abort" | 37 / 51.95 | 3.8 | 5.5 | 7.6 | 1100 | 153 | Surprisingly, improves performance |
| strip="debuginfo" | 47.7 | 0.43 | 5.5 | 5 | 90 | 158 | |
| strip="symbols" | 47.8 | 0.345 | 4.8 | 4.9 | 55 | 158.4 | |
| REF1 = opt-level = 3 + lto = "thin" + debug = false + strip = "symbols" + panic = "abort" + codegen-units = 1 + split-debuginfo = "off" + Ctarget-cpu=native | 103 | 1.5 | 23 | 11.2 | BiS for release builds | ||
| REF1 but codegen-units = 16 | 72.8 | 1.7 | 37 | 11.1 | |||
| REF1 + built from RAM (build.target-dir = "/tmp/cargo-target") | 103 | 1.5 | 23 | 11.2 | df /tmp outputs tmpfs, which indicates /tmp is in my RAM. No difference here means my SSD is not a bottleneck. | ||
| REF1 + mold linker | 103 | 1.5 | 23 | 11.2 | unaffected | ||
| REF1 + wild linker | 103 | 1.5 | 23 | 11.2 | unaffected |
Nightly 1.94
| nightly | time (s) | bin size (MB) | disk deps (GB) | disk deps (GB) main | main bin size (MB) | exec speed (ms) | Notes |
|---|---|---|---|---|---|---|---|
| REFERENCE | 46.9 | 6.8 | 1.1 | 158 | |||
| codegen-units=1 | 52.9 | 5.2 | 912 | 157 | |||
| Zthreads=1 | 46.9 | 4.2 | 5.2 | 7.6 | 1200 | 158 | Seems to be the default value |
| Zthreads=8 | 33.8 | 4.2 | 5.2 | 7.6 | 1200 | 158 | Unlike 1.95, the sweet spot here is 16 |
| Zthreads=8 + codegen-units=16 | 31.4 | 4.2 | 5.2 | 7.6 | 1200 | 158 | |
| Zthreads=16 | 33.4 | 4.2 | 5.2 | 7.6 | 1200 | 158 | 9950X has 16 cores |
| Zthreads=32 | 40 | 4.2 | 5.2 | 7.6 | 1200 | 158 | 9950X has 32 threads |
| Zshare-generics=y | 48.6 | 4.2 | 5.2 | 6.1 | 1000 | - | No effect on REFERENCE configuration. |
| Ctarget-cpu=native | 49.2 | 4.2 | 7.6 | 7.6 | 1.2 | 166 | |
| debug=false | 43.15 | 0.44 | 2.1 | 2.3 | 107 | 155 | |
| opt-level=1 | 84 | 4.2 | 5.3 | 8.1 | 1500 | 12 | |
| opt-level=2 | 97.5 | 4.2 | 6 | 9.2 | 1600 | 10.8 | |
| opt-level=3 | 97.7 | 4.2 | 6 | 9.3 | 1700 | 11.1 | |
| REF1 = opt-level = 3 + lto = "thin" + debug = false + debug-assertions = false + strip = "symbols" + panic = "abort" + codegen-units = 1 + split-debuginfo = "off" + Ctarget-cpu=native + Zthreads=8 | 1.6 | 23 | 10.9 | (release profile) | |||
| REF1 + Zshare-generics=y | 1.6 | 21 | 11.1 | ||||
| opt-level = 0 + lto = "off" + debug = line-tables-only + strip = false + panic = "unwind" + codegen-units = 16 + split-debuginfo = "unpacked" + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 3.2 | 186 | 163 | (dev profile) | |||
| REF2 = opt-level = 0 + debug = false + debug-assertions = false + split-debuginfo = 'off' + strip = false + lto = "off" + panic = 'abort' + codegen-units = 16 + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 29.34 | 2 | 84 | 117.6 | |||
| REF2 + debug-assertions=false | 29.7 | 2.1 | 88 | 145.2 | |||
| REF2 + mold | 25.74 | 2.1 | 116 | 118 | |||
| REF2 + wild | 24.93 | 2 | 85 | 116.8 | |||
| REF2 + cranelift | 29.5 | 2.7 | 376 | 195 | |||
| REF2 + cranelift + wild | 24.7 | 2.7 | 366 | 196 |
Nightly 1.95 (and 1.96)
| nightly | time (s) | bin size (MB) | disk deps (GB) | disk deps (GB) main | main bin size (MB) | exec speed (ms) | Notes |
|---|---|---|---|---|---|---|---|
| REFERENCE | 86 | 4.2 | 5.2 | - | - | - | A lot longer than stable! |
| codegen-units=1 | |||||||
| Zthreads=1 | 86 | 4.2 | 5.2 | - | - | - | Seems to be the default value |
| Zthreads=8 | 66 | 4.2 | 5.2 | - | - | - | Sweet spot. From there, the higher the value, the longer it took. |
| Zthreads=16 | 86 | 4.2 | 5.2 | - | - | - | 9950X has 16 cores |
| Zthreads=32 | 95 | 4.2 | 5.2 | - | - | - | 9950X has 32 threads |
| Zshare-generics=y | 86 | 4.2 | 5.2 | - | - | - | No effect on REFERENCE configuration. |
| Ctarget-cpu=native | 86 | 4.2 | 7.6 | 7.6 | 1.2 | 166 | |
| debug=true | 94 | 7.5 | 1200 | 157.3 | |||
| debug=false | 79.8 | 0.44 | 2.1 | 2.3 | 103 | 155 | |
| opt-level=1 | 116.7 | 4.2 | 5.3 | 8.1 | 1500 | 12 | |
| opt-level=2 | 128 | 4.2 | 6 | 9.2 | 1600 | 10.8 | |
| opt-level=3 | 129.7 | 4.2 | 6 | 9.3 | 1700 | 11.1 | |
| REF1 = opt-level = 3 + lto = "thin" + debug = false + strip = "symbols" + panic = "abort" + codegen-units = 1 + split-debuginfo = "off" + Ctarget-cpu=native + Zthreads=8 | 123 | 1.6 | 23 | 10.9 | (release profile) | ||
| REF1 + Zshare-generics=y | 106.5 | 1.6 | 21 | 11.1 | |||
| opt-level = 0 + lto = "off" + debug = line-tables-only + strip = false + panic = "unwind" + codegen-units = 16 + split-debuginfo = "unpacked" + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 65 | 3.2 | 186 | 163 | (dev profile) | ||
| REF2 = opt-level = 0 + debug = false + split-debuginfo = 'off' + strip = false + lto = "off" + panic = 'abort' + codegen-units = 16 + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 60.1 | 2.1 | 87 | 145.4 | (dev profile best of both worlds) | ||
| REF2 + mold | 56.3 | 2.2 | 121 | 146.8 | |||
| REF2 + wild | 56.2 | 2.1 | 88 | 145.8 | |||
| REF2 + cranelift | 61.7 | 2.8 | 402 | 195 | |||
| REF2 + cranelift + wild | 55.18 | 2.8 | 387 | 196 |
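To compare the two nightly tables row by row, the absolute and relative difference in build time can be computed as sketched below. The times are copied from the 1.94 and 1.95 nightly tables; `deltas` is a hypothetical helper name and the choice of rows is only illustrative.

```shell
# Input: configuration|1.94 nightly time (s)|1.95 nightly time (s)
# Output: configuration, extra seconds on 1.95, and slowdown factor.
deltas() {
    awk -F'|' '{ printf "%-14s +%.1fs (x%.2f)\n", $1, $3 - $2, $3 / $2 }'
}

deltas <<'EOF'
REFERENCE|46.9|86
Zthreads=8|33.8|66
opt-level=1|84|116.7
opt-level=2|97.5|128
opt-level=3|97.7|129.7
EOF
```

If the extra time turns out to be similar in seconds across opt-levels rather than proportional to the total, that would be consistent with the summary's observation that no single option explains the regression. This is only a reading of the numbers above, not a diagnosis.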