Significant compilation time regression starting in v1.95.0-nightly (bevy) #153910

@gpoblon

Summary

Compilation time regression from 1.95 nightly onward.
1.94 nightly (nightly-2026-01-15) is not affected.

EDIT: this issue affects both the LLVM and Cranelift backends.

Command used

cargo build in a fresh (cargo init + cargo add bevy@0.18.1) bevy repository. Nothing fancy.

Expected vs Actual behaviour

  • 1.94 stable cold compilation time: 48.5 seconds.
  • 1.94 nightly cold compilation time: 46.9 seconds.
  • 1.95 nightly cold compilation time: 85.8 seconds.
  • 1.96 nightly cold compilation time: 86 seconds.
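To put a number on the regression, the timings above can be turned into slowdown factors. This is only a small arithmetic sketch: the helper name `slowdown` is made up for illustration, and the inputs are the cold build times reported in the list.

```shell
#!/bin/sh
# Slowdown factor = regressed time / baseline time, rounded to two decimals.
# `slowdown` is a hypothetical helper name, not part of any tool.
slowdown() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / base }'
}

# Cold build times in seconds, copied from the list above.
slowdown 46.9 85.8   # 1.94 nightly -> 1.95 nightly, prints 1.83
slowdown 48.5 85.8   # 1.94 stable  -> 1.95 nightly, prints 1.77
slowdown 46.9 86     # 1.94 nightly -> 1.96 nightly, prints 1.83
```

So the reference configuration takes roughly 1.8x as long on the affected nightlies.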

Configuration

[unstable], [build], and [target] are empty. No env / local crate override. Built with time cargo build --profile x.

My .cargo/config.toml:

[profile.x]
inherits = "dev"
debug-assertions = true
incremental = false
# default, will change
opt-level = 0
debug = true
split-debuginfo = 'off'
strip = "none"
lto = false
panic = 'unwind'
codegen-units = 256

Afterwards, I toggled these options one by one (+ cranelift + Ctarget-cpu + Zthreads + Zshare-generics + linker + mold + wild) and the results are consistent: compilation time roughly doubles in every scenario, so the regression does not seem to be tied to any particular option.
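One way to toggle a single profile option per build, without editing .cargo/config.toml each time, is cargo's --config flag. The sketch below is an assumption about how such a sweep could be scripted, not the exact procedure used for the tables; `build_with` and the `RUN` variable are hypothetical names. It only prints the commands unless RUN=1 is set, since every invocation is a full cold build.

```shell
#!/bin/sh
# Dry run by default: prints one cargo invocation per profile override.
# Set RUN=1 to actually execute them (each one is a full cold build).
set -eu

# build_with is a hypothetical helper: it overrides one key of [profile.x].
build_with() {
    override=$1
    set -- cargo build --profile x --config "profile.x.$override"
    if [ "${RUN:-0}" = 1 ]; then
        cargo clean      # discard previous artifacts so the build is cold
        "$@"
    else
        echo "$*"
    fi
}

# A few of the options from the profile above, one at a time.
for override in 'codegen-units=16' 'debug=false' 'opt-level=3' 'lto="thin"'; do
    build_with "$override"
done
```

Compiler flags such as -Zthreads or -Ctarget-cpu are not profile keys, so they would have to be passed separately, for example through RUSTFLAGS.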

Operating system

EndeavourOS (up to date)

HEAD

(version 1.94 nightly)

rustup toolchain install nightly-2026-01-15
rustup override set nightly-2026-01-15

(version 1.95 nightly)

rustup toolchain install nightly-2026-02-25
rustup override set nightly-2026-02-25
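The comparison between the two toolchains could be scripted roughly as follows. This is a sketch under assumptions: it is meant to be run inside the fresh project described under "Command used", it uses the `cargo +toolchain` form instead of `rustup override set`, and the `run` and `time_cold_build` helpers and the `RUN` variable are hypothetical names. By default it only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch: cold-build timing for the two nightlies named above.
# Dry run by default (prints the commands); set RUN=1 to execute them.
set -eu

# Execute the command when RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

time_cold_build() {
    toolchain=$1
    run rustup toolchain install "$toolchain"
    run cargo "+$toolchain" clean        # make the following build a cold one
    start=$(date +%s)
    run cargo "+$toolchain" build
    end=$(date +%s)
    # In a dry run this is near zero and carries no meaning.
    echo "$toolchain: $((end - start)) s"
}

for toolchain in nightly-2026-01-15 nightly-2026-02-25; do
    time_cold_build "$toolchain"
done
```

Running `cargo fetch` once beforehand keeps dependency downloads out of the first measurement.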

Additional context

Build comparison table

Stable 1.94

| Configuration (stable) | Time (s) | Bin size (MB) | Disk deps (GB) | Disk deps w/ loop (GB) | Bin size w/ loop (MB) | Exec speed w/ loop (ms) | Notes |
|---|---|---|---|---|---|---|---|
| REFERENCE | 48.5 | 3.8 | 5.6 | 6.9 | 1100 | 158 | The reference for all tests |
| codegen-units=1 | 53.3 | 3.8 | 4 | 4.7 | 784 | 158 | |
| codegen-units=16 | 46.7 | 3.8 | 4.4 | 5.5 | 887 | 158 | Sweet spot. |
| codegen-units=32 | 50 | 3.8 | 4 | 6.7 | 939 | 158 | |
| Ctarget-cpu=native | 48.7 | 3.8 | 6.9 | 6.9 | 1100 | 160 | |
| debug-assertions=false | 47.7 | - | - | 6.8 | 1100 | 141.6 | |
| debug-assertions=false + debug=false | 42.5 | - | - | 2.3 | 87 | 130 | |
| debug=false | 42.5 | 0.43 | 1.8 | 2.3 | 91 | 152 | |
| debug="line-tables-only" | 43.6 | 0.43 | 1.9 | 3 | 271 | 152.4 | exec speed anomaly |
| split-debuginfo="unpacked" | 48.25 | - | - | 5.8 | 221 | 158 | |
| opt-level=1 | 83.7 | 3.8 | 5.1 | 7.7 | 1400 | 11.9 | |
| opt-level=2 | 95.8 | 3.8 | 5.7 | 8.7 | 1500 | 11.2 | |
| opt-level=3 | 96.4 | 3.8 | 5.8 | 8.9 | 1600 | 10.8 | |
| opt-level=3 + codegen-units=1 | 148 | - | - | 5.1 | 945 | 10.8 | Affects compilation time |
| opt-level=3 + codegen-units=16 | 92 | - | - | 7.3 | 1300 | 10.8 | |
| opt-level="s" | 78.9 | 3.8 | 4.4 | 6.5 | 1100 | 14.2 | |
| lto="fat" | (46 / ) 897 | 4.4 | 4.4 | 9.9 | 791 | 158 | Takes forever to build local crate |
| lto="thin" | (50 / ) 62.4 | 2.3 | 5.3 | 5 | 312 | 158 | Big local crates make compilation very long. |
| lto="thin" + codegen-units=16 | 60 | - | - | 4.5 | 255 | 158 | Big local crates make compilation very long. |
| panic="abort" | 37 / 51.95 | 3.8 | 5.5 | 7.6 | 1100 | 153 | Surprisingly, improves performance |
| strip="debuginfo" | 47.7 | 0.43 | 5.5 | 5 | 90 | 158 | |
| strip="symbols" | 47.8 | 0.345 | 4.8 | 4.9 | 55 | 158.4 | |
| REF1 = opt-level = 3 + lto = "thin" + debug = false + strip = "symbols" + panic = "abort" + codegen-units = 1 + split-debuginfo = "off" + Ctarget-cpu=native | 103 | | | 1.5 | 23 | 11.2 | BiS for release builds |
| REF1 but codegen-units = 16 | 72.8 | | | 1.7 | 37 | 11.1 | |
| REF1 + built from RAM (build.target-dir = "/tmp/cargo-target") | 103 | | | 1.5 | 23 | 11.2 | df /tmp reports tmpfs, which indicates /tmp is in RAM. No difference here means my SSD is not a bottleneck. |
| REF1 + mold linker | 103 | | | 1.5 | 23 | 11.2 | unaffected |
| REF1 + wild linker | 103 | | | 1.5 | 23 | 11.2 | unaffected |

Nightly 1.94

| Configuration (nightly) | Time (s) | Bin size (MB) | Disk deps (GB) | Disk deps, main (GB) | Bin size, main (MB) | Exec speed (ms) | Notes |
|---|---|---|---|---|---|---|---|
| REFERENCE | 46.9 | | | 6.8 | 1.1 | 158 | |
| codegen-units=1 | 52.9 | | | 5.2 | 912 | 157 | |
| Zthreads=1 | 46.9 | 4.2 | 5.2 | 7.6 | 1200 | 158 | Seems to be the default value |
| Zthreads=8 | 33.8 | 4.2 | 5.2 | 7.6 | 1200 | 158 | Differs from 1.95: here the sweet spot is 16 |
| Zthreads=8 + codegen-units=16 | 31.4 | 4.2 | 5.2 | 7.6 | 1200 | 158 | |
| Zthreads=16 | 33.4 | 4.2 | 5.2 | 7.6 | 1200 | 158 | 9950X has 16 cores |
| Zthreads=32 | 40 | 4.2 | 5.2 | 7.6 | 1200 | 158 | 9950X has 32 threads |
| Zshare-generics=y | 48.6 | 4.2 | 5.2 | 6.1 | 1000 | - | No effect on REFERENCE configuration. |
| Ctarget-cpu=native | 49.2 | 4.2 | 7.6 | 7.6 | 1.2 | 166 | |
| debug=false | 43.15 | 0.44 | 2.1 | 2.3 | 107 | 155 | |
| opt-level=1 | 84 | 4.2 | 5.3 | 8.1 | 1500 | 12 | |
| opt-level=2 | 97.5 | 4.2 | 6 | 9.2 | 1600 | 10.8 | |
| opt-level=3 | 97.7 | 4.2 | 6 | 9.3 | 1700 | 11.1 | |
| REF1 = opt-level = 3 + lto = "thin" + debug = false + debug-assertions=false + strip = "symbols" + panic = "abort" + codegen-units = 1 + split-debuginfo = "off" + Ctarget-cpu=native + Zthreads=8 | | | | 1.6 | 23 | 10.9 | (release profile) |
| REF1 + Zshare-generics=y | | | | 1.6 | 21 | 11.1 | |
| opt-level = 0 + lto = "off" + debug = line-tables-only + strip = false + panic = "unwind" + codegen-units = 16 + split-debuginfo = "unpacked" + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | | | | 3.2 | 186 | 163 | (dev profile) |
| REF2 = opt-level = 0 + debug = false + debug-assertions=false + split-debuginfo = 'off' + strip = false + lto = "off" + panic = 'abort' + codegen-units = 16 + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 29.34 | | | 2 | 84 | 117.6 | |
| REF2 + debug-assertions=false | 29.7 | | | 2.1 | 88 | 145.2 | |
| REF2 + mold | 25.74 | | | 2.1 | 116 | 118 | |
| REF2 + wild | 24.93 | | | 2 | 85 | 116.8 | |
| REF2 + cranelift | 29.5 | | | 2.7 | 376 | 195 | |
| REF2 + cranelift + wild | 24.7 | | | 2.7 | 366 | 196 | |

Nightly 1.95 (and 1.96)

| Configuration (nightly) | Time (s) | Bin size (MB) | Disk deps (GB) | Disk deps, main (GB) | Bin size, main (MB) | Exec speed (ms) | Notes |
|---|---|---|---|---|---|---|---|
| REFERENCE | 86 | 4.2 | 5.2 | - | - | - | A lot longer than stable! |
| codegen-units=1 | | | | | | | |
| Zthreads=1 | 86 | 4.2 | 5.2 | - | - | - | Seems to be the default value |
| Zthreads=8 | 66 | 4.2 | 5.2 | - | - | - | Sweet spot. From there, the higher the value, the longer it took. |
| Zthreads=16 | 86 | 4.2 | 5.2 | - | - | - | 9950X has 16 cores |
| Zthreads=32 | 95 | 4.2 | 5.2 | - | - | - | 9950X has 32 threads |
| Zshare-generics=y | 86 | 4.2 | 5.2 | - | - | - | No effect on REFERENCE configuration. |
| Ctarget-cpu=native | 86 | 4.2 | 7.6 | 7.6 | 1.2 | 166 | |
| debug=true | 94 | | | 7.5 | 1200 | 157.3 | |
| debug=false | 79.8 | 0.44 | 2.1 | 2.3 | 103 | 155 | |
| opt-level=1 | 116.7 | 4.2 | 5.3 | 8.1 | 1500 | 12 | |
| opt-level=2 | 128 | 4.2 | 6 | 9.2 | 1600 | 10.8 | |
| opt-level=3 | 129.7 | 4.2 | 6 | 9.3 | 1700 | 11.1 | |
| REF1 = opt-level = 3 + lto = "thin" + debug = false + strip = "symbols" + panic = "abort" + codegen-units = 1 + split-debuginfo = "off" + Ctarget-cpu=native + Zthreads=8 | 123 | | | 1.6 | 23 | 10.9 | (release profile) |
| REF1 + Zshare-generics=y | 106.5 | | | 1.6 | 21 | 11.1 | |
| opt-level = 0 + lto = "off" + debug = line-tables-only + strip = false + panic = "unwind" + codegen-units = 16 + split-debuginfo = "unpacked" + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 65 | | | 3.2 | 186 | 163 | (dev profile) |
| REF2 = opt-level = 0 + debug = false + split-debuginfo = 'off' + strip = false + lto = "off" + panic = 'abort' + codegen-units = 16 + Ctarget-cpu=native + Zthreads=8 + Zshare-generics=y | 60.1 | | | 2.1 | 87 | 145.4 | (dev profile best of both worlds) |
| REF2 + mold | 56.3 | | | 2.2 | 121 | 146.8 | |
| REF2 + wild | 56.2 | | | 2.1 | 88 | 145.8 | |
| REF2 + cranelift | 61.7 | | | 2.8 | 402 | 195 | |
| REF2 + cranelift + wild | 55.18 | | | 2.8 | 387 | 196 | |

Labels

  • A-trait-system (Area: Trait system)
  • C-bug (Category: This is a bug.)
  • E-needs-mcve (Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example)
  • I-compiletime (Issue: Problems and improvements with respect to compile times.)
  • P-high (High priority)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
  • T-types (Relevant to the types team, which will review and decide on the PR/issue.)
  • regression-from-stable-to-beta (Performance or correctness regression from stable to beta.)
