0xdeafbeef
Contributor

hyperfine  --prepare 'cargo clean' 'cargo +stock build -r' 'cargo +stage2 build  -r'  --runs 3 --warmup 0
Benchmark 1: cargo +stock build -r
  Time (mean ± σ):     269.238 s ±  0.906 s    [User: 2343.174 s, System: 198.200 s]
  Range (min … max):   268.623 s … 270.278 s    3 runs

Benchmark 2: cargo +stage2 build  -r
  Time (mean ± σ):     246.692 s ±  0.094 s    [User: 2300.234 s, System: 165.301 s]
  Range (min … max):   246.617 s … 246.798 s    3 runs

Summary
  cargo +stage2 build  -r ran
    1.09 ± 0.00 times faster than cargo +stock build -r
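
The "1.09 ± 0.00" figure can be reproduced from the two means and standard deviations above. The sketch below assumes the usual first-order error propagation for a ratio of two independent measurements; `relative_speed` is an illustrative helper, not part of hyperfine.

```python
import math

def relative_speed(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, propagated std dev) for slow_mean / fast_mean."""
    ratio = slow_mean / fast_mean
    # Relative errors of independent measurements add in quadrature.
    rel_err = math.sqrt((slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2)
    return ratio, ratio * rel_err

# Means and standard deviations copied from the hyperfine output.
ratio, sd = relative_speed(269.238, 0.906, 246.692, 0.094)
print(f"{ratio:.2f} ± {sd:.2f}")  # prints "1.09 ± 0.00"
```

The uncertainty is about 0.004, which is why it rounds to 0.00 at two decimals.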

 Command being timed: "cargo +stage2 build -r"
        User time (seconds): 2255.06
        System time (seconds): 167.08
        Percent of CPU this job got: 1007%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 4:00.39
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 5841344
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 272
        Minor (reclaiming a frame) page faults: 45702476
        Voluntary context switches: 177243
        Involuntary context switches: 531296
        Swaps: 0
        File system inputs: 26184
        File system outputs: 33975408
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
        
 Command being timed: "cargo +stock build -r"
        User time (seconds): 2288.81
        System time (seconds): 196.81
        Percent of CPU this job got: 950%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 4:21.50
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 9643212
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 268
        Minor (reclaiming a frame) page faults: 58525292
        Voluntary context switches: 207169
        Involuntary context switches: 525091
        Swaps: 0
        File system inputs: 9272
        File system outputs: 33879056
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

| Metric | +stage2 | +stock | Δ (stage2 − stock) | Stage2 vs stock |
|---|---|---|---|---|
| Wall time | 4:00.39 | 4:21.50 | −21.11 s | 8.07% faster |
| Total CPU time (user+sys) | 2422.14 s | 2485.62 s | −63.48 s | −2.55% |
| CPU utilization | 1007% | 950% | +57 pp | ~6% higher (avg 10.08 vs 9.51 cores) |
| Max RSS | 5.57 GiB | 9.20 GiB | −3.63 GiB | −39.4% mem |
| Minor page faults | 45.7 M | 58.5 M | −12.8 M | −21.9% |
| Major page faults | 272 | 268 | +4 | ~same |
| Voluntary ctx switches | 177,243 | 207,169 | −29,926 | −14.4% |
| Involuntary ctx switches | 531,296 | 525,091 | +6,205 | +1.18% |
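
For reference, the derived figures for wall time, CPU time, and max RSS follow directly from the two `time -v` reports. A minimal sketch, with the raw values copied from the output above; the helper names `to_seconds` and `pct_change` are illustrative only.

```python
def to_seconds(clock):
    """Convert an 'm:ss.xx' or 'h:mm:ss' wall-clock string to seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def pct_change(new, old):
    """Relative change of `new` against the baseline `old`, in percent."""
    return (new - old) / old * 100

# Raw values from the two `time -v` reports.
stage2 = {"wall": to_seconds("4:00.39"), "user": 2255.06, "sys": 167.08, "rss_kb": 5841344}
stock = {"wall": to_seconds("4:21.50"), "user": 2288.81, "sys": 196.81, "rss_kb": 9643212}

wall_delta = stage2["wall"] - stock["wall"]
cpu_stage2 = stage2["user"] + stage2["sys"]
cpu_stock = stock["user"] + stock["sys"]

print(f"wall: {wall_delta:+.2f} s ({pct_change(stage2['wall'], stock['wall']):+.2f}%)")
print(f"cpu:  {cpu_stage2 - cpu_stock:+.2f} s ({pct_change(cpu_stage2, cpu_stock):+.2f}%)")
print(f"avg cores: {cpu_stage2 / stage2['wall']:.2f} vs {cpu_stock / stock['wall']:.2f}")
print(f"max RSS: {stage2['rss_kb'] / 1024**2:.2f} GiB vs {stock['rss_kb'] / 1024**2:.2f} GiB "
      f"({pct_change(stage2['rss_kb'], stock['rss_kb']):+.1f}%)")
```

Note that `time -v` reports max RSS in kbytes, so dividing by 1024² gives GiB.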

I compared against a compiler built from master.

Interestingly, I got millions of errors from GWP-ASan like

        discovered for pointer 0xe507ffb15c0: this pointer was recently freed with a size argument in the range [1, 8], but the associated span of allocated memory is for allocations with sizes [9, 16]

until I patched operator delete to use regular delete. All the errors came from LLVM:

     Thread 17 "lto cgu.00" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe17f56c0 (LWP 393688)]
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
    no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007fffe8881f63 in __pthread_kill_internal (threadid=<optimized out>, signo=6)
    at pthread_kill.c:89
#2  0x00007fffe8827f3e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007fffe880f6d0 in __GI_abort () at abort.c:77
#4  0x0000555555587930 in tcmalloc::tcmalloc_internal::ReportMismatchedSizeClass(tcmalloc::tcmalloc_int
#5  0x0000555555584702 in tcmalloc::tcmalloc_internal::central_freelist_internal::StaticForwarder::MapO
#6  0x0000555555558892 in tcmalloc::tcmalloc_internal::central_freelist_internal::CentralFreeList<tcmal
#7  0x00005555555b5321 in tcmalloc::tcmalloc_internal::ThreadCache::ReleaseToTransferCache(tcmalloc::tc
#8  0x00005555555b5611 in tcmalloc::tcmalloc_internal::ThreadCache::ListTooLong(tcmalloc::tcmalloc_inte
#9  0x00005555555b572f in tcmalloc::tcmalloc_internal::ThreadCache::DeallocateSlow(void*, tcmalloc::tcmalloc_internal::ThreadCache::FreeList*, unsigned long) ()
#10 0x0000555555560bc2 in tcmalloc::tcmalloc_internal::FreeWithHooksOrPerThread(void*, unsigned long)
    ()
#11 0x00005555555be97b in operator delete(void*, unsigned long) ()
#12 0x00007ffff30cfeab in std::_Rb_tree<unsigned long, std::pair<unsigned long const, llvm::GlobalValueSummaryInfo>, std::_Select1st<std::pair<unsigned long const, llvm::GlobalValueSummaryInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, llvm::GlobalValueSummaryInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, llvm::GlobalValueSummaryInfo> >*) [clone .isra.0] ()
   from /home/odm3n/dev/oss/rust/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/librustc_driver-9c7769cfa025b531.so
#13 0x00007ffff30d30bd in LLVMRustThinLTOData::~LLVMRustThinLTOData() ()
   from /home/odm3n/dev/oss/rust/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/librustc_driver-9c7769cfa025b531.so
#14 0x00007ffff30d3163 in LLVMRustFreeThinLTOData ()
   from /home/odm3n/dev/oss/rust/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/librustc_driver-9c7769cfa025b531.so
#15 0x00007ffff30bebf6 in <alloc::sync::Arc<rustc_codegen_ssa::back::lto::ThinShared<rustc_codegen_llvm::LlvmCodegenBackend>>>::drop_slow ()
   from /home/odm3n/dev/oss/rust/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/librustc_driver-9c7769cfa025b531.so
#16 0x00007ffff2f7a647 in <rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::write::WriteBackendMethods>::optimize_thin ()
   from /home/odm3n/dev/oss/rust/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/librustc_driver-9c7769cfa025b531.so
#17 0x00007ffff307d20e in std::sys::backtrace::__rust_begin_short_backtrace::<<rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::backend::ExtraBackendMethods>::spawn_named_thread<rustc_codegen_ssa::back::write::spawn_work<rustc_codegen_llvm::LlvmCodegenBackend>::{closure#0}, ()>::{closure#0}, ()> ()
   from /home/odm3n/dev/oss/rust/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/librustc_driver-9c7769cfa025b531.so
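
The GWP-ASan message above describes a sized-free mismatch: the size passed to `operator delete(void*, unsigned long)` maps to one size class, while the memory came from a span serving another. A toy Python model of that kind of check follows. Everything in it (`CLASS_LIMITS`, `ToyAllocator`, `sized_free`) is hypothetical and only mirrors the two ranges in the message; it is not tcmalloc's actual implementation.

```python
import bisect

# Hypothetical size classes, given as inclusive upper bounds in bytes.
# Class 0 serves sizes [1, 8], class 1 serves [9, 16], and so on.
CLASS_LIMITS = [8, 16, 32, 64]

def size_class(size):
    """Index of the smallest class whose upper bound is >= size."""
    if size < 1 or size > CLASS_LIMITS[-1]:
        raise ValueError(f"size {size} is outside the toy size classes")
    return bisect.bisect_left(CLASS_LIMITS, size)

class ToyAllocator:
    def __init__(self):
        self._live = {}      # fake pointer -> size class it was allocated from
        self._next = 0x1000  # next fake address to hand out

    def allocate(self, size):
        cls = size_class(size)
        ptr = self._next
        self._next += CLASS_LIMITS[cls]
        self._live[ptr] = cls
        return ptr

    def sized_free(self, ptr, size):
        """Free `ptr`, checking the caller-supplied size against the allocation."""
        actual = self._live.pop(ptr)
        claimed = size_class(size)
        if claimed != actual:
            raise RuntimeError(
                f"pointer {ptr:#x} freed with a size in class {claimed}, "
                f"but it was allocated from class {actual}"
            )

allocator = ToyAllocator()
ptr = allocator.allocate(12)      # lands in the [9, 16] class
try:
    allocator.sized_free(ptr, 8)  # caller claims a size in the [1, 8] class
except RuntimeError as err:
    print(err)
```

An unsized free never passes a size, so it cannot trip this check, which is consistent with the errors disappearing once regular delete was used.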

If this is of interest, I can clean it all up, or maybe we should run the rustc-perf test suite first.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Oct 4, 2025
@0xdeafbeef 0xdeafbeef marked this pull request as ready for review October 4, 2025 15:03
@mati865
Member

mati865 commented Oct 4, 2025

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 4, 2025
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 4, 2025
@rust-bors

rust-bors bot commented Oct 4, 2025

💔 Test for 75748e2 failed: CI. Failed jobs:

@mati865
Member

mati865 commented Oct 4, 2025

error: the package 'rustc-main' does not contain this feature: jemalloc

I guess you could leave the feature as no-op just to get perf results.

@rustbot
Collaborator

rustbot commented Oct 5, 2025

The Miri subtree was changed

cc @rust-lang/miri

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

@0xdeafbeef
Contributor Author

> error: the package 'rustc-main' does not contain this feature: jemalloc
>
> I guess you could leave the feature as no-op just to get perf results.

fixed

@mati865
Member

mati865 commented Oct 5, 2025

@bors try

rust-bors bot added a commit that referenced this pull request Oct 5, 2025
@rust-bors

rust-bors bot commented Oct 5, 2025

💔 Test for b64a13d failed: CI. Failed jobs:

@0xdeafbeef
Contributor Author

@mati865 Maybe fixed. Somehow the errno constants used by tcmalloc are missing when building with LTO.

@0xdeafbeef
Contributor Author

@bors try

@rust-bors

rust-bors bot commented Oct 5, 2025

@0xdeafbeef: 🔑 Insufficient privileges: not in try users

@mati865
Member

mati865 commented Oct 5, 2025

Perhaps the host glibc is too old?
@bors try

rust-bors bot added a commit that referenced this pull request Oct 5, 2025
@rust-bors

rust-bors bot commented Oct 5, 2025

💔 Test for e3545de failed: CI. Failed jobs:

@0xdeafbeef
Contributor Author

@mati865

builds locally with

cargo run --manifest-path src/ci/citool/Cargo.toml run-local dist-x86_64-linux

I hope it will also run on CI. This job uses CentOS 7, which ships the needed headers, but somehow they are not found :)

https://github.com/0xdeafbeef/tcmalloc-better/blob/781a440a091933e35ecbfd082fc0393bea6481ad/libtcmalloc-sys/c_src/compat/glibc_errno_compat.h

@rust-log-analyzer
Collaborator

The job tidy failed!

Possible cause of the failure (guessed by this bot):
extracting /checkout/obj/build/cache/llvm-d2acb427e424fd7d52377698046a06109e9777b4-true/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz to /checkout/obj/build/x86_64-unknown-linux-gnu/ci-llvm
[TIMING:end] compile::Sysroot { compiler: Compiler { stage: 0, host: x86_64-unknown-linux-gnu, forced_compiler: false }, force_recompile: false } -- 13.632
[TIMING:end] builder::Libdir { compiler: Compiler { stage: 0, host: x86_64-unknown-linux-gnu, forced_compiler: false }, target: x86_64-unknown-linux-gnu } -- 0.002
##[group]Building stage1 tidy (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Updating git repository `https://github.com/0xdeafbeef/tcmalloc-better.git`
    Updating git submodule `https://github.com/abseil/abseil-cpp.git`
    Updating git submodule `https://github.com/google/tcmalloc.git`
    Updating crates.io index
---
[TIMING:end] tool::Tidy { compiler: Compiler { stage: 0, host: x86_64-unknown-linux-gnu, forced_compiler: false }, target: x86_64-unknown-linux-gnu } -- 0.000
fmt check
fmt: checked 6455 files
tidy check
tidy [extdeps]: invalid source: "git+https://github.com/0xdeafbeef/tcmalloc-better.git?rev=781a440a0919#781a440a091933e35ecbfd082fc0393bea6481ad"
tidy [extdeps]: FAIL
tidy [deps]: proc-macro crate dependency `litrs` is not registered in `src/bootstrap/src/utils/proc_macro_deps.rs`
tidy [deps]: Run `./x.py test tidy --bless` to regenerate the list
tidy [deps]: could not find allowed package `tikv-jemalloc-sys`
Remove from PERMITTED_DEPENDENCIES list if it is no longer used.
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: git+https://github.com/0xdeafbeef/tcmalloc-better.git?rev=781a440a0919#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
tidy [deps]: Dependency for rustc-main not explicitly permitted: registry+https://github.com/rust-lang/crates.io-index#[email protected]
Go to `src/tools/tidy/src/deps.rs:326` for the list.
tidy [deps]: FAIL
tidy [rustdoc_json (src)]: `rustdoc-json-types` modified, checking format version
tidy: Skipping binary file check, read-only filesystem
removing old virtual environment
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10' and 'venv'
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10' and 'virtualenv'
---
info: ES-Check: there were no ES version matching errors!  🎉
typechecking javascript files
tidy: The following checks failed: deps, extdeps
Bootstrap failed while executing `test src/tools/tidy tidyselftest --extra-checks=py,cpp,js,spellcheck`
Command `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools-bin/rust-tidy /checkout /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo /checkout/obj/build 4 /node/bin/npm --extra-checks=py,cpp,js,spellcheck` failed with exit code 1
Created at: src/bootstrap/src/core/build_steps/tool.rs:1549:23
Executed at: src/bootstrap/src/core/build_steps/test.rs:1280:29

Command has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 0:03:21
  local time: Mon Oct  6 14:11:51 UTC 2025
  network time: Mon, 06 Oct 2025 14:11:51 GMT
##[error]Process completed with exit code 1.
Post job cleanup.
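
The `extdeps` failure in the log above comes from the git dependency: every lockfile entry is expected to come from an approved source. A rough Python sketch of that kind of check is shown below. It assumes the crates.io registry is the only allowed source, and the package names, versions, and `invalid_sources` helper are made up for illustration; the real tidy check is written in Rust and may differ in detail.

```python
# Assumed allowlist; the actual list lives in the tidy tool.
ALLOWED_SOURCES = {"registry+https://github.com/rust-lang/crates.io-index"}

def invalid_sources(lockfile_text):
    """Return every `source = "..."` value that is not in the allowlist."""
    bad = []
    for line in lockfile_text.splitlines():
        line = line.strip()
        if line.startswith("source = "):
            source = line[len("source = "):].strip('"')
            if source not in ALLOWED_SOURCES:
                bad.append(source)
    return bad

# Hypothetical Cargo.lock fragment; only the git source string is taken from the log.
SAMPLE_LOCK = '''
[[package]]
name = "some-registry-crate"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"

[[package]]
name = "some-git-crate"
version = "0.0.0"
source = "git+https://github.com/0xdeafbeef/tcmalloc-better.git?rev=781a440a0919#781a440a091933e35ecbfd082fc0393bea6481ad"
'''

for source in invalid_sources(SAMPLE_LOCK):
    print(f'invalid source: "{source}"')
```

Under this model, publishing the crate to crates.io (or extending the allowlist) would be the two ways to make the check pass.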

@mati865
Member

mati865 commented Oct 6, 2025

@bors try

rust-bors bot added a commit that referenced this pull request Oct 6, 2025
@rust-bors

rust-bors bot commented Oct 6, 2025

☀️ Try build successful (CI)
Build commit: ab402aa (ab402aa031ae3faa1662c0cf0fb13e9ee74b4c2e, parent: d2acb427e424fd7d52377698046a06109e9777b4)

@rust-timer
Collaborator

Finished benchmarking commit (ab402aa): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.5% | [0.9%, 18.2%] | 268 |
| Regressions ❌ (secondary) | 5.7% | [0.1%, 14.2%] | 311 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 5.5% | [0.9%, 18.2%] | 268 |

Max RSS (memory usage)

Results (primary 13.9%, secondary 15.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 14.1% | [4.4%, 28.7%] | 256 |
| Regressions ❌ (secondary) | 15.8% | [2.9%, 40.5%] | 311 |
| Improvements ✅ (primary) | -9.6% | [-10.6%, -8.5%] | 2 |
| Improvements ✅ (secondary) | -4.4% | [-5.9%, -2.9%] | 2 |
| All ❌✅ (primary) | 13.9% | [-10.6%, 28.7%] | 258 |

Cycles

Results (primary 0.2%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.9% | [1.6%, 18.3%] | 40 |
| Regressions ❌ (secondary) | 5.3% | [1.8%, 17.5%] | 91 |
| Improvements ✅ (primary) | -3.4% | [-5.1%, -2.0%] | 52 |
| Improvements ✅ (secondary) | -6.9% | [-19.4%, -2.1%] | 68 |
| All ❌✅ (primary) | 0.2% | [-5.1%, 18.3%] | 92 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 470.924s -> 470.995s (0.02%)
Artifact size: 388.38 MiB -> 388.14 MiB (-0.06%)
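
One way to sanity-check the "All (primary)" rows above is to treat them as count-weighted means of the primary regressions and improvements. That is an assumption about how the summary is computed, and the inputs are already rounded, so the result is only approximate; `combined_mean` is an illustrative helper.

```python
def combined_mean(groups):
    """Count-weighted mean of (mean_percent, count) groups."""
    total = sum(count for _, count in groups)
    return sum(mean * count for mean, count in groups) / total

# (mean %, count) pairs for primary regressions and improvements.
max_rss_primary = combined_mean([(14.1, 256), (-9.6, 2)])
cycles_primary = combined_mean([(4.9, 40), (-3.4, 52)])

print(f"Max RSS: {max_rss_primary:.1f}%  Cycles: {cycles_primary:.1f}%")
# prints "Max RSS: 13.9%  Cycles: 0.2%"
```

This also shows why the cycles summary is close to zero: 52 improvements averaging -3.4% nearly cancel 40 regressions averaging 4.9%.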

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 6, 2025
@mati865
Member

mati865 commented Oct 6, 2025

Impressive wall time results, but accompanied by higher max RSS.

@Kobzol
Member

Kobzol commented Oct 6, 2025

Pretty similar to what we saw with mimalloc, I think.
