TL;DR
We applied BOLT with an instrumentation profile to a bootstrapped AArch64 Clang build and observed a consistent performance regression (~260 s → ~280 s, ≈7.7%) despite very positive dyno stats (e.g., taken branches −66.4%). Root cause: the instrumentation profile covered only 1.7% of the functions in the binary, so BOLT produced strong local improvements but made globally harmful layout decisions.
Environment
- OS: Ubuntu 22.04
- Arch: AArch64
- LLVM/BOLT: 22
- Binary: bootstrapped Release Clang (single monolithic clang binary used to bootstrap)
- Workload: full `ninja clang` build (the instrumented compiler was used to drive profile collection)
Repro summary
- Build baseline clang (Release).
- Instrument the baseline with `llvm-bolt -instrument` → produce an instrumented clang.
- Use the instrumented clang to run a full `ninja clang` build that generates `.fdata`.
- Run `llvm-bolt` on the baseline clang with the generated `.fdata`.
- Measure end-to-end `ninja clang` runtime (or the same benchmark used for the baseline); the full command sequence is sketched below.
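For concreteness, the sequence looks roughly like this. Paths and directory names are illustrative, and the optimization flags in step 4 are the standard set from the BOLT README; they may not match our exact invocation.

```bash
# 1. Build baseline clang (Release).
ninja -C build clang

# 2. Instrument the baseline clang with BOLT.
llvm-bolt build/bin/clang -instrument -o build/bin/clang.inst \
    --instrumentation-file=/tmp/clang.fdata

# 3. Run a full `ninja clang` build with clang.inst as the host compiler
#    (e.g. by pointing CMAKE_C_COMPILER/CMAKE_CXX_COMPILER at it) so that
#    the instrumented compiler writes the .fdata profile.

# 4. Optimize the baseline clang with the collected profile.
llvm-bolt build/bin/clang -o build/bin/clang.bolt -data=/tmp/clang.fdata \
    -reorder-blocks=ext-tsp -reorder-functions=hfsort \
    -split-functions -split-all-cold -split-eh -dyno-stats

# 5. Measure end-to-end `ninja clang` time with clang.bolt vs the baseline clang.
```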
Observed result
- Baseline Clang: ~260 s
- BOLT-optimized clang.bolt: ~280 s (regression ≈ 7.7%)
BOLT reported excellent dyno stats (e.g., taken branches −66.4%), but the profile coverage was tiny:
```
BOLT-INFO: 2376 out of 142805 functions in the binary (1.7%) have non-empty execution profile
```
Questions
- Has anyone seen similar regressions caused by sparse instrumentation profiles (especially on AArch64)?
- Would it be useful for BOLT to warn when function coverage is below a threshold (e.g., <5%) before large global reorders?
- What are the recommended best practices for generating robust instrumentation profiles from multi-process builds (ninja)? Any scripts/recipes you can share? (One possible recipe is sketched below.)
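To make the last question concrete, here is a minimal sketch of the kind of recipe we have in mind, assuming BOLT's `--instrumentation-file-append-pid` option and the `merge-fdata` tool; all paths are illustrative.

```bash
# Write one .fdata per compiler process so parallel ninja jobs don't race
# on a single profile file.
mkdir -p /tmp/bolt-prof
llvm-bolt bin/clang -instrument -o bin/clang.inst \
    --instrumentation-file=/tmp/bolt-prof/clang.fdata \
    --instrumentation-file-append-pid

# ... run the full `ninja clang` build with bin/clang.inst as the compiler ...

# Merge the per-PID profiles into a single file for llvm-bolt -data=.
merge-fdata /tmp/bolt-prof/clang.fdata.* > /tmp/bolt-prof/merged.fdata

# Sanity-check coverage before trusting the dyno stats.
llvm-bolt bin/clang -o bin/clang.bolt -data=/tmp/bolt-prof/merged.fdata -dyno-stats 2>&1 \
    | grep 'have non-empty execution profile'
```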