Merged
2 changes: 1 addition & 1 deletion bolt/README.md
@@ -164,7 +164,7 @@ $ perf2bolt -p perf.data -o perf.fdata <executable>
This command will aggregate branch data from `perf.data` and store it in a
format that is both more compact and more resilient to binary modifications.

-If the profile was collected without brstacks, you will need to add `-nl` flag to
+If the profile was collected without brstacks, you will need to add `-ba` flag to
the command line above.

### Step 3: Optimize with BOLT
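As a rough sketch of the README step above, the helper below assembles a `perf2bolt` command line and appends `-ba` only when the profile was collected without brstacks. The function name `build_perf2bolt_cmd` and its parameters are hypothetical and not part of BOLT; only the `-p`, `-o`, and `-ba` flags come from the text in this change.

```python
from typing import List


def build_perf2bolt_cmd(executable: str, perf_data: str, output: str,
                        has_brstack: bool) -> List[str]:
    """Return a perf2bolt argv list (hypothetical helper, not part of BOLT)."""
    cmd = ["perf2bolt", "-p", perf_data, "-o", output]
    if not has_brstack:
        # The profile holds basic events only, so request basic aggregation.
        cmd.append("-ba")
    cmd.append(executable)
    return cmd


if __name__ == "__main__":
    # File names here are placeholders for illustration.
    print(" ".join(build_perf2bolt_cmd("a.out", "perf.data", "perf.fdata", False)))
```

The list form can be handed to `subprocess.check_call` directly, which avoids shell quoting issues.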
2 changes: 1 addition & 1 deletion bolt/docs/Heatmaps.md
@@ -21,7 +21,7 @@ $ perf record -e cycles:u -j any,u [-p PID|-a] -- sleep <interval>
```

Running with brstack (`-j any,u` or `-b`) is recommended. Heatmaps can be generated
-from basic events by using the llvm-bolt-heatmap option `-nl` (no brstack) but
+from basic events by using the llvm-bolt-heatmap option `-ba` (basic events) but
such heatmaps do not have the coverage provided by brstack and may only be useful
for finding event hotspots at larger code block granularities.

4 changes: 2 additions & 2 deletions bolt/docs/index.rst
@@ -205,8 +205,8 @@ This command will aggregate branch data from ``perf.data`` and store it
in a format that is both more compact and more resilient to binary
modifications.

-If the profile was collected without LBRs, you will need to add ``-nl``
-flag to the command line above.
+If the profile was collected without brstacks, you will need to add `-ba` flag to
+the command line above.

Step 3: Optimize with BOLT
~~~~~~~~~~~~~~~~~~~~~~~~~~
23 changes: 18 additions & 5 deletions bolt/lib/Profile/DataAggregator.cpp
@@ -45,10 +45,23 @@ using namespace bolt;
namespace opts {

static cl::opt<bool>
-BasicAggregation("nl",
-cl::desc("aggregate basic samples (without brstack info)"),
+BasicAggregation("basic-events",
+cl::desc("aggregate basic events (without brstack info)"),
cl::cat(AggregatorCategory));

+static cl::alias BasicAggregationAlias("ba",
+cl::desc("Alias for --basic-events"),
+cl::aliasopt(BasicAggregation));
+
+static cl::opt<bool> DeprecatedBasicAggregationNl(
+"nl", cl::desc("Alias for --basic-events (deprecated. Use --ba)"),
+cl::cat(AggregatorCategory), cl::ReallyHidden,
+cl::callback([](const bool &Enabled) {
+errs()
+<< "BOLT-WARNING: '-nl' is deprecated, please use '--ba' instead.\n";
+BasicAggregation = Enabled;
+}));
+
cl::opt<bool> ArmSPE("spe", cl::desc("Enable Arm SPE mode."),
cl::cat(AggregatorCategory));

@@ -1433,7 +1446,7 @@ std::error_code DataAggregator::printLBRHeatMap() {
"Cannot build heatmap.";
} else {
errs() << "HEATMAP-ERROR: no brstack traces detected in profile. "
-"Cannot build heatmap. Use -nl for building heatmap from "
+"Cannot build heatmap. Use -ba for building heatmap from "
"basic events.\n";
}
exit(1);
@@ -1629,8 +1642,8 @@ std::error_code DataAggregator::parseBranchEvents() {
<< "PERF2BOLT-WARNING: all recorded samples for this binary lack "
"brstack. Record profile with perf record -j any or run "
"perf2bolt "
-"in non-brstack mode with -nl (the performance improvement in "
-"-nl "
+"in non-brstack mode with -ba (the performance improvement in "
+"-ba "
"mode may be limited)\n";
else
errs()
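The deprecation pattern in `DataAggregator.cpp` (a canonical option, a short alias, and a hidden legacy spelling that warns and forwards to the canonical option) can be illustrated outside LLVM. The sketch below is only an analogue using Python's `argparse`, not LLVM's `cl` API; the names `DeprecatedAlias`, `make_parser`, and the `demo` program name are hypothetical.

```python
import argparse
import sys


class DeprecatedAlias(argparse.Action):
    """Warn on use, then set the canonical option's destination."""

    def __init__(self, option_strings, dest, replacement, **kwargs):
        self.replacement = replacement
        # nargs=0 makes this a flag that takes no value;
        # SUPPRESS hides it from --help, similar in spirit to a hidden option.
        super().__init__(option_strings, dest, nargs=0,
                         help=argparse.SUPPRESS, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        print(f"WARNING: '{option_string}' is deprecated, "
              f"please use '{self.replacement}' instead.", file=sys.stderr)
        # Forward to the canonical destination, as the C++ callback does.
        setattr(namespace, self.dest, True)


def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    # Canonical option plus its short alias; registered first so its
    # default (False) is the one stored in the namespace.
    parser.add_argument("--basic-events", "-ba", dest="basic_events",
                        action="store_true",
                        help="aggregate basic events (without brstack info)")
    # Legacy spelling: still accepted, but emits a warning.
    parser.add_argument("-nl", dest="basic_events", action=DeprecatedAlias,
                        replacement="-ba")
    return parser


if __name__ == "__main__":
    print(make_parser().parse_args(["-nl"]).basic_events)
```

Keeping the legacy flag functional while warning on stderr lets existing scripts keep working during the transition, which appears to be the intent of the hidden `nl` option in the change above.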
2 changes: 1 addition & 1 deletion bolt/test/X86/nolbr.s
@@ -10,7 +10,7 @@
# RUN: FileCheck %s --input-file %t.fdata --check-prefix=CHECK-FDATA
# RUN: llvm-strip --strip-unneeded %t.o
# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q -nostdlib
-# RUN: llvm-bolt %t.exe -o %t.out --data %t.fdata --dyno-stats -nl \
+# RUN: llvm-bolt %t.exe -o %t.out --data %t.fdata --dyno-stats -ba \
# RUN: --print-only=_start 2>&1 | FileCheck %s --check-prefix=CHECK-BOLT

# CHECK-FDATA: no_lbr
6 changes: 3 additions & 3 deletions bolt/test/X86/pre-aggregated-perf.test
@@ -61,11 +61,11 @@ RUN: FileCheck %s -check-prefix=NEWFORMAT --input-file %t.bolt.yaml
RUN: perf2bolt %t.exe -o %t --pa -p %p/Inputs/pre-aggregated-basic.txt -o %t.ba \
RUN: 2>&1 | FileCheck %s --check-prefix=BASIC-ERROR
RUN: perf2bolt %t.exe -o %t --pa -p %p/Inputs/pre-aggregated-basic.txt -o %t.ba.nl \
-RUN: -nl 2>&1 | FileCheck %s --check-prefix=BASIC-SUCCESS
-RUN: FileCheck %s --input-file %t.ba.nl --check-prefix CHECK-BASIC-NL
+RUN: -ba 2>&1 | FileCheck %s --check-prefix=BASIC-SUCCESS
+RUN: FileCheck %s --input-file %t.ba.nl --check-prefix CHECK-BASIC-BA
BASIC-ERROR: BOLT-INFO: 0 out of 7 functions in the binary (0.0%) have non-empty execution profile
BASIC-SUCCESS: BOLT-INFO: 4 out of 7 functions in the binary (57.1%) have non-empty execution profile
-CHECK-BASIC-NL: no_lbr cycles
+CHECK-BASIC-BA: no_lbr cycles

PERF2BOLT: 1 frame_dummy/1 1e 1 frame_dummy/1 0 0 1
PERF2BOLT-NEXT: 1 main 451 1 SolveCubic 0 0 2
4 changes: 2 additions & 2 deletions bolt/test/perf2bolt/perf_test.test
@@ -4,7 +4,7 @@ REQUIRES: system-linux, perf

RUN: %clang %S/Inputs/perf_test.c -fuse-ld=lld -pie -Wl,--script=%S/Inputs/perf_test.lds -o %t
RUN: perf record -Fmax -e cycles:u -o %t2 -- %t
-RUN: perf2bolt %t -p=%t2 -o %t3 -nl -ignore-build-id --show-density \
+RUN: perf2bolt %t -p=%t2 -o %t3 -ba -ignore-build-id --show-density \
RUN: --heatmap %t.hm 2>&1 | FileCheck %s
RUN: FileCheck %s --input-file %t.hm-section-hotness.csv --check-prefix=CHECK-HM

@@ -15,7 +15,7 @@ CHECK: BOLT-INFO: Functions with density >= {{.*}} account for 99.00% total samp

RUN: %clang %S/Inputs/perf_test.c -no-pie -fuse-ld=lld -o %t4
RUN: perf record -Fmax -e cycles:u -o %t5 -- %t4
-RUN: perf2bolt %t4 -p=%t5 -o %t6 -nl -ignore-build-id --show-density \
+RUN: perf2bolt %t4 -p=%t5 -o %t6 -ba -ignore-build-id --show-density \
RUN: --heatmap %t.hm2 2>&1 | FileCheck %s
RUN: FileCheck %s --input-file %t.hm2-section-hotness.csv --check-prefix=CHECK-HM

4 changes: 2 additions & 2 deletions clang/utils/perf-training/perf-helper.py
@@ -133,7 +133,7 @@ def perf2bolt(args):
"--profile-format=yaml",
]
if not opts.lbr:
-p2b_args += ["-nl"]
+p2b_args += ["-ba"]
p2b_args += ["-p"]
for filename in findFilesWithExtension(opts.path, "perf.data"):
subprocess.check_call(p2b_args + [filename, "-o", filename + ".fdata"])
@@ -722,7 +722,7 @@ def bolt_optimize(args):
"-dyno-stats",
"-use-gnu-stack",
"-update-debug-sections",
-"-nl" if opts.method == "PERF" else "",
+"-ba" if opts.method == "PERF" else "",
]
print("Running: " + " ".join(args))
process = subprocess.run(
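The conditional list entry in `bolt_optimize` evaluates to an empty string when the method is not `PERF`, so the argument list then contains `""`. Whether that matters depends on code not shown in this hunk, but one possible way to keep the list free of placeholders is to concatenate an optional list instead. The helper names `optional_flags` and `build_args` below are hypothetical and not part of `perf-helper.py`.

```python
from typing import List


def optional_flags(method: str) -> List[str]:
    """Return extra llvm-bolt flags for the profiling method (hypothetical helper)."""
    # Basic-event aggregation is only requested for PERF-collected profiles.
    return ["-ba"] if method == "PERF" else []


def build_args(base: List[str], method: str) -> List[str]:
    """Append method-specific flags to a base argv list."""
    # List concatenation means no empty-string placeholder ends up in argv.
    return base + optional_flags(method)


if __name__ == "__main__":
    print(build_args(["llvm-bolt", "-dyno-stats"], "PERF"))
```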