Merged
Changes from 1 commit
3 changes: 2 additions & 1 deletion OmaxLTO.cfg
@@ -1,3 +1,4 @@
 -flto=full \
 -fvirtual-function-elimination \
--fwhole-program-vtables
+-fwhole-program-vtables \
+-mllvm -extra-LTO-loop-unroll=true
1 change: 1 addition & 0 deletions README.md
@@ -172,6 +172,7 @@ and/or increased memory usage during linking. Some of the options in the config
corresponding optimisation passes in the [LLVM project](https://github.com/llvm/llvm-project)
to find out more. Users are also encouraged to create their own configs and tune their own
flag parameters.
Information on optimization flags specific to the LLVM Embedded Toolchain for Arm is available in [Optimization Flags](https://github.com/ARM-software/LLVM-embedded-toolchain-for-Arm/blob/main/docs/optimization-flags.md).

Binary releases of the LLVM Embedded Toolchain for Arm are based on release
branches of the upstream LLVM Project, thus can safely be used with all tools
9 changes: 9 additions & 0 deletions docs/optimization-flags.md
@@ -0,0 +1,9 @@
Additional optimization flags
=============================

## Additional loop unroll in the LTO pipeline
In some cases it is beneficial to perform an additional loop unroll pass so that extra information becomes available to later passes, e.g. SROA.
Use cases where this could be beneficial include multiple (N>=4) nested loops.

### Usage:
-mllvm -extra-LTO-loop-unroll=true/false
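
As an illustration of the kind of code the section above describes, here is a minimal sketch of a four-deep loop nest. The function `weighted_sum` and its input layout are hypothetical and do not come from the toolchain sources. Whether the extra pass changes the generated code for this particular function depends on the rest of the pipeline, so treat it as an example of the pattern rather than a benchmark.

```cpp
#include <cstdint>

// Hypothetical kernel (an assumption for illustration only): four nested
// loops with small constant trip counts that accumulate into a local array.
// While the loops are intact, acc[] is indexed by a value computed at run
// time, so it has to stay addressable in memory. Once every loop is fully
// unrolled, each index becomes a constant, which is the situation in which
// a pass such as SROA can split acc[] into separate scalars.
std::int32_t weighted_sum(const std::int32_t (&in)[16]) {
  std::int32_t acc[4] = {0, 0, 0, 0};
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b)
      for (int c = 0; c < 2; ++c)
        for (int d = 0; d < 2; ++d) {
          const int idx = (a << 3) | (b << 2) | (c << 1) | d; // 0..15
          acc[idx & 3] += in[idx];
        }
  return acc[0] + 2 * acc[1] + 3 * acc[2] + 4 * acc[3];
}
```

The loop bounds are compile-time constants on purpose: full unrolling needs a known trip count, and only then do the array indices fold to constants.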
55 changes: 55 additions & 0 deletions patches/llvm-project-perf/0000-LTOpasses-add-loop-unroll.patch
@@ -0,0 +1,55 @@
From 4adfc5231d2c0182d6278b4aa75eec57648e5dd4 Mon Sep 17 00:00:00 2001
From: Vladi Krapp <[email protected]>
Date: Tue, 3 Sep 2024 14:12:48 +0100
Subject: [Pipelines] Additional unrolling in LTO

Some workloads require specific sequences of events to happen
to fully simplify. This adds an extra full unrolling pass to help these
cases on the cores with branch predictors. It helps produce simplified
loops, which can then be SROA'd allowing further simplification, which
can be important for performance.
This is added under its own flag, so the extra compile time is spent
only when the user explicitly requests the extra performance.

Original patch by David Green ([email protected])
---
llvm/lib/Passes/PassBuilderPipelines.cpp | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 1184123c7710..6dc45d85927a 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -332,6 +332,10 @@ namespace llvm {
 extern cl::opt<unsigned> MaxDevirtIterations;
 } // namespace llvm

+static cl::opt<bool> LTOExtraLoopUnroll(
+    "extra-LTO-loop-unroll", cl::init(false), cl::Hidden,
+    cl::desc("Perform extra loop unrolling pass to assist SROA"));
+
 void PassBuilder::invokePeepholeEPCallbacks(FunctionPassManager &FPM,
                                             OptimizationLevel Level) {
   for (auto &C : PeepholeEPCallbacks)
@@ -1940,6 +1944,18 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level,
   MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(ArgumentPromotionPass()));

   FunctionPassManager FPM;
+
+  if (LTOExtraLoopUnroll) {
+    LoopPassManager OmaxLPM;
+    OmaxLPM.addPass(LoopFullUnrollPass(Level.getSpeedupLevel(),
+                                       /* OnlyWhenForced= */ !PTO.LoopUnrolling,
+                                       PTO.ForgetAllSCEVInLoopUnroll));
+    FPM.addPass(
+        createFunctionToLoopPassAdaptor(std::move(OmaxLPM),
+                                        /*UseMemorySSA=*/false,
+                                        /*UseBlockFrequencyInfo=*/true));
+  }
+
   // The IPO Passes may leave cruft around. Clean up after them.
   FPM.addPass(InstCombinePass());
   invokePeepholeEPCallbacks(FPM, Level);
--
2.34.1
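
The commit message in this patch says that full unrolling produces simplified loops "which can then be SROA'd". The sketch below illustrates that chain by hand on a tiny example. Both functions are hypothetical, and `dot_scalarised` is only a hand-written approximation of what the optimizer might produce, not actual compiler output.

```cpp
#include <cstdint>

// Before: tmp[] is indexed by the induction variable, so the compiler has
// to treat it as an addressable array while the loops are still present.
std::int32_t dot_loop(const std::int32_t (&x)[4], const std::int32_t (&y)[4]) {
  std::int32_t tmp[4];
  for (int i = 0; i < 4; ++i)
    tmp[i] = x[i] * y[i];
  std::int32_t sum = 0;
  for (int i = 0; i < 4; ++i)
    sum += tmp[i];
  return sum;
}

// After (hand-written approximation): with both loops fully unrolled, every
// access to tmp[] has a constant index, so the array can be replaced by four
// independent scalars. Later passes can then simplify those scalars freely.
std::int32_t dot_scalarised(const std::int32_t (&x)[4],
                            const std::int32_t (&y)[4]) {
  const std::int32_t t0 = x[0] * y[0];
  const std::int32_t t1 = x[1] * y[1];
  const std::int32_t t2 = x[2] * y[2];
  const std::int32_t t3 = x[3] * y[3];
  return t0 + t1 + t2 + t3;
}
```

The two functions compute the same value. The difference is only in how much the optimizer can see: in the second form no array or loop remains, so nothing blocks further simplification.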

22 changes: 0 additions & 22 deletions patches/llvm-project-perf/0000-Placeholder-commit.patch

This file was deleted.
