[AMDGPU] Insert inliner anchor earlier #169478

rovka · 2025-11-25T09:58:37Z

Add a new hook for inserting passes right after the last DummyCGSCC pass
and use it to insert the anchor. This turns the last FunctionPass manager
into an inlining pass manager, which preserves some of the analyses that
are computed before the inliner and used after it. To be fair, that is
never going to be many analyses, since inlining is pretty plastic, but at
least some of the IR-level analyses that have absolutely no reason to
change can be computed only once.

This is how I originally designed the code, but I don't feel like I have
a good name/abstraction for this exact point in the pipeline, hence the
separate patch.
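
To make the ordering concrete, here is a minimal standalone sketch of the hook pattern, not the actual LLVM code. `ToyPassConfig`, `ToyGCNPassConfig`, `buildISelPrepare` and the string pass names are hypothetical stand-ins invented for illustration; only `addInitialCGSCCCodeGenPasses`, `requiresCodeGenSCCOrder` and the "DummyCGSCCPass" / "AMDGPU Inlining Anchor" names come from the patch. It assumes nothing about the real pass manager beyond "the hook runs right after the DummyCGSCC pass, and only when CGSCC order is required".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for TargetPassConfig; it only records pass names.
class ToyPassConfig {
public:
  virtual ~ToyPassConfig() = default;

  // Mirrors the shape of addISelPrepare() in the patch: the hook is called
  // immediately after the DummyCGSCCPass, and only when SCC order is required.
  std::vector<std::string> buildISelPrepare() {
    Passes.clear();
    addPass("PreISel passes");
    if (requiresCodeGenSCCOrder()) {
      addPass("DummyCGSCCPass");
      addInitialCGSCCCodeGenPasses();
    }
    addPass("Remaining codegen passes");
    return Passes;
  }

protected:
  virtual bool requiresCodeGenSCCOrder() const { return false; }
  // Default implementation adds nothing, so other targets are unaffected.
  virtual void addInitialCGSCCCodeGenPasses() {}
  void addPass(const std::string &Name) { Passes.push_back(Name); }

private:
  std::vector<std::string> Passes;
};

// Hypothetical stand-in for GCNPassConfig.
class ToyGCNPassConfig : public ToyPassConfig {
public:
  explicit ToyGCNPassConfig(bool EnableInliner)
      : EnableInliner(EnableInliner) {}

protected:
  bool requiresCodeGenSCCOrder() const override { return true; }
  // The anchor is only inserted when the machine-level inliner is enabled.
  void addInitialCGSCCCodeGenPasses() override {
    if (EnableInliner)
      addPass("AMDGPU Inlining Anchor");
  }

private:
  bool EnableInliner;
};
```

With the inliner enabled, `ToyGCNPassConfig(true).buildISelPrepare()` places the anchor directly after "DummyCGSCCPass"; with it disabled, or for a config that does not require CGSCC order, the hook contributes nothing.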

rovka · 2025-11-25T09:58:50Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking.

llvmbot · 2025-11-25T10:13:07Z

@llvm/pr-subscribers-backend-amdgpu

Author: Diana Picus (rovka)

Changes

Add a new hook for inserting passes right after the last DummyCGSCC pass
and use it to insert the anchor. This changes the last FunctionPass
manager to be an inlining pass manager, thus preserving some of the
analyses that might be computed before the inliner and used after it (to
be fair that's never going to be a lot of analyses, since inlining is
pretty plastic, but at least some of the IR-level analyses that have
absolutely no reason to change can be computed only once).

This is how I originally designed the code, but I don't feel like I have
a good name/abstraction for this exact point in the pipeline, hence the
separate patch.

Full diff: https://github.com/llvm/llvm-project/pull/169478.diff

4 Files Affected:

(modified) llvm/include/llvm/CodeGen/TargetPassConfig.h (+9)
(modified) llvm/lib/CodeGen/TargetPassConfig.cpp (+3-1)
(modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+6)
(modified) llvm/test/CodeGen/AMDGPU/llc-pipeline.ll (+2-5)

diff --git a/llvm/include/llvm/CodeGen/TargetPassConfig.h b/llvm/include/llvm/CodeGen/TargetPassConfig.h
index 5e0e641a981f9..e8c85246b0049 100644
--- a/llvm/include/llvm/CodeGen/TargetPassConfig.h
+++ b/llvm/include/llvm/CodeGen/TargetPassConfig.h
@@ -364,6 +364,15 @@ class LLVM_ABI TargetPassConfig : public ImmutablePass {
     return true;
   }
 
+  /// Add passes at the start of the function pass manager created after
+  /// enforcing CGSCC ordering. This hook is only called when
+  /// requiresCodeGenSCCOrder() returns true.
+  ///
+  /// Targets can use this to insert passes that need to run at the beginning
+  /// of the codegen function pass manager, after the CGSCC boundary has been
+  /// established.
+  virtual void addInitialCGSCCCodeGenPasses() {}
+
   /// addMachineSSAOptimization - Add standard passes that optimize machine
   /// instructions in SSA form.
   virtual void addMachineSSAOptimization();
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index ceae0d29eea90..40458407b733d 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -966,8 +966,10 @@ void TargetPassConfig::addISelPrepare() {
   addPreISel();
 
   // Force codegen to run according to the callgraph.
-  if (requiresCodeGenSCCOrder())
+  if (requiresCodeGenSCCOrder()) {
     addPass(new DummyCGSCCPass);
+    addInitialCGSCCCodeGenPasses();
+  }
 
   if (getOptLevel() != CodeGenOptLevel::None)
     addPass(createObjCARCContractPass());
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 60d495d809236..f49f817d80c8d 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1245,6 +1245,7 @@ class GCNPassConfig final : public AMDGPUPassConfig {
   }
 
   bool addPreISel() override;
+  void addInitialCGSCCCodeGenPasses() override;
   void addMachineSSAOptimization() override;
   bool addILPOpts() override;
   bool addInstSelector() override;
@@ -1504,6 +1505,11 @@ bool GCNPassConfig::addPreISel() {
   return false;
 }
 
+void GCNPassConfig::addInitialCGSCCCodeGenPasses() {
+  if (EnableAMDGPUMachineLevelInliner)
+    addPass(createAMDGPUInliningAnchorPass());
+}
+
 void GCNPassConfig::addMachineSSAOptimization() {
   TargetPassConfig::addMachineSSAOptimization();
 
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index 8d7409c11537f..01a7c51bfcc40 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -1614,7 +1614,8 @@
 ; INLINER-NEXT:        Loop-Closed SSA Form Pass
 ; INLINER-NEXT:      Analysis if a function is memory bound
 ; INLINER-NEXT:      DummyCGSCCPass
-; INLINER-NEXT:      FunctionPass Manager
+; INLINER-NEXT:      AMDGPU Inlining Pass Manager
+; INLINER-NEXT:        AMDGPU Inlining Anchor
 ; INLINER-NEXT:        Dominator Tree Construction
 ; INLINER-NEXT:        Basic Alias Analysis (stateless AA impl)
 ; INLINER-NEXT:        Function Alias Analysis Results
@@ -1745,15 +1746,11 @@
 ; INLINER-NEXT:        Machine Copy Propagation Pass
 ; INLINER-NEXT:        Post-RA pseudo instruction expansion pass
 ; INLINER-NEXT:        SI Shrink Instructions
-; INLINER-NEXT:      AMDGPU Inlining Pass Manager
 ; INLINER-NEXT:        AMDGPU Inlining Anchor
 ; INLINER-NEXT:        AMDGPU Machine Level Inliner
 ; INLINER-NEXT:        SI post-RA bundler
 ; INLINER-NEXT:        MachineDominator Tree Construction
 ; INLINER-NEXT:        Machine Natural Loop Construction
-; INLINER-NEXT:        Dominator Tree Construction
-; INLINER-NEXT:        Basic Alias Analysis (stateless AA impl)
-; INLINER-NEXT:        Function Alias Analysis Results
 ; INLINER-NEXT:        PostRA Machine Instruction Scheduler
 ; INLINER-NEXT:        Machine Block Frequency Analysis
 ; INLINER-NEXT:        MachinePostDominator Tree Construction

github-actions · 2025-11-25T10:35:20Z

🐧 Linux x64 Test Results

166333 tests passed
2871 tests skipped
2 tests failed

Failed Tests

LLVM

LLVM.CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll

Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc < /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll -mtriple=arm64-apple-ios7.0 -pass-remarks-analysis=asm-printer        --debugify-and-strip-all-safe=0        -verify-machineinstrs        -pass-remarks-with-hotness=1 -asm-verbose=0        -debug-only=lazy-machine-block-freq,block-freq        -debug-pass=Executions 2>&1 | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll -check-prefix=HOTNESS
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=arm64-apple-ios7.0 -pass-remarks-analysis=asm-printer --debugify-and-strip-all-safe=0 -verify-machineinstrs -pass-remarks-with-hotness=1 -asm-verbose=0 -debug-only=lazy-machine-block-freq,block-freq -debug-pass=Executions
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll -check-prefix=HOTNESS
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll:34:17: error: HOTNESS-NEXT: expected string not found in input
# | ; HOTNESS-NEXT: Executing Pass 'Function Pass Manager'
# |                 ^
# | <stdin>:1058:75: note: scanning from here
# | [2025-11-25 10:09:41.008659864] 0x1c3bcdd0 Freeing Pass 'Machine Outliner' on Module '<stdin>'...
# |                                                                           ^
# | <stdin>:1059:44: note: possible intended match here
# | [2025-11-25 10:09:41.008673834] 0x1c3bcdd0 Executing Pass 'FunctionPass Manager' on Module '<stdin>'...
# |                                            ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            .
# |            .
# |            .
# |         1053: [2025-11-25 10:09:41.008573474] 0x1c3bcdd0 Freeing Pass 'Type-Based Alias Analysis' on Module '<stdin>'... 
# |         1054: [2025-11-25 10:09:41.008587554] 0x1c3bcdd0 Freeing Pass 'Target Transform Information' on Module '<stdin>'... 
# |         1055: [2025-11-25 10:09:41.008601354] 0x1c3bcdd0 Freeing Pass 'Target Library Information' on Module '<stdin>'... 
# |         1056: [2025-11-25 10:09:41.008614784] 0x1c3bcdd0 Freeing Pass 'Default Regalloc Priority Advisor' on Module '<stdin>'... 
# |         1057: [2025-11-25 10:09:41.008629414] 0x1c3bcdd0 Executing Pass 'Machine Outliner' on Module '<stdin>'... 
# |         1058: [2025-11-25 10:09:41.008659864] 0x1c3bcdd0 Freeing Pass 'Machine Outliner' on Module '<stdin>'... 
# | next:34'0                                                                               X~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |         1059: [2025-11-25 10:09:41.008673834] 0x1c3bcdd0 Executing Pass 'FunctionPass Manager' on Module '<stdin>'... 
# | next:34'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:34'1                                                ?                                                             possible intended match
# |         1060: [2025-11-25 10:09:41.008687854] 0x1c406be0 Executing Pass 'Verify generated machine code' on Function 'empty_func'... 
# | next:34'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |         1061: [2025-11-25 10:09:41.008719504] 0x1c406be0 Freeing Pass 'Verify generated machine code' on Function 'empty_func'... 
# | next:34'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |         1062: [2025-11-25 10:09:41.008731464] 0x1c406be0 Executing Pass 'AArch64 sls hardening pass' on Function 'empty_func'... 
# | next:34'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |         1063: [2025-11-25 10:09:41.008747034] 0x1c406be0 Freeing Pass 'AArch64 sls hardening pass' on Function 'empty_func'... 
# | next:34'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |         1064: [2025-11-25 10:09:41.008759234] 0x1c406be0 Executing Pass 'Verify generated machine code' on Function 'empty_func'... 
# | next:34'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            .
# |            .
# |            .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

LLVM.tools/UpdateTestChecks/update_givaluetracking_test_checks/knownbits-const.test

Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
cp -f /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Inputs/const.mir /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir && '/usr/bin/python3' /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/utils/update_givaluetracking_test_checks.py --llc-binary /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc --version=1 /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir
# executed command: cp -f /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Inputs/const.mir /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir
# note: command had no output on stdout or stderr
# executed command: /usr/bin/python3 /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/utils/update_givaluetracking_test_checks.py --llc-binary /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc --version=1 /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir
# note: command had no output on stdout or stderr
# RUN: at line 3
diff -u /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Inputs/const.mir.expected /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir
# executed command: diff -u /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Inputs/const.mir.expected /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir
# .---command stdout------------
# | --- /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Inputs/const.mir.expected
# | +++ /home/gha/actions-runner/_work/llvm-project/llvm-project/build/test/tools/UpdateTestChecks/update_givaluetracking_test_checks/Output/knownbits-const.test.tmp.mir
# | @@ -5,9 +5,9 @@
# |  name:            Cst
# |  body:             |
# |    bb.1:
# | -  ; CHECK-LABEL: name: @Cst
# | -  ; CHECK-NEXT: %0:_ KnownBits:00000001 SignBits:7
# | -  ; CHECK-NEXT: %1:_ KnownBits:00000001 SignBits:7
# | +    ; CHECK-LABEL: name: @Cst
# | +    ; CHECK-NEXT: %0:_ KnownBits:00000001 SignBits:7
# | +    ; CHECK-NEXT: %1:_ KnownBits:00000001 SignBits:7
# |      %0:_(s8) = G_CONSTANT i8 1
# |      %1:_(s8) = COPY %0
# |  ...
# | @@ -16,11 +16,11 @@
# |  body:             |
# |    bb.1:
# |    ; Note: This comment should not be removed, the check lines below should be updated
# | -  ; CHECK-LABEL: name: @Test2
# | -  ; CHECK-NEXT: %1:_ KnownBits:???????????????????????????????? SignBits:1
# | -  ; CHECK-NEXT: %named:gpr64 KnownBits:00000000000000000000000000000000???????????????????????????????? SignBits:32
# | -  ; CHECK-NEXT: %3:_ KnownBits:???????????????????????????????? SignBits:1
# | -  ; CHECK-NEXT: %4:_ KnownBits:00000000000000000000000000000000 SignBits:32
# | +    ; CHECK-LABEL: name: @Test2
# | +    ; CHECK-NEXT: %1:_ KnownBits:???????????????????????????????? SignBits:1
# | +    ; CHECK-NEXT: %named:gpr64 KnownBits:00000000000000000000000000000000???????????????????????????????? SignBits:32
# | +    ; CHECK-NEXT: %3:_ KnownBits:???????????????????????????????? SignBits:1
# | +    ; CHECK-NEXT: %4:_ KnownBits:00000000000000000000000000000000 SignBits:32
# |      %0:gpr32 = COPY $w0
# |      %1:_(s32) = COPY %0
# |      %named:gpr64(s64) = G_ZEXT %1
# `-----------------------------
# error: command failed with exit status: 1

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.

This was referenced Nov 25, 2025

Make legacy FPPassManager more inheritable #169475

Open

[AMDGPU] Add machine-level inliner pass #169476

Open

rovka mentioned this pull request Nov 25, 2025

[AMDGPU] Update machine frame info during inlining #169477

Open

rovka marked this pull request as ready for review November 25, 2025 10:12

rovka requested a review from arsenm November 25, 2025 10:12

llvmbot added backend:AMDGPU llvm:codegen labels Nov 25, 2025

rovka assigned jayfoad, nhaehnle, cmc-rep and piotrAMD Nov 25, 2025

rovka unassigned jayfoad, nhaehnle, cmc-rep and piotrAMD Nov 25, 2025

rovka requested review from cmc-rep, jayfoad, nhaehnle and piotrAMD November 25, 2025 10:24
