[CodeGen] Ignore requiresStructuredCFG check in canSplitCriticalEdge if successor is loop header #154063

wenju-he · 2025-08-18T05:19:22Z

This addresses a performance issue for our downstream GPU target that
sets requiresStructuredCFG to true. The issue is that the EarlyMachineLICM
pass does not hoist loop invariants because a critical edge is not split.
The critical edge's destination is a loop header, so splitting the critical
edge will not break the structured CFG.

Add an NVPTX test to demonstrate the issue, since that target also
requires structured CFG.
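
For readers unfamiliar with the terminology, here is a minimal sketch of what "splitting a critical edge into a loop header" means, using a toy graph rather than LLVM's data structures. `ToyCFG`, `numPreds`, `isCriticalEdge`, and `splitEdge` are hypothetical names invented for illustration; the block numbering follows the shape of the added test (bb.0 branches to bb.1 and bb.2, and bb.1 loops on itself).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG (not an LLVM type): Succs[i] lists the successor
// block indices of block i.
struct ToyCFG {
  std::vector<std::vector<int>> Succs;

  // Count how many edges end at block B.
  std::size_t numPreds(int B) const {
    std::size_t N = 0;
    for (const auto &S : Succs)
      for (int T : S)
        if (T == B)
          ++N;
    return N;
  }

  // An edge From->To is critical when From has more than one successor
  // and To has more than one predecessor.
  bool isCriticalEdge(int From, int To) const {
    return Succs[From].size() > 1 && numPreds(To) > 1;
  }

  // Split From->To by inserting a new block whose only successor is To.
  // Returns the index of the new block.
  int splitEdge(int From, int To) {
    int New = static_cast<int>(Succs.size());
    Succs.push_back({To});
    for (int &T : Succs[From])
      if (T == To)
        T = New;
    return New;
  }
};
```

With `Succs = {{2, 1}, {2, 1}, {}}`, the edge 0->1 is critical, and splitting it yields a new block 3 whose single successor is the loop header 1. That block is the only way into the loop from outside, which is the preheader role the description refers to.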

llvmbot · 2025-09-01T08:18:58Z

@llvm/pr-subscribers-backend-nvptx

Author: Wenju He (wenju-he)

Changes

This addresses a performance issue for our downstream GPU target that
sets requiresStructuredCFG to true. The issue is that the EarlyMachineLICM
pass does not hoist loop invariants because a critical edge is not split.
The critical edge's destination is a loop header, so splitting the critical
edge will not break the structured CFG.

Full diff: https://github.com/llvm/llvm-project/pull/154063.diff

3 Files Affected:

(modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+3-1)
(modified) llvm/lib/CodeGen/MachineBasicBlock.cpp (+12-4)
(added) llvm/test/CodeGen/NVPTX/machinelicm-no-preheader.mir (+72)

diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index 9e3d9196cc184..7df34a76912dd 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -1035,7 +1035,9 @@ class MachineBasicBlock
   /// Succ, can be split. If this returns true a subsequent call to
   /// SplitCriticalEdge is guaranteed to return a valid basic block if
   /// no changes occurred in the meantime.
-  LLVM_ABI bool canSplitCriticalEdge(const MachineBasicBlock *Succ) const;
+  LLVM_ABI bool
+  canSplitCriticalEdge(const MachineBasicBlock *Succ,
+                       const MachineLoopInfo *MLI = nullptr) const;
 
   void pop_front() { Insts.pop_front(); }
   void pop_back() { Insts.pop_back(); }
diff --git a/llvm/lib/CodeGen/MachineBasicBlock.cpp b/llvm/lib/CodeGen/MachineBasicBlock.cpp
index c3c5a0f5102d7..8c795f812df09 100644
--- a/llvm/lib/CodeGen/MachineBasicBlock.cpp
+++ b/llvm/lib/CodeGen/MachineBasicBlock.cpp
@@ -1160,7 +1160,7 @@ MachineBasicBlock *MachineBasicBlock::SplitCriticalEdge(
 MachineBasicBlock *MachineBasicBlock::SplitCriticalEdge(
     MachineBasicBlock *Succ, const SplitCriticalEdgeAnalyses &Analyses,
     std::vector<SparseBitVector<>> *LiveInSets, MachineDomTreeUpdater *MDTU) {
-  if (!canSplitCriticalEdge(Succ))
+  if (!canSplitCriticalEdge(Succ, Analyses.MLI))
     return nullptr;
 
   MachineFunction *MF = getParent();
@@ -1388,8 +1388,8 @@ MachineBasicBlock *MachineBasicBlock::SplitCriticalEdge(
   return NMBB;
 }
 
-bool MachineBasicBlock::canSplitCriticalEdge(
-    const MachineBasicBlock *Succ) const {
+bool MachineBasicBlock::canSplitCriticalEdge(const MachineBasicBlock *Succ,
+                                             const MachineLoopInfo *MLI) const {
   // Splitting the critical edge to a landing pad block is non-trivial. Don't do
   // it in this generic function.
   if (Succ->isEHPad())
@@ -1403,7 +1403,15 @@ bool MachineBasicBlock::canSplitCriticalEdge(
   const MachineFunction *MF = getParent();
   // Performance might be harmed on HW that implements branching using exec mask
   // where both sides of the branches are always executed.
-  if (MF->getTarget().requiresStructuredCFG())
+  // However, if `Succ` is a loop header, splitting the critical edge will not
+  // break structured CFG.
+  auto SuccIsLoopHeader = [&]() {
+    if (MLI)
+      if (MachineLoop *L = MLI->getLoopFor(Succ); L && L->getHeader() == Succ)
+        return true;
+    return false;
+  };
+  if (MF->getTarget().requiresStructuredCFG() && !SuccIsLoopHeader())
     return false;
 
   // Do we have an Indirect jump with a jumptable that we can rewrite?
diff --git a/llvm/test/CodeGen/NVPTX/machinelicm-no-preheader.mir b/llvm/test/CodeGen/NVPTX/machinelicm-no-preheader.mir
new file mode 100644
index 0000000000000..f2f0ffdec8094
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/machinelicm-no-preheader.mir
@@ -0,0 +1,72 @@
+# RUN: llc -mtriple=nvptx64 -mcpu=sm_20 -run-pass=early-machinelicm %s -verify-machineinstrs -o - | FileCheck %s
+
+# This test checks that the early-machineLICM pass successfully creates a new
+# loop preheader by splitting the critical edge and hoisting the loop invariant
+# value `%18` to the preheader.
+# Since the critical edge successor is a loop header, the splitting does not
+# break the structured CFG, which is a requirement for the NVPTX target.
+
+---
+name:            test_hoist
+tracksRegLiveness: true
+registers:
+  - { id: 0, class: b64, preferred-register: '', flags: [  ] }
+  - { id: 1, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 2, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 3, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 4, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 5, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 6, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 7, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 8, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 9, class: b64, preferred-register: '', flags: [  ] }
+  - { id: 10, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 11, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 12, class: b32, preferred-register: '', flags: [  ] }
+  - { id: 13, class: b64, preferred-register: '', flags: [  ] }
+  - { id: 14, class: b64, preferred-register: '', flags: [  ] }
+  - { id: 15, class: b64, preferred-register: '', flags: [  ] }
+  - { id: 16, class: b1, preferred-register: '', flags: [  ] }
+  - { id: 17, class: b1, preferred-register: '', flags: [  ] }
+  - { id: 18, class: b32, preferred-register: '', flags: [  ] }
+body:             |
+  bb.0.entry:
+    successors: %bb.2(0x30000000), %bb.1(0x50000000)
+
+    %8:b32 = LD_i32 0, 0, 101, 3, 32, &test_hoist_param_2, 0 :: (dereferenceable invariant load (s32), addrspace 101)
+    %7:b32 = LD_i32 0, 0, 101, 3, 32, &test_hoist_param_1, 0 :: (dereferenceable invariant load (s32), addrspace 101)
+    %9:b64 = LD_i64 0, 0, 101, 3, 64, &test_hoist_param_0, 0 :: (dereferenceable invariant load (s64), addrspace 101)
+    %10:b32 = INT_PTX_SREG_CTAID_x
+    %11:b32 = INT_PTX_SREG_NTID_x
+    %12:b32 = INT_PTX_SREG_TID_x
+    %13:b64 = CVT_u64_u32 killed %12, 0
+    %14:b64 = nuw MAD_WIDE_U32rrr killed %11, killed %10, killed %13
+    %15:b64 = nuw nsw SHL64_ri killed %14, 2
+    %0:b64 = nuw ADD64rr killed %9, killed %15
+    %1:b32 = LD_i32 0, 0, 1, 3, 32, %0, 0
+    %16:b1 = SETP_i32ri %8, 0, 0
+    CBranch killed %16, %bb.2
+    GOTO %bb.1
+
+  ; CHECK: bb.3:
+  ; CHECK:   successors: %bb.1(0x80000000)
+  ; CHECK:   %18:b32 = ADD32ri %7, -1
+  ; CHECK: bb.1:
+
+  bb.1:
+    successors: %bb.2(0x04000000), %bb.1(0x7c000000)
+
+    %2:b32 = PHI %8, %bb.0, %5, %bb.1
+    %3:b32 = PHI %1, %bb.0, %4, %bb.1
+    %18:b32 = ADD32ri %7, -1
+    %4:b32 = SREM32rr %3, %18
+    %5:b32 = ADD32ri %2, -1
+    %17:b1 = SETP_i32ri %5, 0, 1
+    CBranch killed %17, %bb.1
+    GOTO %bb.2
+
+  bb.2:
+    %6:b32 = PHI %1, %bb.0, %4, %bb.1
+    ST_i32 %6, 0, 0, 1, 32, %0, 0
+    Return
+...
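
As a rough source-level reading of the test above, the loop in bb.1 repeatedly takes a remainder by `%7 - 1`, a value that does not change across iterations. The sketch below is an assumption-laden paraphrase, not a translation: it assumes the two `SETP_i32ri` comparisons mean "skip the loop when the count is zero" and "repeat while the count is non-zero", and the function names `remLoopNaive` and `remLoopHoisted` are hypothetical.

```cpp
#include <cassert>

// Before hoisting: the invariant `m - 1` is recomputed in every iteration,
// mirroring `%18 = ADD32ri %7, -1` sitting inside bb.1.
// Callers must ensure m != 1 so the divisor is non-zero.
int remLoopNaive(int x, int m, int n) {
  if (n != 0) {
    do {
      int d = m - 1; // loop-invariant computation
      x = x % d;
      --n;
    } while (n != 0);
  }
  return x;
}

// After hoisting: the invariant is computed once in the block that precedes
// the loop, which corresponds to the new bb.3 in the CHECK lines.
int remLoopHoisted(int x, int m, int n) {
  if (n != 0) {
    const int d = m - 1; // preheader
    do {
      x = x % d;
      --n;
    } while (n != 0);
  }
  return x;
}
```

Both functions return the same value for any inputs with `m != 1`; the second form only becomes possible once there is a dedicated block between the entry branch and the loop header to hold the hoisted computation.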

arsenm · 2025-09-24T03:53:49Z

llvm/lib/CodeGen/MachineBasicBlock.cpp

+    if (MLI)
+      if (MachineLoop *L = MLI->getLoopFor(Succ); L && L->getHeader() == Succ)
+        return true;
+    return false;

Suggested change

-    if (MLI)
-    if (MachineLoop *L = MLI->getLoopFor(Succ); L && L->getHeader() == Succ)
-    return true;
-    return false;
+    if (MLI)
+    const MachineLoop *L = MLI->getLoopFor(Succ);
+    return L && L->getHeader() == Succ;

This also doesn't need to be a lambda

wenju-he (Sep 24, 2025): done, thanks
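
One way the check might read without a lambda is sketched below. This is not the code that was merged; it is a standalone illustration under stated assumptions. `Block`, `Loop`, `LoopInfo`, and `structuredCFGAllowsSplit` are hypothetical stand-ins so the control flow compiles outside LLVM; only `getLoopFor` and `getHeader` mirror method names that appear in the diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for MachineBasicBlock, MachineLoop and
// MachineLoopInfo. They model just enough to exercise the predicate.
struct Block {};

struct Loop {
  const Block *Header;
  const Block *getHeader() const { return Header; }
};

struct LoopInfo {
  const Block *Member; // the single block this toy analysis knows about
  const Loop *L;       // the loop that contains Member
  const Loop *getLoopFor(const Block *B) const {
    return B == Member ? L : nullptr;
  }
};

// Returns true when the structured-CFG rule does not forbid splitting the
// edge into Succ: either the target has no such requirement, or loop info
// is available and Succ is the header of its loop.
bool structuredCFGAllowsSplit(bool RequiresStructuredCFG, const Block *Succ,
                              const LoopInfo *LI) {
  if (!RequiresStructuredCFG)
    return true;
  if (!LI)
    return false; // no loop info: stay conservative, as before the patch
  const Loop *L = LI->getLoopFor(Succ);
  return L && L->getHeader() == Succ;
}
```

Early returns keep each condition on its own line, and a missing `LoopInfo` falls back to the pre-patch behaviour of refusing the split on structured-CFG targets.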

github-actions · 2025-09-25T23:46:42Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvm-ci · 2025-09-26T09:43:35Z

LLVM Buildbot has detected a new failure on builder cross-project-tests-sie-ubuntu-dwarf5 running on doug-worker-1b while building llvm at step 6 "test-build-unified-tree-check-cross-project".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/163/builds/27168

Here is the relevant piece of the build log for the reference

Step 6 (test-build-unified-tree-check-cross-project) failure: test (failure)
******************** TEST 'cross-project-tests :: debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
clang++ -O0 -glldb -std=gnu++11 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp -o /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/Output/limit_steps_line_mismatch.cpp.tmp # RUN: at line 6
+ clang++ -O0 -glldb -std=gnu++11 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp -o /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/Output/limit_steps_line_mismatch.cpp.tmp
"/usr/bin/python3.10" "/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/dexter.py" test --fail-lt 1.0 -w -v --debugger lldb-dap --lldb-executable "/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap" --dap-message-log=-e --binary /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/Output/limit_steps_line_mismatch.cpp.tmp -- /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp | /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/FileCheck /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp # RUN: at line 7
+ /usr/bin/python3.10 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/dexter.py test --fail-lt 1.0 -w -v --debugger lldb-dap --lldb-executable /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap --dap-message-log=-e --binary /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/Output/limit_steps_line_mismatch.cpp.tmp -- /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp
+ /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/FileCheck /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/limit_steps/limit_steps_line_mismatch.cpp
note: Opening DAP server: /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap
-> {
  "type": "request",
  "command": "initialize",
  "arguments": {
    "clientID": "dexter",
    "adapterID": "lldb-dap",
    "pathFormat": "path",
    "linesStartAt1": true,
    "columnsStartAt1": true,
    "supportsVariableType": true,
    "supportsVariablePaging": true,
    "supportsRunInTerminalRequest": false
  },
  "seq": 1
}
<- {
  "body": {
    "$__lldb_version": "lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision 745e1e6ad5d40ff8f1553e62c48554a61611ee76)\n  clang revision 745e1e6ad5d40ff8f1553e62c48554a61611ee76\n  llvm revision 745e1e6ad5d40ff8f1553e62c48554a61611ee76",
    "completionTriggerCharacters": [
      ".",
      " ",
      "\t"
    ],
    "exceptionBreakpointFilters": [
      {
        "description": "C++ Catch",
        "filter": "cpp_catch",
        "label": "C++ Catch",
        "supportsCondition": true
      },
      {
        "description": "C++ Throw",
        "filter": "cpp_throw",
        "label": "C++ Throw",
        "supportsCondition": true
      },
      {
        "description": "Objective-C Catch",
        "filter": "objc_catch",
...

llvmbot added the llvm:codegen label Aug 18, 2025

wenju-he requested review from arsenm and topperc August 18, 2025 05:19

wenju-he added 2 commits September 1, 2025 08:19

pass MLI as new arg

2a19359

add nvptx test

f9217ab

llvmbot added the backend:NVPTX label Sep 1, 2025

wenju-he requested a review from AlexMaclean September 1, 2025 08:22

wenju-he added 2 commits September 24, 2025 05:07

Merge branch 'main' into MachineBasicBlock-canSplitCriticalEdge-requi…

f9e1c1a

…resStructuredCFG-loop-header

fix test

ced084c

arsenm reviewed Sep 24, 2025

wenju-he added 2 commits September 24, 2025 07:27

remove lambda

865d70f

simplify test

f41c259

wenju-he requested a review from arsenm September 24, 2025 05:47

arsenm approved these changes Sep 25, 2025

wenju-he and others added 2 commits September 26, 2025 07:42

Apply suggestion from @arsenm

c71b27e

Co-authored-by: Matt Arsenault <[email protected]>

Update llvm/test/CodeGen/NVPTX/machinelicm-no-preheader.mir

1db818c

Co-authored-by: Matt Arsenault <[email protected]>

clang-format

ec32c8b

wenju-he merged commit 745e1e6 into llvm:main Sep 26, 2025
7 of 9 checks passed

wenju-he deleted the MachineBasicBlock-canSplitCriticalEdge-requiresStructuredCFG-loop-header branch September 26, 2025 09:25
