Skip to content

Conversation

vpykhtin
Copy link
Contributor

@vpykhtin vpykhtin commented Oct 2, 2025

Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block.

Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations.

Using SlotIndexes::getMBBEndIdx() isn't the correct choice here
because it returns the starting index of the next basic block,
causing live-ins of the next block to be calculated instead of
the intended live-outs of the current block.

Add SlotIndexes::getMBBLastIdx() method to return the last valid
SlotIndex within a basic block, enabling correct live-out calculations.
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a bug in AMDGPU register pressure tracking where live-out register sets were incorrectly calculated. The issue was caused by using getMBBEndIdx() which returns the start index of the next block instead of the last valid index within the current block.

Key changes:

  • Added getMBBLastIdx() method to return the last valid SlotIndex within a basic block
  • Updated register pressure tracking code to use the correct SlotIndex for live-out calculations
  • Added validation for invalid SlotIndex cases when basic blocks have no non-debug instructions

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
llvm/include/llvm/CodeGen/SlotIndexes.h Added getMBBLastIdx() method with proper documentation
llvm/lib/Target/AMDGPU/GCNRegPressure.h Updated reset() method to use getMBBLastIdx() instead of getMBBEndIdx()
llvm/lib/Target/AMDGPU/GCNRegPressure.cpp Updated live-out calculation logic and added validation for invalid SlotIndex
llvm/test/CodeGen/AMDGPU/regpressure_printer.mir Updated test expectations to reflect corrected live-out calculations

@llvmbot
Copy link
Member

llvmbot commented Oct 2, 2025

@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-amdgpu

Author: Valery Pykhtin (vpykhtin)

Changes

Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block.

Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations.


Full diff: https://github.com/llvm/llvm-project/pull/161720.diff

4 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/SlotIndexes.h (+15-2)
  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.cpp (+20-19)
  • (modified) llvm/lib/Target/AMDGPU/GCNRegPressure.h (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/regpressure_printer.mir (+8-10)
diff --git a/llvm/include/llvm/CodeGen/SlotIndexes.h b/llvm/include/llvm/CodeGen/SlotIndexes.h
index 1e270b4d9d302..c81c843522b34 100644
--- a/llvm/include/llvm/CodeGen/SlotIndexes.h
+++ b/llvm/include/llvm/CodeGen/SlotIndexes.h
@@ -467,16 +467,29 @@ class raw_ostream;
       return getMBBRange(mbb).first;
     }
 
-    /// Returns the last index in the given basic block number.
+    /// Returns the index past the last valid index in the given basic block.
     SlotIndex getMBBEndIdx(unsigned Num) const {
       return getMBBRange(Num).second;
     }
 
-    /// Returns the last index in the given basic block.
+    /// Returns the index past the last valid index in the given basic block.
     SlotIndex getMBBEndIdx(const MachineBasicBlock *mbb) const {
       return getMBBRange(mbb).second;
     }
 
+    /// Returns the last valid index in the given basic block.
+    /// This index corresponds to the dead slot of the last non-debug
+    /// instruction and can be used to find live-out ranges of the block. Note
+    /// that getMBBEndIdx returns the start index of the next block, which is
+    /// also used as the start index for segments with phi-def values. Returns
+    /// an invalid SlotIndex if the block has no non-debug instructions.
+    SlotIndex getMBBLastIdx(const MachineBasicBlock *MBB) const {
+      auto LastInstrI = MBB->getLastNonDebugInstr();
+      return LastInstrI == MBB->end()
+                 ? SlotIndex()
+                 : getInstructionIndex(*LastInstrI).getDeadSlot();
+    }
+
     /// Iterator over the idx2MBBMap (sorted pairs of slot index of basic block
     /// begin and basic block)
     using MBBIndexIterator = SmallVectorImpl<IdxMBBPair>::const_iterator;
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
index ef63acc6355d2..d95fdec1fbef9 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
@@ -461,6 +461,8 @@ GCNRPTracker::LiveRegSet llvm::getLiveRegs(SlotIndex SI,
                                            const LiveIntervals &LIS,
                                            const MachineRegisterInfo &MRI) {
   GCNRPTracker::LiveRegSet LiveRegs;
+  if (!SI.isValid())
+    return LiveRegs;
   for (unsigned I = 0, E = MRI.getNumVirtRegs(); I != E; ++I) {
     auto Reg = Register::index2VirtReg(I);
     if (!LIS.hasInterval(Reg))
@@ -906,32 +908,31 @@ bool GCNRegPressurePrinter::runOnMachineFunction(MachineFunction &MF) {
 
     SlotIndex MBBStartSlot = LIS.getSlotIndexes()->getMBBStartIdx(&MBB);
     SlotIndex MBBEndSlot = LIS.getSlotIndexes()->getMBBEndIdx(&MBB);
+    SlotIndex MBBLastSlot = LIS.getSlotIndexes()->getMBBLastIdx(&MBB);
 
     GCNRPTracker::LiveRegSet LiveIn, LiveOut;
     GCNRegPressure RPAtMBBEnd;
 
-    if (UseDownwardTracker) {
-      if (MBB.empty()) {
-        LiveIn = LiveOut = getLiveRegs(MBBStartSlot, LIS, MRI);
-        RPAtMBBEnd = getRegPressure(MRI, LiveIn);
-      } else {
-        GCNDownwardRPTracker RPT(LIS);
-        RPT.reset(MBB.front());
+    if (!MBBLastSlot.isValid()) { // MBB doesn't have any non-debug instrs.
+      LiveIn = LiveOut = getLiveRegs(MBBStartSlot, LIS, MRI);
+      RPAtMBBEnd = getRegPressure(MRI, LiveIn);
+    } else if (UseDownwardTracker) {
+      GCNDownwardRPTracker RPT(LIS);
+      RPT.reset(MBB.front());
 
-        LiveIn = RPT.getLiveRegs();
-
-        while (!RPT.advanceBeforeNext()) {
-          GCNRegPressure RPBeforeMI = RPT.getPressure();
-          RPT.advanceToNext();
-          RP.emplace_back(RPBeforeMI, RPT.getPressure());
-        }
+      LiveIn = RPT.getLiveRegs();
 
-        LiveOut = RPT.getLiveRegs();
-        RPAtMBBEnd = RPT.getPressure();
+      while (!RPT.advanceBeforeNext()) {
+        GCNRegPressure RPBeforeMI = RPT.getPressure();
+        RPT.advanceToNext();
+        RP.emplace_back(RPBeforeMI, RPT.getPressure());
       }
+
+      LiveOut = RPT.getLiveRegs();
+      RPAtMBBEnd = RPT.getPressure();
     } else {
       GCNUpwardRPTracker RPT(LIS);
-      RPT.reset(MRI, MBBEndSlot);
+      RPT.reset(MRI, MBBLastSlot);
 
       LiveOut = RPT.getLiveRegs();
       RPAtMBBEnd = RPT.getPressure();
@@ -965,8 +966,8 @@ bool GCNRegPressurePrinter::runOnMachineFunction(MachineFunction &MF) {
     OS << printRP(RPAtMBBEnd) << '\n';
 
     OS << PFX "  Live-out:" << llvm::print(LiveOut, MRI);
-    if (UseDownwardTracker)
-      ReportLISMismatchIfAny(LiveOut, getLiveRegs(MBBEndSlot, LIS, MRI));
+    if (UseDownwardTracker && MBBLastSlot.isValid())
+      ReportLISMismatchIfAny(LiveOut, getLiveRegs(MBBLastSlot, LIS, MRI));
 
     GCNRPTracker::LiveRegSet LiveThrough;
     for (auto [Reg, Mask] : LiveIn) {
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.h b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
index a9c58bb90ef03..898d1ffc10b79 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.h
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
@@ -313,8 +313,8 @@ class GCNUpwardRPTracker : public GCNRPTracker {
 
   /// reset tracker to the end of the \p MBB.
   void reset(const MachineBasicBlock &MBB) {
-    reset(MBB.getParent()->getRegInfo(),
-          LIS.getSlotIndexes()->getMBBEndIdx(&MBB));
+    SlotIndex MBBLastSlot = LIS.getSlotIndexes()->getMBBLastIdx(&MBB);
+    reset(MBB.getParent()->getRegInfo(), MBBLastSlot);
   }
 
   /// reset tracker to the point just after \p MI (in program order).
diff --git a/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir b/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
index 8d5b5e4e8cae1..b41aa088bfdde 100644
--- a/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
+++ b/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
@@ -510,14 +510,14 @@ body: |
   ; RPU-NEXT:   0     0      $sgpr0 = S_BUFFER_LOAD_DWORD_IMM $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0
   ; RPU-NEXT:   0     0
   ; RPU-NEXT:   0     1      undef %0.sub5:vreg_512 = V_MOV_B32_e32 5, implicit $exec
-  ; RPU-NEXT:   0     0
-  ; RPU-NEXT:   0     0      S_CMP_GT_U32 $sgpr0, 15, implicit-def $scc
-  ; RPU-NEXT:   0     0
-  ; RPU-NEXT:   0     0      S_CBRANCH_SCC1 %bb.2, implicit $scc
-  ; RPU-NEXT:   0     0
-  ; RPU-NEXT:   0     0      S_BRANCH %bb.1
-  ; RPU-NEXT:   0     0
-  ; RPU-NEXT:   Live-out:
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   0     1      S_CMP_GT_U32 $sgpr0, 15, implicit-def $scc
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   0     1      S_CBRANCH_SCC1 %bb.2, implicit $scc
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   0     1      S_BRANCH %bb.1
+  ; RPU-NEXT:   0     1
+  ; RPU-NEXT:   Live-out: %0:0000000000000C00
   ; RPU-NEXT:   Live-thr:
   ; RPU-NEXT:   0     0
   ; RPU-NEXT: bb.1:
@@ -571,8 +571,6 @@ body: |
   ; RPD-NEXT:   0     1      S_BRANCH %bb.1
   ; RPD-NEXT:   0     1
   ; RPD-NEXT:   Live-out: %0:0000000000000C00
-  ; RPD-NEXT:   mis LIS:
-  ; RPD-NEXT:     %0:L0000000000000C00 isn't found in LIS reported set
   ; RPD-NEXT:   Live-thr:
   ; RPD-NEXT:   0     0
   ; RPD-NEXT: bb.1:

SlotIndex MBBStartSlot = LIS.getSlotIndexes()->getMBBStartIdx(&MBB);
SlotIndex MBBEndSlot = LIS.getSlotIndexes()->getMBBEndIdx(&MBB);
SlotIndex MBBLastSlot = LIS.getSlotIndexes()->getMBBLastIdx(&MBB);
bool MBBHasNonDebugInstrs = MBBStartSlot != MBBLastSlot.getBaseIndex();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the name, this doesn't prove there are no debug instructions in the block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug instructions doesn't have slot index. If the basic block is empty or contains only debug instructions MBBEndIdx(MBB).getPrevIndex() will return slot index equal to MBBStartIdx(MBB)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant, I removed this code.

@vpykhtin vpykhtin merged commit 5eef98b into llvm:main Oct 10, 2025
9 checks passed
DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025
…lvm#161720)

Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because
it returns the starting index of the next basic block, causing live-ins
of the next block to be calculated instead of the intended live-outs of
the current block.

Add SlotIndexes::getMBBLastIdx() method to return the last valid
SlotIndex within a basic block, enabling correct live-out calculations.
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
…lvm#161720)

Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because
it returns the starting index of the next basic block, causing live-ins
of the next block to be calculated instead of the intended live-outs of
the current block.

Add SlotIndexes::getMBBLastIdx() method to return the last valid
SlotIndex within a basic block, enabling correct live-out calculations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants