-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[AMDGPU] Use correct SlotIndex to calculate live-out register set. #161720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block. Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a bug in AMDGPU register pressure tracking where live-out register sets were incorrectly calculated. The issue was caused by using getMBBEndIdx()
which returns the start index of the next block instead of the last valid index within the current block.
Key changes:
- Added
getMBBLastIdx()
method to return the last valid SlotIndex within a basic block - Updated register pressure tracking code to use the correct SlotIndex for live-out calculations
- Added validation for invalid SlotIndex cases when basic blocks have no non-debug instructions
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
llvm/include/llvm/CodeGen/SlotIndexes.h | Added getMBBLastIdx() method with proper documentation |
llvm/lib/Target/AMDGPU/GCNRegPressure.h | Updated reset() method to use getMBBLastIdx() instead of getMBBEndIdx() |
llvm/lib/Target/AMDGPU/GCNRegPressure.cpp | Updated live-out calculation logic and added validation for invalid SlotIndex |
llvm/test/CodeGen/AMDGPU/regpressure_printer.mir | Updated test expectations to reflect corrected live-out calculations |
@llvm/pr-subscribers-llvm-regalloc @llvm/pr-subscribers-backend-amdgpu Author: Valery Pykhtin (vpykhtin) ChangesUsing SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block. Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations. Full diff: https://github.com/llvm/llvm-project/pull/161720.diff 4 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/SlotIndexes.h b/llvm/include/llvm/CodeGen/SlotIndexes.h
index 1e270b4d9d302..c81c843522b34 100644
--- a/llvm/include/llvm/CodeGen/SlotIndexes.h
+++ b/llvm/include/llvm/CodeGen/SlotIndexes.h
@@ -467,16 +467,29 @@ class raw_ostream;
return getMBBRange(mbb).first;
}
- /// Returns the last index in the given basic block number.
+ /// Returns the index past the last valid index in the given basic block.
SlotIndex getMBBEndIdx(unsigned Num) const {
return getMBBRange(Num).second;
}
- /// Returns the last index in the given basic block.
+ /// Returns the index past the last valid index in the given basic block.
SlotIndex getMBBEndIdx(const MachineBasicBlock *mbb) const {
return getMBBRange(mbb).second;
}
+ /// Returns the last valid index in the given basic block.
+ /// This index corresponds to the dead slot of the last non-debug
+ /// instruction and can be used to find live-out ranges of the block. Note
+ /// that getMBBEndIdx returns the start index of the next block, which is
+ /// also used as the start index for segments with phi-def values. Returns
+ /// an invalid SlotIndex if the block has no non-debug instructions.
+ SlotIndex getMBBLastIdx(const MachineBasicBlock *MBB) const {
+ auto LastInstrI = MBB->getLastNonDebugInstr();
+ return LastInstrI == MBB->end()
+ ? SlotIndex()
+ : getInstructionIndex(*LastInstrI).getDeadSlot();
+ }
+
/// Iterator over the idx2MBBMap (sorted pairs of slot index of basic block
/// begin and basic block)
using MBBIndexIterator = SmallVectorImpl<IdxMBBPair>::const_iterator;
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
index ef63acc6355d2..d95fdec1fbef9 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
@@ -461,6 +461,8 @@ GCNRPTracker::LiveRegSet llvm::getLiveRegs(SlotIndex SI,
const LiveIntervals &LIS,
const MachineRegisterInfo &MRI) {
GCNRPTracker::LiveRegSet LiveRegs;
+ if (!SI.isValid())
+ return LiveRegs;
for (unsigned I = 0, E = MRI.getNumVirtRegs(); I != E; ++I) {
auto Reg = Register::index2VirtReg(I);
if (!LIS.hasInterval(Reg))
@@ -906,32 +908,31 @@ bool GCNRegPressurePrinter::runOnMachineFunction(MachineFunction &MF) {
SlotIndex MBBStartSlot = LIS.getSlotIndexes()->getMBBStartIdx(&MBB);
SlotIndex MBBEndSlot = LIS.getSlotIndexes()->getMBBEndIdx(&MBB);
+ SlotIndex MBBLastSlot = LIS.getSlotIndexes()->getMBBLastIdx(&MBB);
GCNRPTracker::LiveRegSet LiveIn, LiveOut;
GCNRegPressure RPAtMBBEnd;
- if (UseDownwardTracker) {
- if (MBB.empty()) {
- LiveIn = LiveOut = getLiveRegs(MBBStartSlot, LIS, MRI);
- RPAtMBBEnd = getRegPressure(MRI, LiveIn);
- } else {
- GCNDownwardRPTracker RPT(LIS);
- RPT.reset(MBB.front());
+ if (!MBBLastSlot.isValid()) { // MBB doesn't have any non-debug instrs.
+ LiveIn = LiveOut = getLiveRegs(MBBStartSlot, LIS, MRI);
+ RPAtMBBEnd = getRegPressure(MRI, LiveIn);
+ } else if (UseDownwardTracker) {
+ GCNDownwardRPTracker RPT(LIS);
+ RPT.reset(MBB.front());
- LiveIn = RPT.getLiveRegs();
-
- while (!RPT.advanceBeforeNext()) {
- GCNRegPressure RPBeforeMI = RPT.getPressure();
- RPT.advanceToNext();
- RP.emplace_back(RPBeforeMI, RPT.getPressure());
- }
+ LiveIn = RPT.getLiveRegs();
- LiveOut = RPT.getLiveRegs();
- RPAtMBBEnd = RPT.getPressure();
+ while (!RPT.advanceBeforeNext()) {
+ GCNRegPressure RPBeforeMI = RPT.getPressure();
+ RPT.advanceToNext();
+ RP.emplace_back(RPBeforeMI, RPT.getPressure());
}
+
+ LiveOut = RPT.getLiveRegs();
+ RPAtMBBEnd = RPT.getPressure();
} else {
GCNUpwardRPTracker RPT(LIS);
- RPT.reset(MRI, MBBEndSlot);
+ RPT.reset(MRI, MBBLastSlot);
LiveOut = RPT.getLiveRegs();
RPAtMBBEnd = RPT.getPressure();
@@ -965,8 +966,8 @@ bool GCNRegPressurePrinter::runOnMachineFunction(MachineFunction &MF) {
OS << printRP(RPAtMBBEnd) << '\n';
OS << PFX " Live-out:" << llvm::print(LiveOut, MRI);
- if (UseDownwardTracker)
- ReportLISMismatchIfAny(LiveOut, getLiveRegs(MBBEndSlot, LIS, MRI));
+ if (UseDownwardTracker && MBBLastSlot.isValid())
+ ReportLISMismatchIfAny(LiveOut, getLiveRegs(MBBLastSlot, LIS, MRI));
GCNRPTracker::LiveRegSet LiveThrough;
for (auto [Reg, Mask] : LiveIn) {
diff --git a/llvm/lib/Target/AMDGPU/GCNRegPressure.h b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
index a9c58bb90ef03..898d1ffc10b79 100644
--- a/llvm/lib/Target/AMDGPU/GCNRegPressure.h
+++ b/llvm/lib/Target/AMDGPU/GCNRegPressure.h
@@ -313,8 +313,8 @@ class GCNUpwardRPTracker : public GCNRPTracker {
/// reset tracker to the end of the \p MBB.
void reset(const MachineBasicBlock &MBB) {
- reset(MBB.getParent()->getRegInfo(),
- LIS.getSlotIndexes()->getMBBEndIdx(&MBB));
+ SlotIndex MBBLastSlot = LIS.getSlotIndexes()->getMBBLastIdx(&MBB);
+ reset(MBB.getParent()->getRegInfo(), MBBLastSlot);
}
/// reset tracker to the point just after \p MI (in program order).
diff --git a/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir b/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
index 8d5b5e4e8cae1..b41aa088bfdde 100644
--- a/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
+++ b/llvm/test/CodeGen/AMDGPU/regpressure_printer.mir
@@ -510,14 +510,14 @@ body: |
; RPU-NEXT: 0 0 $sgpr0 = S_BUFFER_LOAD_DWORD_IMM $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0
; RPU-NEXT: 0 0
; RPU-NEXT: 0 1 undef %0.sub5:vreg_512 = V_MOV_B32_e32 5, implicit $exec
- ; RPU-NEXT: 0 0
- ; RPU-NEXT: 0 0 S_CMP_GT_U32 $sgpr0, 15, implicit-def $scc
- ; RPU-NEXT: 0 0
- ; RPU-NEXT: 0 0 S_CBRANCH_SCC1 %bb.2, implicit $scc
- ; RPU-NEXT: 0 0
- ; RPU-NEXT: 0 0 S_BRANCH %bb.1
- ; RPU-NEXT: 0 0
- ; RPU-NEXT: Live-out:
+ ; RPU-NEXT: 0 1
+ ; RPU-NEXT: 0 1 S_CMP_GT_U32 $sgpr0, 15, implicit-def $scc
+ ; RPU-NEXT: 0 1
+ ; RPU-NEXT: 0 1 S_CBRANCH_SCC1 %bb.2, implicit $scc
+ ; RPU-NEXT: 0 1
+ ; RPU-NEXT: 0 1 S_BRANCH %bb.1
+ ; RPU-NEXT: 0 1
+ ; RPU-NEXT: Live-out: %0:0000000000000C00
; RPU-NEXT: Live-thr:
; RPU-NEXT: 0 0
; RPU-NEXT: bb.1:
@@ -571,8 +571,6 @@ body: |
; RPD-NEXT: 0 1 S_BRANCH %bb.1
; RPD-NEXT: 0 1
; RPD-NEXT: Live-out: %0:0000000000000C00
- ; RPD-NEXT: mis LIS:
- ; RPD-NEXT: %0:L0000000000000C00 isn't found in LIS reported set
; RPD-NEXT: Live-thr:
; RPD-NEXT: 0 0
; RPD-NEXT: bb.1:
|
SlotIndex MBBStartSlot = LIS.getSlotIndexes()->getMBBStartIdx(&MBB); | ||
SlotIndex MBBEndSlot = LIS.getSlotIndexes()->getMBBEndIdx(&MBB); | ||
SlotIndex MBBLastSlot = LIS.getSlotIndexes()->getMBBLastIdx(&MBB); | ||
bool MBBHasNonDebugInstrs = MBBStartSlot != MBBLastSlot.getBaseIndex(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the name, this doesn't prove there are no debug instructions in the block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug instructions doesn't have slot index. If the basic block is empty or contains only debug instructions MBBEndIdx(MBB).getPrevIndex() will return slot index equal to MBBStartIdx(MBB)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems redundant, I removed this code.
…lvm#161720) Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block. Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations.
…lvm#161720) Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block. Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations.
Using SlotIndexes::getMBBEndIdx() isn't the correct choice here because it returns the starting index of the next basic block, causing live-ins of the next block to be calculated instead of the intended live-outs of the current block.
Add SlotIndexes::getMBBLastIdx() method to return the last valid SlotIndex within a basic block, enabling correct live-out calculations.