Skip to content

Conversation

@jgu222
Copy link
Contributor

@jgu222 jgu222 commented Oct 15, 2025

Add PHI reuse hook so that a target can decide if temporary register during PHI node elimination can be reused. By default, two phis are allowed to use the same temporary register if their right-hand-sides are the same.

However, the left-hand sides of phis need to be checked as well for some targets, such as a GPU target. If the register banks of phis's left-hand sides are diffrerent, reuse is not allowed. Thus, this change adds a target hook for this kind of checking as it is target-dependent.

This change has no functional change.

This is to resolve the issue:
#163500

@llvmbot
Copy link
Member

llvmbot commented Oct 15, 2025

@llvm/pr-subscribers-llvm-regalloc

Author: Junjie Gu (jgu222)

Changes

Add PHI reuse hook so that a target can decide if temporary register during PHI node elimination can be reused. By default, two phis are allowed to use the same temporary register if their right-hand-sides are the same.

However, the left-hand sides of phis need to be checked as well for some targets, such as a GPU target. If the register banks of phis's left-hand sides are diffrerent, reuse is not allowed. Thus, this change adds a target hook for this kind of checking as it is target-dependent.

This change has no functional change.

This is to resolve the issue:
#163500


Full diff: https://github.com/llvm/llvm-project/pull/163604.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+13-2)
  • (modified) llvm/lib/CodeGen/PHIElimination.cpp (+1-1)
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 175f205328361..041958b7a3649 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2158,7 +2158,7 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
     return TargetOpcode::COPY;
   }
 
-  /// During PHI eleimination lets target to make necessary checks and
+  /// During PHI elimination lets target to make necessary checks and
   /// insert the copy to the PHI destination register in a target specific
   /// manner.
   virtual MachineInstr *createPHIDestinationCopy(
@@ -2168,7 +2168,7 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
         .addReg(Src);
   }
 
-  /// During PHI eleimination lets target to make necessary checks and
+  /// During PHI elimination lets target to make necessary checks and
   /// insert the copy to the PHI destination register in a target specific
   /// manner.
   virtual MachineInstr *createPHISourceCopy(MachineBasicBlock &MBB,
@@ -2180,6 +2180,17 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
         .addReg(Src, 0, SrcSubReg);
   }
 
+  /// During PHI elimination lets target to decide if two phis can use the
+  /// same register \p Reg when they have the same rhs. Register \p Reg has
+  /// been used for the first phi and \p PHIReg is the DestReg of the second
+  /// Phi. This function is to check if the second phi can reuse \p Reg as
+  /// its temporary register.
+  /// The default is to allow reuse.
+  virtual bool allowPHIReuse(Register Reg, Register PHIReg,
+                             const MachineFunction &MF) const {
+    return true;
+  }
+
   /// Returns a \p outliner::OutlinedFunction struct containing target-specific
   /// information for a set of outlining candidates. Returns std::nullopt if the
   /// candidates are not suitable for outlining. \p MinRepeats is the minimum
diff --git a/llvm/lib/CodeGen/PHIElimination.cpp b/llvm/lib/CodeGen/PHIElimination.cpp
index 34a9d5d0e401f..65423bd686eff 100644
--- a/llvm/lib/CodeGen/PHIElimination.cpp
+++ b/llvm/lib/CodeGen/PHIElimination.cpp
@@ -380,7 +380,7 @@ void PHIEliminationImpl::LowerPHINode(MachineBasicBlock &MBB,
     Register *Entry = nullptr;
     if (AllEdgesCritical)
       Entry = &LoweredPHIs[MPhi];
-    if (Entry && *Entry) {
+    if (Entry && *Entry && TII->allowPHIReuse(*Entry, MPhi, MF)) {
       // An identical PHI node was already lowered. Reuse the incoming register.
       IncomingReg = *Entry;
       reusedIncoming = true;

Add PHI reuse hook so that a target can decide if temporary register
during PHI node elimination can be reused. By default, two phis are
allowed to use the same temporary register if their right-hand-sides
are the same.

However, the left-hand sides of phis need to be checked as well
for some targets, such as a GPU target. If the register banks of
phis's left-hand sides are diffrerent, reuse is not allowed.
Thus, this change adds a target hook for this kind of checking
as it is target-dependent.

This change has no functional change.

This is to resolve the issue:
  llvm#163500
@jgu222
Copy link
Contributor Author

jgu222 commented Oct 15, 2025

@paperchalice @guy-david Could you help review this ?

@jgu222
Copy link
Contributor Author

jgu222 commented Oct 15, 2025

@arsenm Could you help review it ?

@guy-david
Copy link
Contributor

guy-david commented Oct 16, 2025

I don't think this is the correct approach, but correct me if I'm mistaken.
The underlying assumption of reusing the same virtual destination register is that in the worst case, values will be copied around (different register classes/banks, allocation). In the presented issue, it seems like there's an actual semantic difference between what the virtual registers represent in each block, depending on which thread is executing? This sounds more like a modeling issue rather than a problem with this specific pass.
Maybe verifying that the register classes overlap can be a workaround in this specific case, but it seems like more issues will pop along the way.

@jgu222
Copy link
Contributor Author

jgu222 commented Oct 16, 2025

I don't think this is the correct approach, but correct me if I'm mistaken. The underlying assumption of reusing the same virtual destination register is that in the worst case, values will be copied around (different register classes/banks, allocation). In the presented issue, it seems like there's an actual semantic difference between what the virtual registers represent in each block, depending on which threads is executing? This sounds more like a modeling issue rather than a problem with this specific pass. Maybe verifying that the register classes overlap can be a workaround in this specific case, but it seems like more issues will pop along the way.

Thought about using register class to check if reuse is okay, but give up as it is really up to a target. It is possible that two rhs-identical PHIs have their lhs in different register class, and those two different rc could be interchangeable in one target, but not allowed in another.

The correctness here is up to how convergence is enforced for a gpu target. When favoring backedge for a loop, this exposes this issue. I feel this is better to be decided by a target.

@jgu222
Copy link
Contributor Author

jgu222 commented Oct 24, 2025

This sounds more like a modeling issue rather than a problem with this specific pass.
Not sure about modeling. But if so, what would be a solution ? Could you elaborate on that ?

To me, the phi node elimination assumes that two phis that have identical right-hand sides can use the same virtual temp. This is true for cpu, but not true for gpu due to gpu code's complicate divergence/convergence scheme.

If not this one, what approach is more appropriate ? Thanks

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too specific and has no test coverage. There are existing methods for checking if registers share a register bank or class

@arsenm
Copy link
Contributor

arsenm commented Oct 24, 2025

To me, the phi node elimination assumes that two phis that have identical right-hand sides can use the same virtual temp. This is true for cpu, but not true for gpu due to gpu code's complicate divergence/convergence scheme.

This is a modeling issue

@jgu222
Copy link
Contributor Author

jgu222 commented Oct 24, 2025

This is too specific and has no test coverage. There are existing methods for checking if registers share a register bank or class

Kind of agree that this is too specific. Using register class works for me, although I feel this checking is a stronger condition. But I can live with it.

How is the following change ?

diff --git a/llvm/lib/CodeGen/PHIElimination.cpp b/llvm/lib/CodeGen/PHIElimination.cpp
index 34a9d5d0e401..5f08aa120414 100644
--- a/llvm/lib/CodeGen/PHIElimination.cpp
+++ b/llvm/lib/CodeGen/PHIElimination.cpp
@@ -377,10 +377,17 @@ void PHIEliminationImpl::LowerPHINode(MachineBasicBlock &MBB,
     // typically those created by tail duplication. Typically, an identical PHI
     // node can't occur, so avoid hashing/storing such PHIs, which is somewhat
     // expensive.
+    auto canReuse = [&MF](Register Reg0, Register Reg1) {
+      auto &MRI = MF.getRegInfo();
+      auto *RC0 = MRI.getRegClass(Reg0);
+      auto *RC1 = MRI.getRegClass(Reg1);
+      return RC0 && RC1 && RC0 == RC1;
+    };
+
     Register *Entry = nullptr;
     if (AllEdgesCritical)
       Entry = &LoweredPHIs[MPhi];
-    if (Entry && *Entry) {
+    if (Entry && *Entry && canReuse(*Entry, DestReg)) {
       // An identical PHI node was already lowered. Reuse the incoming register.
       IncomingReg = *Entry;
       reusedIncoming = true;

@arsenm
Copy link
Contributor

arsenm commented Oct 24, 2025

Is this the problem solved by SIOptimizeVGPRLiveRange?

You generally want to work in terms of common subclasses than exact class checks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants