@michaelselehov
Contributor

Loop headers frequently consume the loop-carried value in the header block via non-lookthrough ops (e.g. byte-wise vector binops). LiveRegOptimizer’s same-BB filter currently prunes these users, so the loop-carried PHI is not coerced to i32 and the intended packed form is lost.

Relax the filter: when the def is a PHI, allow same-BB non-lookthrough users. Also fix the check to look at the user (CII) rather than the def (II) so the walk does not terminate prematurely.

@llvmbot
Member

llvmbot commented Sep 26, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: None (michaelselehov)

Full diff: https://github.com/llvm/llvm-project/pull/160909.diff

1 File Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp (+4-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
index 38718c43a61dd..7504f1a8cea09 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
@@ -150,7 +150,10 @@ class LiveRegOptimizer {
       if (!CVisited.insert(CII).second)
         continue;
 
-      if (CII->getParent() == II->getParent() && !IsLookThru(II))
+      // Same-BB filter must look at the *user*; and allow non-lookthrough
+      // users when the def is a PHI (loop-header pattern).
+      if (CII->getParent() == II->getParent() && !IsLookThru(CII) &&
+          !isa<PHINode>(II))
         continue;
 
       if (isOpLegal(CII))
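
To make the effect of the changed condition easier to see, here is a minimal standalone sketch that models the old and new same-BB filters as plain boolean predicates. It is not the LLVM source: `UseFacts`, `skippedBefore`, and `skippedAfter` are hypothetical names introduced only for illustration, and each field stands in for the expression from the diff noted in its comment.

```cpp
// Standalone model of the same-BB filter; this is NOT the LLVM source.
// UseFacts, skippedBefore and skippedAfter are hypothetical names.
struct UseFacts {
  bool SameBlock;      // CII->getParent() == II->getParent()
  bool DefIsLookThru;  // IsLookThru(II)
  bool UserIsLookThru; // IsLookThru(CII)
  bool DefIsPHI;       // isa<PHINode>(II)
};

// Condition before the patch. Returns true when the user is skipped.
// Only the def is inspected, so the user's kind never matters.
constexpr bool skippedBefore(const UseFacts &F) {
  return F.SameBlock && !F.DefIsLookThru;
}

// Condition after the patch. Returns true when the user is skipped.
// The user is inspected, and a PHI def exempts its same-block users.
constexpr bool skippedAfter(const UseFacts &F) {
  return F.SameBlock && !F.UserIsLookThru && !F.DefIsPHI;
}

// Loop-header pattern from the description: a PHI def with a same-block,
// non-lookthrough user. DefIsLookThru is not consulted by the new
// condition, so its value here is arbitrary.
constexpr UseFacts HeaderUse{true, false, false, true};
static_assert(!skippedAfter(HeaderUse),
              "same-block non-lookthrough user of a PHI is kept");
```

In this model, a same-block non-lookthrough user is still skipped when the def is not a PHI, and users in other blocks are never affected by either version of the condition.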

@arsenm requested a review from jrbyrnes on September 26, 2025 15:35
Contributor

should have test

Contributor Author

Added a minimal IR test (lro-phi-samebb-nonlookthrough-store.ll) that checks that the same-BB filter looks at the user and allows non-lookthrough users when the def is a PHI.

@michaelselehov force-pushed the amdgpu-lro-phi-same-bb-fix branch from fadbeb9 to 70d9547 on September 29, 2025 10:24
@michaelselehov force-pushed the amdgpu-lro-phi-same-bb-fix branch from 70d9547 to 08d6d8e on September 29, 2025 12:40
@arsenm merged commit 617854f into llvm:main on Sep 29, 2025
9 checks passed
@michaelselehov deleted the amdgpu-lro-phi-same-bb-fix branch on September 29, 2025 16:22
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025