Commit 3b98beb

[AMDGPU] LRO: allow same-BB non-lookthrough users for PHI
Loop headers frequently consume the loop-carried value in the header block via non-lookthrough ops (e.g. byte-wise vector binops). LRO’s same-BB filter currently prunes these users, so the loop-carried PHI is not coerced to i32 and the intended packed form is lost. Relax the filter: when the def is a PHI, allow same-BB non-lookthrough users. Also fix the check to look at the user (CII) rather than the def (II) so the walk does not terminate prematurely.
1 parent 2d1f9c9 commit 3b98beb

1 file changed, +4 -1 lines changed
llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp

Lines changed: 4 additions & 1 deletion
@@ -150,7 +150,10 @@ class LiveRegOptimizer {
         if (!CVisited.insert(CII).second)
           continue;
 
-        if (CII->getParent() == II->getParent() && !IsLookThru(II))
+        // Same-BB filter must look at the *user*; and allow non-lookthrough
+        // users when the def is a PHI (loop-header pattern).
+        if (CII->getParent() == II->getParent() && !IsLookThru(CII) &&
+            !isa<PHINode>(II))
           continue;
 
         if (isOpLegal(CII))
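
To make the effect of the changed condition easier to check, here is a minimal standalone sketch of the two predicates as plain boolean functions. It is not LLVM code: `UseFacts`, `skippedBefore`, `skippedAfter`, and `checkExamples` are hypothetical names introduced only for illustration, and the results of `IsLookThru` and `isa<PHINode>` are modeled as opaque boolean inputs rather than computed from real instructions.

```cpp
#include <cassert>

// Hypothetical stand-in for the facts the same-BB filter inspects.
// II is the def being walked, CII is one of its users (names from the diff).
struct UseFacts {
  bool SameBlock;      // models CII->getParent() == II->getParent()
  bool DefIsLookThru;  // models IsLookThru(II)
  bool UserIsLookThru; // models IsLookThru(CII)
  bool DefIsPHI;       // models isa<PHINode>(II)
};

// Condition before the commit: the look-through test is applied to the def.
bool skippedBefore(const UseFacts &F) {
  return F.SameBlock && !F.DefIsLookThru;
}

// Condition after the commit: the look-through test is applied to the user,
// and a PHI def never causes a same-BB user to be skipped.
bool skippedAfter(const UseFacts &F) {
  return F.SameBlock && !F.UserIsLookThru && !F.DefIsPHI;
}

// A few rows of the truth table, written as assertions.
void checkExamples() {
  // Loop-header pattern: PHI def, non-lookthrough user in the same block.
  UseFacts HeaderUse{true, true, false, true};
  assert(!skippedAfter(HeaderUse));

  // Non-PHI def, non-lookthrough user in the same block: still skipped.
  UseFacts PlainUse{true, false, false, false};
  assert(skippedBefore(PlainUse));
  assert(skippedAfter(PlainUse));

  // Same block, look-through user, non-lookthrough non-PHI def:
  // the old check (on the def) skips it, the new check (on the user) keeps it.
  UseFacts LookThruUser{true, false, true, false};
  assert(skippedBefore(LookThruUser));
  assert(!skippedAfter(LookThruUser));

  // Users in a different block are never skipped by this filter.
  UseFacts OtherBlock{false, false, false, false};
  assert(!skippedBefore(OtherBlock));
  assert(!skippedAfter(OtherBlock));
}
```

Calling `checkExamples()` from a test driver exercises all four rows. The third row is the one that isolates the `II` to `CII` fix; the first row isolates the PHI exemption.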
