Skip to content

Conversation

@haonanya1
Copy link
Contributor

No description provided.

@haonanya1 haonanya1 requested a review from nikic as a code owner May 6, 2025 09:07
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels May 6, 2025
@llvmbot
Copy link
Member

llvmbot commented May 6, 2025

@llvm/pr-subscribers-llvm-transforms

Author: haonan (haonanya1)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/138661.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp (+3-1)
  • (added) llvm/test/Transforms/InstCombine/fold-phi-arg-gep-to-phi-negative.ll (+35)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 80308bf92dbbc..d29d5f21a3baa 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -575,7 +575,9 @@ Instruction *InstCombinerImpl::foldPHIArgGEPIntoPHI(PHINode &PN) {
       // variable index could pessimize the path.  This also handles the case
       // for struct indices, which must always be constant.
       if (isa<ConstantInt>(FirstInst->getOperand(Op)) ||
-          isa<ConstantInt>(GEP->getOperand(Op)))
+          isa<ConstantInt>(GEP->getOperand(Op)) ||
+          isa<ConstantDataVector>(FirstInst->getOperand(Op)) ||
+          isa<ConstantDataVector>(GEP->getOperand(Op)))
         return nullptr;
 
       if (FirstInst->getOperand(Op)->getType() !=
diff --git a/llvm/test/Transforms/InstCombine/fold-phi-arg-gep-to-phi-negative.ll b/llvm/test/Transforms/InstCombine/fold-phi-arg-gep-to-phi-negative.ll
new file mode 100644
index 0000000000000..1716321c0b4e7
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/fold-phi-arg-gep-to-phi-negative.ll
@@ -0,0 +1,35 @@
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+%vec = type { %vec_base }
+%vec_base = type { [4 x float] }
+%foo = type { %vec, %vec}
+
+define void @test(i1 %tobool, ptr addrspace(1) %add.ptr.i) {
+entry:
+  %lane.0 = alloca %foo, align 16
+  %lane.15 = insertelement <16 x ptr> undef, ptr %lane.0, i64 0
+  %mm_vectorGEP = getelementptr inbounds %foo, <16 x ptr> %lane.15, <16 x i64> zeroinitializer, <16 x i32> splat (i32 1), <16 x i32> zeroinitializer, <16 x i32> zeroinitializer, <16 x i64> splat (i64 1)
+  %mm_vectorGEP2 = getelementptr inbounds %foo, <16 x ptr> %lane.15, <16 x i64> zeroinitializer, <16 x i32> zeroinitializer, <16 x i32> zeroinitializer, <16 x i32> zeroinitializer, <16 x i64> splat (i64 1)
+  br i1 %tobool, label %f1, label %f0
+
+f0:
+; CHECK: f0:
+; CHECK-NEXT: %mm_vectorGEP = getelementptr inbounds %foo, <16 x ptr> %lane.15, <16 x i64> zeroinitializer, <16 x i32> splat (i32 1), <16 x i32> zeroinitializer, <16 x i32> zeroinitializer, <16 x i64> splat (i64 1)
+  br label %merge
+
+f1:
+; CHECK: f1:
+; CHECK-NEXT: %mm_vectorGEP2 = getelementptr inbounds %foo, <16 x ptr> %lane.15, <16 x i64> zeroinitializer, <16 x i32> zeroinitializer, <16 x i32> zeroinitializer, <16 x i32> zeroinitializer, <16 x i64> splat (i64 1)
+  br label %merge
+
+merge:
+; CHECK: merge:
+; CHECK-NEXT: %vec.phi14 = phi <16 x ptr> [ %mm_vectorGEP, %f0 ], [ %mm_vectorGEP2, %f1 ]
+  %vec.phi14 = phi <16 x ptr> [ %mm_vectorGEP, %f0], [ %mm_vectorGEP2, %f1 ]
+  %wide.masked.gather15 = call <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> %vec.phi14, i32 4, <16 x i1> splat (i1 true), <16 x float> poison)
+  %wide.masked.gather15.extract.15. = extractelement <16 x float> %wide.masked.gather15, i32 15
+  store float %wide.masked.gather15.extract.15., ptr addrspace(1) %add.ptr.i, align 4
+  ret void
+}
+
+declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr>, i32 immarg, <16 x i1>, <16 x float>)

@haonanya1
Copy link
Contributor Author

@nikic , the lit crash without the pr, can you please take a look? Thanks very much.

@github-actions
Copy link

github-actions bot commented May 6, 2025

✅ With the latest revision this PR passed the undef deprecator.

@haonanya1 haonanya1 force-pushed the foldPHIArgGEPIntoPHI branch from 6016775 to 8860620 Compare May 7, 2025 02:41
@haonanya1 haonanya1 requested a review from nikic May 7, 2025 02:45
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


%vec = type { %vec_base }
%vec_base = type { [4 x float] }
%foo = type { %vec, %vec}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to simplify the GEP type a bit? I don't think you need all of these levels of nesting. All you really need is a struct index using a vector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks.


merge:
%vec.phi = phi <16 x ptr> [ %mm_vectorGEP, %f0], [ %mm_vectorGEP2, %f1 ]
store <16 x ptr> %vec.phi, ptr addrspace(1) %add.ptr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can replace the store with a return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ended up a bit too reduced now and no longer crashes without the change. This variant works:

%foo = type { i16, i16, i16 }

define <16 x ptr> @test(i1 %tobool) {
entry:
  %lane.0 = alloca %foo, align 16
  %lane.15 = insertelement <16 x ptr> poison, ptr %lane.0, i64 0
  %mm_vectorGEP = getelementptr inbounds %foo, <16 x ptr> %lane.15, <16 x i64> zeroinitializer, <16 x i32> splat (i32 1)
  %mm_vectorGEP2 = getelementptr inbounds %foo, <16 x ptr> %lane.15, <16 x i64> zeroinitializer, <16 x i32> splat (i32 2)
  br i1 %tobool, label %f1, label %f0

f0:
  br label %merge

f1:
  br label %merge

merge:
  %vec.phi = phi <16 x ptr> [ %mm_vectorGEP, %f0], [ %mm_vectorGEP2, %f1 ]
  ret <16 x ptr> %vec.phi
}

(This makes sure the second GEP is not optimized away.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience, updated.

@nikic nikic merged commit 5b8664f into llvm:main May 8, 2025
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants