[SCEV] Try to re-use pointer LCSSA phis when expanding SCEVs. #147824

Merged
merged 5 commits into from
Jul 25, 2025

fhahn
Contributor

@fhahn fhahn commented Jul 9, 2025

Generalize the code added in
#147214 to also support
re-using pointer LCSSA phis when expanding integer SCEVs with AddRecs.

A common source of integer AddRecs with pointer bases is the runtime checks
emitted by LV based on the distance between 2 pointer AddRecs.

This improves codegen in some cases when vectorizing and prevents
regressions with #142309, which turns some phis into single-entry ones.
SCEV will now look through those phis (and expand the whole AddRec),
whereas before it had to treat the LCSSA phi as SCEVUnknown.

Compile-time impact neutral: https://llvm-compile-time-tracker.com/compare.php?from=fd5fc76c91538871771be2c3be2ca3a5f2dcac31&to=ca5fc2b3d8e6efc09f1624a17fdbfbe909f14eb4&stat=instructions:u

@llvmbot
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Full diff: https://github.com/llvm/llvm-project/pull/147824.diff

5 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h (+1)
  • (modified) llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp (+38)
  • (modified) llvm/test/Transforms/LoopLoadElim/invalidate-laa-after-versioning.ll (+4-7)
  • (modified) llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll (+9-15)
  • (modified) llvm/test/Transforms/LoopVersioning/invalidate-laa-after-versioning.ll (+6-9)
diff --git a/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h b/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
index a101151eed7cc..39fef921a9590 100644
--- a/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
+++ b/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
@@ -530,6 +530,7 @@ class SCEVExpander : public SCEVVisitor<SCEVExpander, Value *> {
 
   bool isExpandedAddRecExprPHI(PHINode *PN, Instruction *IncV, const Loop *L);
 
+  Value *tryToReuseLCSSAPhi(const SCEVAddRecExpr *S);
   Value *expandAddRecExprLiterally(const SCEVAddRecExpr *);
   PHINode *getAddRecExprPHILiterally(const SCEVAddRecExpr *Normalized,
                                      const Loop *L, Type *&TruncTy,
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index 24fe08d6c3e4e..ca9183b91a19b 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -1223,6 +1223,39 @@ Value *SCEVExpander::expandAddRecExprLiterally(const SCEVAddRecExpr *S) {
   return Result;
 }
 
+Value *SCEVExpander::tryToReuseLCSSAPhi(const SCEVAddRecExpr *S) {
+  Type *STy = S->getType();
+  const Loop *L = S->getLoop();
+  BasicBlock *EB = L->getExitBlock();
+  if (!EB || !EB->getSinglePredecessor() ||
+      !SE.DT.dominates(EB, Builder.GetInsertBlock()))
+    return nullptr;
+
+  for (auto &PN : EB->phis()) {
+    if (!SE.isSCEVable(PN.getType()))
+      continue;
+    auto *ExitSCEV = SE.getSCEV(&PN);
+    Type *PhiTy = PN.getType();
+    if (STy->isIntegerTy() && PhiTy->isPointerTy())
+      ExitSCEV= SE.getPtrToIntExpr(ExitSCEV, STy);
+    else if (S->getType() != PN.getType())
+      continue;
+    const SCEV *Diff = SE.getMinusSCEV(S, ExitSCEV);
+    if (isa<SCEVCouldNotCompute>(Diff) ||
+        SCEVExprContains(Diff,
+                         [](const SCEV *S) { return isa<SCEVAddRecExpr>(S); }))
+      continue;
+
+    Value *DiffV = expand(Diff);
+    Value *BaseV = &PN;
+    if (DiffV->getType()->isIntegerTy() && PhiTy->isPointerTy())
+      BaseV = Builder.CreatePtrToInt(BaseV, DiffV->getType());
+    return Builder.CreateAdd(BaseV, DiffV);
+  }
+
+  return nullptr;
+}
+
 Value *SCEVExpander::visitAddRecExpr(const SCEVAddRecExpr *S) {
   // In canonical mode we compute the addrec as an expression of a canonical IV
   // using evaluateAtIteration and expand the resulting SCEV expression. This
@@ -1262,6 +1295,11 @@ Value *SCEVExpander::visitAddRecExpr(const SCEVAddRecExpr *S) {
     return V;
   }
 
+  // If there S is expanded outside the defining loop, check if there is a
+  // matching LCSSA phi node for it.
+  if (Value *V = tryToReuseLCSSAPhi(S))
+    return V;
+
   // {X,+,F} --> X + {0,+,F}
   if (!S->getStart()->isZero()) {
     if (isa<PointerType>(S->getType())) {
diff --git a/llvm/test/Transforms/LoopLoadElim/invalidate-laa-after-versioning.ll b/llvm/test/Transforms/LoopLoadElim/invalidate-laa-after-versioning.ll
index 10e10653a431d..3ad262bb20910 100644
--- a/llvm/test/Transforms/LoopLoadElim/invalidate-laa-after-versioning.ll
+++ b/llvm/test/Transforms/LoopLoadElim/invalidate-laa-after-versioning.ll
@@ -59,19 +59,16 @@ define void @test(ptr %arg, i64 %arg1) {
 ; CHECK-NEXT:    [[GEP_5:%.*]] = getelementptr inbounds double, ptr [[LCSSA_PTR_IV_1]], i64 1
 ; CHECK-NEXT:    br label [[INNER_2:%.*]]
 ; CHECK:       inner.2:
-; CHECK-NEXT:    [[INDVAR:%.*]] = phi i64 [ [[INDVAR_NEXT:%.*]], [[INNER_2]] ], [ 0, [[INNER_1_EXIT]] ]
 ; CHECK-NEXT:    [[PTR_IV_2:%.*]] = phi ptr [ [[GEP_5]], [[INNER_1_EXIT]] ], [ [[PTR_IV_2_NEXT:%.*]], [[INNER_2]] ]
 ; CHECK-NEXT:    [[PTR_IV_2_NEXT]] = getelementptr inbounds double, ptr [[PTR_IV_2]], i64 1
-; CHECK-NEXT:    [[INDVAR_NEXT]] = add i64 [[INDVAR]], 1
 ; CHECK-NEXT:    br i1 false, label [[INNER_3_LVER_CHECK:%.*]], label [[INNER_2]]
 ; CHECK:       inner.3.lver.check:
-; CHECK-NEXT:    [[INDVAR_LCSSA:%.*]] = phi i64 [ [[INDVAR]], [[INNER_2]] ]
 ; CHECK-NEXT:    [[LCSSA_PTR_IV_2:%.*]] = phi ptr [ [[PTR_IV_2]], [[INNER_2]] ]
 ; CHECK-NEXT:    [[GEP_6:%.*]] = getelementptr inbounds double, ptr [[PTR_PHI]], i64 1
 ; CHECK-NEXT:    [[GEP_7:%.*]] = getelementptr inbounds double, ptr [[LCSSA_PTR_IV_2]], i64 1
-; CHECK-NEXT:    [[TMP0:%.*]] = shl i64 [[INDVAR_LCSSA]], 3
-; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[TMP0]], 24
-; CHECK-NEXT:    [[SCEVGEP3:%.*]] = getelementptr i8, ptr [[LCSSA_PTR_IV_1]], i64 [[TMP1]]
+; CHECK-NEXT:    [[TMP0:%.*]] = ptrtoint ptr [[LCSSA_PTR_IV_2]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[TMP0]], 16
+; CHECK-NEXT:    [[SCEVGEP3:%.*]] = inttoptr i64 [[TMP1]] to ptr
 ; CHECK-NEXT:    [[BOUND0:%.*]] = icmp ult ptr [[GEP_7]], [[GEP_1]]
 ; CHECK-NEXT:    [[BOUND1:%.*]] = icmp ult ptr [[PTR_PHI]], [[SCEVGEP3]]
 ; CHECK-NEXT:    [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
@@ -104,7 +101,7 @@ define void @test(ptr %arg, i64 %arg1) {
 ; CHECK-NEXT:    br i1 [[C_2]], label [[OUTER_LATCH_LOOPEXIT4:%.*]], label [[INNER_3]]
 ; CHECK:       outer.latch.loopexit:
 ; CHECK-NEXT:    br label [[OUTER_LATCH]]
-; CHECK:       outer.latch.loopexit4:
+; CHECK:       outer.latch.loopexit3:
 ; CHECK-NEXT:    br label [[OUTER_LATCH]]
 ; CHECK:       outer.latch:
 ; CHECK-NEXT:    br label [[INNER_1_LVER_CHECK]]
diff --git a/llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll b/llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll
index 2747895f06a7b..cd2e5dd8055f2 100644
--- a/llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll
+++ b/llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll
@@ -18,11 +18,9 @@ define void @reuse_lcssa_phi_for_add_rec1(ptr %head) {
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw i64 [[IV]], 1
 ; CHECK-NEXT:    br i1 [[EC_1]], label %[[PH:.*]], label %[[LOOP_1]]
 ; CHECK:       [[PH]]:
-; CHECK-NEXT:    [[IV_2_LCSSA:%.*]] = phi i32 [ [[IV_2]], %[[LOOP_1]] ]
 ; CHECK-NEXT:    [[IV_LCSSA:%.*]] = phi i64 [ [[IV]], %[[LOOP_1]] ]
-; CHECK-NEXT:    [[IV_2_NEXT_LCSSA:%.*]] = phi i32 [ [[IV_2_NEXT]], %[[LOOP_1]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = phi i32 [ [[IV_2_NEXT]], %[[LOOP_1]] ]
 ; CHECK-NEXT:    [[SRC_2:%.*]] = tail call noalias noundef dereferenceable_or_null(8) ptr @calloc(i64 1, i64 8)
-; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[IV_2_LCSSA]], 1
 ; CHECK-NEXT:    [[SMIN:%.*]] = call i32 @llvm.smin.i32(i32 [[TMP0]], i32 1)
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 [[TMP0]], [[SMIN]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
@@ -106,27 +104,23 @@ define void @runtime_checks_ptr_inductions(ptr %dst.1, ptr %dst.2, i1 %c) {
 ; CHECK-LABEL: define void @runtime_checks_ptr_inductions(
 ; CHECK-SAME: ptr [[DST_1:%.*]], ptr [[DST_2:%.*]], i1 [[C:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
-; CHECK-NEXT:    [[DST_11:%.*]] = ptrtoint ptr [[DST_1]] to i64
 ; CHECK-NEXT:    br label %[[LOOP_1:.*]]
 ; CHECK:       [[LOOP_1]]:
-; CHECK-NEXT:    [[INDVAR:%.*]] = phi i64 [ [[INDVAR_NEXT:%.*]], %[[LOOP_1]] ], [ 0, %[[ENTRY]] ]
 ; CHECK-NEXT:    [[PTR_IV_1:%.*]] = phi ptr [ [[DST_1]], %[[ENTRY]] ], [ [[PTR_IV_1_NEXT:%.*]], %[[LOOP_1]] ]
 ; CHECK-NEXT:    [[CALL:%.*]] = call i32 @val()
 ; CHECK-NEXT:    [[SEL_DST:%.*]] = select i1 [[C]], ptr [[DST_1]], ptr [[DST_2]]
 ; CHECK-NEXT:    [[PTR_IV_1_NEXT]] = getelementptr i8, ptr [[PTR_IV_1]], i64 1
 ; CHECK-NEXT:    [[EC_1:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT:    [[INDVAR_NEXT]] = add i64 [[INDVAR]], 1
 ; CHECK-NEXT:    br i1 [[EC_1]], label %[[LOOP_2_HEADER_PREHEADER:.*]], label %[[LOOP_1]]
 ; CHECK:       [[LOOP_2_HEADER_PREHEADER]]:
-; CHECK-NEXT:    [[SEL_DST_LCSSA2:%.*]] = phi ptr [ [[SEL_DST]], %[[LOOP_1]] ]
-; CHECK-NEXT:    [[INDVAR_LCSSA:%.*]] = phi i64 [ [[INDVAR]], %[[LOOP_1]] ]
+; CHECK-NEXT:    [[SEL_DST_LCSSA1:%.*]] = phi ptr [ [[SEL_DST]], %[[LOOP_1]] ]
 ; CHECK-NEXT:    [[PTR_IV_1_LCSSA:%.*]] = phi ptr [ [[PTR_IV_1]], %[[LOOP_1]] ]
 ; CHECK-NEXT:    [[SEL_DST_LCSSA:%.*]] = phi ptr [ [[SEL_DST]], %[[LOOP_1]] ]
-; CHECK-NEXT:    [[SEL_DST_LCSSA23:%.*]] = ptrtoint ptr [[SEL_DST_LCSSA2]] to i64
+; CHECK-NEXT:    [[SEL_DST_LCSSA12:%.*]] = ptrtoint ptr [[SEL_DST_LCSSA1]] to i64
 ; CHECK-NEXT:    br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_MEMCHECK:.*]]
 ; CHECK:       [[VECTOR_MEMCHECK]]:
-; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDVAR_LCSSA]], [[DST_11]]
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], [[SEL_DST_LCSSA23]]
+; CHECK-NEXT:    [[TMP0:%.*]] = ptrtoint ptr [[PTR_IV_1_LCSSA]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], [[SEL_DST_LCSSA12]]
 ; CHECK-NEXT:    [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP1]], 2
 ; CHECK-NEXT:    br i1 [[DIFF_CHECK]], label %[[SCALAR_PH]], label %[[VECTOR_PH:.*]]
 ; CHECK:       [[VECTOR_PH]]:
@@ -148,13 +142,13 @@ define void @runtime_checks_ptr_inductions(ptr %dst.1, ptr %dst.2, i1 %c) {
 ; CHECK-NEXT:    br label %[[SCALAR_PH]]
 ; CHECK:       [[SCALAR_PH]]:
 ; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i32 [ 1023, %[[MIDDLE_BLOCK]] ], [ 1, %[[LOOP_2_HEADER_PREHEADER]] ], [ 1, %[[VECTOR_MEMCHECK]] ]
-; CHECK-NEXT:    [[BC_RESUME_VAL5:%.*]] = phi ptr [ [[TMP2]], %[[MIDDLE_BLOCK]] ], [ [[PTR_IV_1_LCSSA]], %[[LOOP_2_HEADER_PREHEADER]] ], [ [[PTR_IV_1_LCSSA]], %[[VECTOR_MEMCHECK]] ]
-; CHECK-NEXT:    [[BC_RESUME_VAL6:%.*]] = phi ptr [ [[TMP3]], %[[MIDDLE_BLOCK]] ], [ [[SEL_DST_LCSSA]], %[[LOOP_2_HEADER_PREHEADER]] ], [ [[SEL_DST_LCSSA]], %[[VECTOR_MEMCHECK]] ]
+; CHECK-NEXT:    [[BC_RESUME_VAL4:%.*]] = phi ptr [ [[TMP2]], %[[MIDDLE_BLOCK]] ], [ [[PTR_IV_1_LCSSA]], %[[LOOP_2_HEADER_PREHEADER]] ], [ [[PTR_IV_1_LCSSA]], %[[VECTOR_MEMCHECK]] ]
+; CHECK-NEXT:    [[BC_RESUME_VAL5:%.*]] = phi ptr [ [[TMP3]], %[[MIDDLE_BLOCK]] ], [ [[SEL_DST_LCSSA]], %[[LOOP_2_HEADER_PREHEADER]] ], [ [[SEL_DST_LCSSA]], %[[VECTOR_MEMCHECK]] ]
 ; CHECK-NEXT:    br label %[[LOOP_2_HEADER:.*]]
 ; CHECK:       [[LOOP_2_HEADER]]:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[DEC7:%.*]], %[[LOOP_2_LATCH:.*]] ], [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ]
-; CHECK-NEXT:    [[PTR_IV_2:%.*]] = phi ptr [ [[PTR_IV_2_NEXT:%.*]], %[[LOOP_2_LATCH]] ], [ [[BC_RESUME_VAL5]], %[[SCALAR_PH]] ]
-; CHECK-NEXT:    [[PTR_IV_3:%.*]] = phi ptr [ [[PTR_IV_3_NEXT:%.*]], %[[LOOP_2_LATCH]] ], [ [[BC_RESUME_VAL6]], %[[SCALAR_PH]] ]
+; CHECK-NEXT:    [[PTR_IV_2:%.*]] = phi ptr [ [[PTR_IV_2_NEXT:%.*]], %[[LOOP_2_LATCH]] ], [ [[BC_RESUME_VAL4]], %[[SCALAR_PH]] ]
+; CHECK-NEXT:    [[PTR_IV_3:%.*]] = phi ptr [ [[PTR_IV_3_NEXT:%.*]], %[[LOOP_2_LATCH]] ], [ [[BC_RESUME_VAL5]], %[[SCALAR_PH]] ]
 ; CHECK-NEXT:    [[EC_2:%.*]] = icmp eq i32 [[IV]], 1024
 ; CHECK-NEXT:    br i1 [[EC_2]], label %[[EXIT:.*]], label %[[LOOP_2_LATCH]]
 ; CHECK:       [[LOOP_2_LATCH]]:
diff --git a/llvm/test/Transforms/LoopVersioning/invalidate-laa-after-versioning.ll b/llvm/test/Transforms/LoopVersioning/invalidate-laa-after-versioning.ll
index 8075314a65b49..858864276c0a0 100644
--- a/llvm/test/Transforms/LoopVersioning/invalidate-laa-after-versioning.ll
+++ b/llvm/test/Transforms/LoopVersioning/invalidate-laa-after-versioning.ll
@@ -56,19 +56,16 @@ define void @test(ptr %arg, i64 %arg1) {
 ; CHECK-NEXT:    [[GEP_5:%.*]] = getelementptr inbounds double, ptr [[LCSSA_PTR_IV_1]], i64 1
 ; CHECK-NEXT:    br label [[INNER_2:%.*]]
 ; CHECK:       inner.2:
-; CHECK-NEXT:    [[INDVAR:%.*]] = phi i64 [ [[INDVAR_NEXT:%.*]], [[INNER_2]] ], [ 0, [[INNER_1_EXIT]] ]
 ; CHECK-NEXT:    [[PTR_IV_2:%.*]] = phi ptr [ [[GEP_5]], [[INNER_1_EXIT]] ], [ [[PTR_IV_2_NEXT:%.*]], [[INNER_2]] ]
 ; CHECK-NEXT:    [[PTR_IV_2_NEXT]] = getelementptr inbounds double, ptr [[PTR_IV_2]], i64 1
-; CHECK-NEXT:    [[INDVAR_NEXT]] = add i64 [[INDVAR]], 1
 ; CHECK-NEXT:    br i1 false, label [[INNER_3_LVER_CHECK:%.*]], label [[INNER_2]]
 ; CHECK:       inner.3.lver.check:
-; CHECK-NEXT:    [[INDVAR_LCSSA:%.*]] = phi i64 [ [[INDVAR]], [[INNER_2]] ]
 ; CHECK-NEXT:    [[LCSSA_PTR_IV_2:%.*]] = phi ptr [ [[PTR_IV_2]], [[INNER_2]] ]
 ; CHECK-NEXT:    [[GEP_6:%.*]] = getelementptr inbounds double, ptr [[PTR_PHI]], i64 1
 ; CHECK-NEXT:    [[GEP_7:%.*]] = getelementptr inbounds double, ptr [[LCSSA_PTR_IV_2]], i64 1
-; CHECK-NEXT:    [[TMP0:%.*]] = shl i64 [[INDVAR_LCSSA]], 3
-; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[TMP0]], 24
-; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[LCSSA_PTR_IV_1]], i64 [[TMP1]]
+; CHECK-NEXT:    [[TMP0:%.*]] = ptrtoint ptr [[LCSSA_PTR_IV_2]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[TMP0]], 16
+; CHECK-NEXT:    [[SCEVGEP:%.*]] = inttoptr i64 [[TMP1]] to ptr
 ; CHECK-NEXT:    [[BOUND0:%.*]] = icmp ult ptr [[GEP_7]], [[GEP_1]]
 ; CHECK-NEXT:    [[BOUND1:%.*]] = icmp ult ptr [[PTR_PHI]], [[SCEVGEP]]
 ; CHECK-NEXT:    [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
@@ -90,10 +87,10 @@ define void @test(ptr %arg, i64 %arg1) {
 ; CHECK:       inner.3:
 ; CHECK-NEXT:    [[IV_2:%.*]] = phi i64 [ 0, [[INNER_3_PH]] ], [ [[IV_2_NEXT:%.*]], [[INNER_3]] ]
 ; CHECK-NEXT:    [[GEP_8:%.*]] = getelementptr inbounds double, ptr [[GEP_6]], i64 [[IV_2]]
-; CHECK-NEXT:    store double 0.000000e+00, ptr [[GEP_7]], align 8, !alias.scope !0, !noalias !3
-; CHECK-NEXT:    store double 0.000000e+00, ptr [[GEP_8]], align 8, !alias.scope !3
+; CHECK-NEXT:    store double 0.000000e+00, ptr [[GEP_7]], align 8, !alias.scope [[META0:![0-9]+]], !noalias [[META3:![0-9]+]]
+; CHECK-NEXT:    store double 0.000000e+00, ptr [[GEP_8]], align 8, !alias.scope [[META3]]
 ; CHECK-NEXT:    [[GEP_9:%.*]] = getelementptr double, ptr [[PTR_PHI]], i64 [[IV_2]]
-; CHECK-NEXT:    [[TMP18:%.*]] = load double, ptr [[GEP_9]], align 8, !alias.scope !3
+; CHECK-NEXT:    [[TMP18:%.*]] = load double, ptr [[GEP_9]], align 8, !alias.scope [[META3]]
 ; CHECK-NEXT:    [[IV_2_NEXT]] = add nuw nsw i64 [[IV_2]], 1
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp eq i64 [[IV_2]], 1
 ; CHECK-NEXT:    br i1 [[C_2]], label [[OUTER_LATCH_LOOPEXIT3:%.*]], label [[INNER_3]]

github-actions bot commented Jul 9, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn fhahn force-pushed the perf/scevexp-reuse-lcssa branch from 7386b29 to 88e170a Compare July 9, 2025 20:49
@fhahn fhahn force-pushed the perf/scevexp-reuse-lcssa branch 2 times, most recently from ca5fc2b to 4b53050 Compare July 22, 2025 11:05
Contributor Author

@fhahn fhahn left a comment

Ping, now that #147214 landed.

This already simplifies runtime checks in a number of cases and prevents
regressions with #142309, which turns some phis into single-entry ones.
SCEV will now look through those phis (and expand the whole AddRec),
whereas before it had to treat the LCSSA phi as SCEVUnknown.

    if (isa<SCEVCouldNotCompute>(Diff) ||
        SCEVExprContains(Diff,
                         [](const SCEV *S) { return isa<SCEVAddRecExpr>(S); }))
      continue;
Contributor

Can we limit this to just constant offsets (and use computeConstantDifference)? This check excludes one particularly bad case, but other complex expansions may also be non-profitable.

Contributor Author

Agreed that checking just for not containing add-recs might be a bit over-optimistic. Restricting to constant difference on the other hand would mean we miss other profitable cases.

I have now restricted this to only allow SCEVConstant/SCEVUnknown values, and PtrToInt/negations of those. This should cover all the cases I have found so far on a large test set with vectorization enabled.
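
To make the shape of that restriction concrete, here is a minimal sketch on a hypothetical mini expression tree. It is not the code from the patch and uses no LLVM API. The `Kind` names only echo the SCEV node names mentioned above, and the sketch assumes that PtrToInt and negation wrappers may be stacked on top of a constant or unknown leaf, which may be stricter or looser than what the patch accepts.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical node kinds for illustration only.
enum class Kind { Constant, Unknown, PtrToInt, Negate, Add, AddRec };

struct Expr {
  Kind kind;
  std::shared_ptr<Expr> operand; // only used by PtrToInt and Negate here
};

using ExprPtr = std::shared_ptr<Expr>;

// Build a node without operands.
inline ExprPtr leaf(Kind k) { return std::make_shared<Expr>(Expr{k, nullptr}); }

// Build a unary node wrapping `op`.
inline ExprPtr wrap(Kind k, ExprPtr op) {
  assert(op && "unary node needs an operand");
  return std::make_shared<Expr>(Expr{k, std::move(op)});
}

// Accept a Constant or Unknown leaf, optionally wrapped in PtrToInt and/or
// Negate nodes. Everything else (Add, AddRec, ...) is rejected as a
// potentially expensive expansion.
inline bool isCheapDiff(const Expr &e) {
  const Expr *cur = &e;
  while (cur->kind == Kind::PtrToInt || cur->kind == Kind::Negate) {
    if (!cur->operand)
      return false;
    cur = cur->operand.get();
  }
  return cur->kind == Kind::Constant || cur->kind == Kind::Unknown;
}
```

An allow-list like this rejects anything it does not recognise, in contrast to the earlier check that only excluded differences containing AddRecs.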

-; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[LCSSA_PTR_IV_1]], i64 [[TMP1]]
+; CHECK-NEXT:    [[TMP0:%.*]] = ptrtoint ptr [[LCSSA_PTR_IV_2]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[TMP0]], 16
+; CHECK-NEXT:    [[SCEVGEP:%.*]] = inttoptr i64 [[TMP1]] to ptr
Contributor

We really shouldn't be emitting ptrtoint+add+inttoptr. Why do we end up with this instead of a GEP? I'd have expected this to not happen as both the phi and the final result are pointers...

Contributor Author

Yep, updated to emit a GEP in that case. If the expression itself is used as an int again, doing ptrtoint + add is a bit more compact, but there's no easy way to detect this at this point.
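
The two forms under discussion compute the same address. A rough C++ analogue (an illustration only, with hypothetical helper names) is pointer arithmetic on a byte pointer versus a round trip through an integer. In LLVM IR the GEP form is generally preferred because the result stays derived from the original pointer, whereas an integer round trip hides that relationship from later passes.

```cpp
#include <cassert>
#include <cstdint>

// Analogue of `getelementptr i8, ptr %p, i64 %off`: stays in pointer form.
inline const char *addViaPointer(const char *p, std::int64_t off) {
  return p + off;
}

// Analogue of `ptrtoint` + `add` + `inttoptr`: detours through an integer.
inline const char *addViaInteger(const char *p, std::int64_t off) {
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<const char *>(bits +
                                        static_cast<std::uintptr_t>(off));
}

// Both helpers should land on the same address for an in-bounds offset.
inline bool sameAddress(const char *p, std::int64_t off) {
  assert(p != nullptr && "expects a valid base pointer");
  return addViaPointer(p, off) == addViaInteger(p, off);
}
```

The sketch relies on the usual flat-address behaviour of `reinterpret_cast` on mainstream targets; it says nothing about non-integral pointers, where the integer detour is not available.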

fhahn added a commit that referenced this pull request Jul 25, 2025
@fhahn fhahn force-pushed the perf/scevexp-reuse-lcssa branch from 4b53050 to 535fb71 Compare July 25, 2025 09:27
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 25, 2025
    Value *DiffV = expand(Diff);
    Value *BaseV = &PN;
    if (DiffV->getType()->isIntegerTy() && PhiTy->isPointerTy())
      return Builder.CreatePtrAdd(BaseV, DiffV);
Contributor

I don't fully understand how the int addrec with ptr phi case works. As far as I can tell this is going to produce a ptradd here, but doesn't the result need to be an integer? I see in the tests that there is a ptrtoint, but I don't understand where it is coming from.

Contributor Author

This was broken by the patch; the cases in the tests should be pointer AddRecs that are then converted via PtrToInt.

I added a new test case (llvm/test/Transforms/LoopIdiom/reuse-lcssa-phi-scev-expansion.ll) which has a pointer phi with an integer AddRec. This is handled by creating a PtrToInt here if S has integer type.

Contributor

Though actually, do you really need that mixed integer/pointer support? It occurred to me that this isn't going to work for non-integral pointers. I'd rather not have it if it's not critical.

Contributor Author

Unfortunately, a common case where this helps is when the phi is a pointer but the SCEV is of integer type. This is generated by LV to compute and check the distance between 2 pointers (an example is in https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll#L122).

For now we will only generate this for integral pointers, so I'm not sure if there's a good way to write a test for it. I could add a speculative continue to skip non-integral pointers?
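
For readers unfamiliar with the distance check, the pattern in the linked test is `ptrtoint`, `sub`, then `icmp ult` against a small constant, branching to the scalar loop when the comparison is true. A minimal C++ sketch of that integer computation follows; `diffCheckFails` and its parameter names are hypothetical, and the real threshold is chosen by the vectorizer.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when the two addresses are too close for the vector loop,
// i.e. when the unsigned distance a - b is below `minDistance`.
// Unsigned wraparound is intentional: if a < b the subtraction wraps to a
// very large value and the check passes.
inline bool diffCheckFails(std::uintptr_t a, std::uintptr_t b,
                           std::uintptr_t minDistance) {
  assert(minDistance > 0 && "a zero threshold would never fail");
  return a - b < minDistance;
}
```

Because the operands are plain integers, the expanded SCEV has integer type even though the values originate from pointer phis, which is the mixed case described above.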

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 25, 2025
Contributor

@nikic nikic left a comment

LGTM

fhahn added 5 commits July 25, 2025 14:08
@fhahn fhahn force-pushed the perf/scevexp-reuse-lcssa branch from 6854a9f to a91517b Compare July 25, 2025 13:12
@fhahn fhahn merged commit e21ee41 into llvm:main Jul 25, 2025
9 checks passed
@fhahn fhahn deleted the perf/scevexp-reuse-lcssa branch July 25, 2025 14:29
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 25, 2025
fhahn added a commit that referenced this pull request Jul 25, 2025
Add another test case for
#147824, where the difference
between an existing phi and the target SCEV is an add of a constant.
fhahn added a commit to fhahn/llvm-project that referenced this pull request Jul 25, 2025
Update the logic added in
llvm#147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 25, 2025
Add another test case for
llvm/llvm-project#147824, where the difference
between an existing phi and the target SCEV is an add of a constant.
@nathanchance
Member

I am seeing an assertion failure when building the Linux kernel for Hexagon after this change.

# bad: [7d7f3819e0bf3adfbd77af3c6fa454636faa274c] Revert "[OMPIRBuilder] Don't use invalid debug loc in reduction functions." (#150832)
# good: [076d3050f1f85679f505405eefd0c2cd1ad6f92b] [RISCV] Merge verifyDagOpCount into addDagOperandMapping in CompressInstEmitter. (#150548)
git bisect start '7d7f3819e0bf3adfbd77af3c6fa454636faa274c' '076d3050f1f85679f505405eefd0c2cd1ad6f92b'
# bad: [284a5c2c0b97edddf255ea210f939203ad3d09f2] [mlir][NFC] update `mlir/examples` create APIs (31/n) (#150652)
git bisect bad 284a5c2c0b97edddf255ea210f939203ad3d09f2
# bad: [5294793bdcf6ca142f7a0df897638bd4e85ed1a7] Revert "[RISCV][TTI] Enable masked interleave access for scalable vector (#149981)"
git bisect bad 5294793bdcf6ca142f7a0df897638bd4e85ed1a7
# good: [33f4582e8d128eac6b699564ecddfef5c553288e] [llvm] [Demangle] Fix a typo in the definition of DEMANGLE_ABI for dllimport
git bisect good 33f4582e8d128eac6b699564ecddfef5c553288e
# good: [b75530ff034a131da8ca1f05a00f3655c13839ff] [LoopInterchange] Consider forward/backward dependency in vectorize heuristic (#133672)
git bisect good b75530ff034a131da8ca1f05a00f3655c13839ff
# bad: [e21ee41be450f849f5247aafa07d7f4c3941bb9d] [SCEV] Try to re-use pointer LCSSA phis when expanding SCEVs. (#147824)
git bisect bad e21ee41be450f849f5247aafa07d7f4c3941bb9d
# good: [c1545b68bcba16c3d21fd3d0ee3bc4c92aa8d98f] Reapply [BranchFolding] Kill common hoisted debug instructions (#149999)
git bisect good c1545b68bcba16c3d21fd3d0ee3bc4c92aa8d98f
# good: [cdb67e11313fe3f848599922774728d2e65f7cc9] [libc++][NFC] Make __is_segmented_iterator a variable template (#149976)
git bisect good cdb67e11313fe3f848599922774728d2e65f7cc9
# good: [e4963834e44b2d41d1d6bce0c7c585a4c0b7bf86] [MemProf] Include caller clone information in dot graph nodes (#150492)
git bisect good e4963834e44b2d41d1d6bce0c7c585a4c0b7bf86
# first bad commit: [e21ee41be450f849f5247aafa07d7f4c3941bb9d] [SCEV] Try to re-use pointer LCSSA phis when expanding SCEVs. (#147824)

cvise spits out:

struct list_head {
  struct list_head *next;
} __list_add_prev_0, *cell_sort_array, *sort_cells_cell, sort_cells_tmp,
    process_thin_deferred_cells_cells;
int sort_cells_count, process_thin_deferred_cells___trans_tmp_15,
    process_thin_deferred_cells_j, process_thin_deferred_cells_count,
    process_deferred_bios_tc;
int list_empty();
int sort_cells(struct list_head *cells) {
  sort_cells_cell = ({
    void *__mptr = cells;
    __mptr;
  });
  for (; !(sort_cells_cell == 0);)
    cell_sort_array[sort_cells_count++] = sort_cells_tmp;
  return sort_cells_count;
}
void process_thin_deferred_cells() {
  do {
    process_thin_deferred_cells_count =
        sort_cells(&process_thin_deferred_cells_cells);
    if (process_thin_deferred_cells___trans_tmp_15) {
      process_thin_deferred_cells_j = 0;
      for (; process_thin_deferred_cells_j < process_thin_deferred_cells_count;
           process_thin_deferred_cells_j++)
        *(volatile typeof(__list_add_prev_0) *)0;
      return;
    }
  } while (list_empty());
}
void process_deferred_bios() {
  while (process_deferred_bios_tc)
    process_thin_deferred_cells();
}
$ clang --target=hexagon-linux -O2 -c -o /dev/null dm-thin.i
clang: llvm/lib/Transforms/Utils/LoopSimplify.cpp:709: bool llvm::simplifyLoop(Loop *, DominatorTree *, LoopInfo *, ScalarEvolution *, AssumptionCache *, MemorySSAUpdater *, bool): Assertion `L->isRecursivelyLCSSAForm(*DT, *LI) && "Requested to preserve LCSSA, but it's already broken."' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang --target=hexagon-linux -O2 -c -o /dev/null dm-thin.i
1.	<eof> parser at end of file
2.	Optimizer
3.	Running pass "function<eager-inv>(float2int,lower-constant-intrinsics,loop(loop-rotate<header-duplication;no-prepare-for-lto>,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>,infer-alignment,loop-load-elim,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,slp-vectorizer,vector-combine,instcombine<max-iterations=1;no-verify-fixpoint>,loop-unroll<O2>,transform-warning,sroa<preserve-cfg>,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>)" on module "dm-thin.i"
4.	Running pass "loop-unroll<O2>" on function "process_deferred_bios"
 #0 0x000055da77863d78 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (clang-22+0x3a23d78)
 #1 0x000055da778614b5 llvm::sys::RunSignalHandlers() (clang-22+0x3a214b5)
 #2 0x000055da777e2ab6 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007f71f6e4def0 (/usr/lib/libc.so.6+0x3def0)
 #4 0x00007f71f6ea774c (/usr/lib/libc.so.6+0x9774c)
 #5 0x00007f71f6e4ddc0 raise (/usr/lib/libc.so.6+0x3ddc0)
 #6 0x00007f71f6e3557a abort (/usr/lib/libc.so.6+0x2557a)
 #7 0x00007f71f6e354e3 __assert_perror_fail (/usr/lib/libc.so.6+0x254e3)
 #8 0x000055da7793246d (clang-22+0x3af246d)
 #9 0x000055da7793d375 llvm::UnrollLoop(llvm::Loop*, llvm::UnrollLoopOptions, llvm::LoopInfo*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::TargetTransformInfo const*, llvm::OptimizationRemarkEmitter*, bool, llvm::Loop**, llvm::AAResults*) (clang-22+0x3afd375)
#10 0x000055da77706ef5 tryToUnrollLoop(llvm::Loop*, llvm::DominatorTree&, llvm::LoopInfo*, llvm::ScalarEvolution&, llvm::TargetTransformInfo const&, llvm::AssumptionCache&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::ProfileSummaryInfo*, bool, int, bool, bool, bool, std::optional<unsigned int>, std::optional<unsigned int>, std::optional<bool>, std::optional<bool>, std::optional<bool>, std::optional<bool>, std::optional<bool>, std::optional<unsigned int>, llvm::AAResults*) LoopUnrollPass.cpp:0:0
#11 0x000055da777077d7 llvm::LoopUnrollPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (clang-22+0x38c77d7)
#12 0x000055da789312ed llvm::detail::PassModel<llvm::Function, llvm::LoopUnrollPass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilder.cpp:0:0
#13 0x000055da773af557 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (clang-22+0x356f557)
#14 0x000055da768b549d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) X86CodeGenPassBuilder.cpp:0:0
#15 0x000055da773b2381 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (clang-22+0x3572381)
#16 0x000055da768b560d llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) X86CodeGenPassBuilder.cpp:0:0
#17 0x000055da773ae777 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (clang-22+0x356e777)
#18 0x000055da77fac07e (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&, clang::BackendConsumer*) BackendUtil.cpp:0:0
#19 0x000055da77fa362a clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (clang-22+0x416362a)
#20 0x000055da77fb8753 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (clang-22+0x4178753)
#21 0x000055da79615309 clang::ParseAST(clang::Sema&, bool, bool) (clang-22+0x57d5309)
#22 0x000055da7852e7c6 clang::FrontendAction::Execute() (clang-22+0x46ee7c6)
#23 0x000055da7849d3ed clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (clang-22+0x465d3ed)
#24 0x000055da7860042c clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (clang-22+0x47c042c)
#25 0x000055da763a0b07 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (clang-22+0x2560b07)
#26 0x000055da7639c99f ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#27 0x000055da78302d39 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0
#28 0x000055da777e279e llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (clang-22+0x39a279e)
#29 0x000055da78302573 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (clang-22+0x44c2573)
#30 0x000055da782c3d2c clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (clang-22+0x4483d2c)
#31 0x000055da782c3f47 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (clang-22+0x4483f47)
#32 0x000055da782e06a8 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (clang-22+0x44a06a8)
#33 0x000055da7639c243 clang_main(int, char**, llvm::ToolContext const&) (clang-22+0x255c243)
#34 0x000055da763ac9a7 main (clang-22+0x256c9a7)
#35 0x00007f71f6e376b5 (/usr/lib/libc.so.6+0x276b5)
#36 0x00007f71f6e37769 __libc_start_main (/usr/lib/libc.so.6+0x27769)
#37 0x000055da7639a425 _start (clang-22+0x255a425)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)

llvm-reduce spits out:

target datalayout = "e-m:e-p:32:32:32-a:0-n16:32-i64:64:64-i32:32:32-i16:16:16-i1:8:8-f32:32:32-f64:64:64-v32:32:32-v64:64:64-v512:512:512-v1024:1024:1024-v2048:2048:2048"
target triple = "hexagon-unknown-linux"

%struct.list_head = type { ptr }

@sort_cells_cell = external global ptr
@sort_cells_count = external global i32
@process_thin_deferred_cells_cells = external global %struct.list_head
@process_thin_deferred_cells___trans_tmp_15 = external global i32

define i32 @sort_cells(ptr %cells) {
entry:
  store ptr null, ptr @sort_cells_cell, align 4
  br label %for.cond

for.cond:                                         ; preds = %for.body, %entry
  %0 = phi ptr [ %.pre, %for.body ], [ %cells, %entry ]
  %cmp.not = icmp eq ptr %0, null
  br i1 %cmp.not, label %for.end, label %for.body

for.body:                                         ; preds = %for.cond
  %1 = load ptr, ptr %cells, align 4
  %2 = load i32, ptr @sort_cells_count, align 4, !tbaa !0
  %inc = add i32 %2, 1
  store i32 %inc, ptr @sort_cells_count, align 4, !tbaa !0
  store i32 0, ptr %1, align 4, !tbaa !4
  %.pre = load ptr, ptr %cells, align 4
  br label %for.cond

for.end:                                          ; preds = %for.cond
  %3 = load i32, ptr @sort_cells_count, align 4
  ret i32 %3
}

define void @process_thin_deferred_cells() {
entry:
  %agg.tmp.ensured.sroa.0 = alloca ptr, align 4
  br label %do.body

do.body:                                          ; preds = %do.body, %entry
  %call1 = call i32 @sort_cells(ptr @process_thin_deferred_cells_cells)
  %0 = load i32, ptr @process_thin_deferred_cells___trans_tmp_15, align 4
  %tobool.not = icmp eq i32 %0, 0
  br i1 %tobool.not, label %do.body, label %for.cond

for.cond:                                         ; preds = %for.body, %do.body
  %1 = phi i32 [ %inc, %for.body ], [ 0, %do.body ]
  %cmp = icmp slt i32 %1, %call1
  br i1 %cmp, label %for.body, label %for.end

for.body:                                         ; preds = %for.cond
  store volatile ptr null, ptr %agg.tmp.ensured.sroa.0, align 4
  %inc = add i32 %1, 1
  br label %for.cond

for.end:                                          ; preds = %for.cond
  ret void
}

define void @process_deferred_bios() {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.cond, %entry
  call void @process_thin_deferred_cells()
  br label %while.cond
}

!0 = !{!1, !1, i64 0}
!1 = !{!"int", !2, i64 0}
!2 = !{!"omnipotent char", !3, i64 0}
!3 = !{!"Simple C/C++ TBAA"}
!4 = !{!5, !5, i64 0}
!5 = !{!"p1 _ZTS9list_head", !6, i64 0}
!6 = !{!"any pointer", !2, i64 0}
$ opt -O3 -disable-output reduced.ll
opt: llvm/lib/Transforms/Utils/LoopSimplify.cpp:709: bool llvm::simplifyLoop(Loop *, DominatorTree *, LoopInfo *, ScalarEvolution *, AssumptionCache *, MemorySSAUpdater *, bool): Assertion `L->isRecursivelyLCSSAForm(*DT, *LI) && "Requested to preserve LCSSA, but it's already broken."' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: opt -O3 -disable-output reduced.ll
1.	Running pass "function<eager-inv>(float2int,lower-constant-intrinsics,chr,loop(loop-rotate<header-duplication;no-prepare-for-lto>,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>,infer-alignment,loop-load-elim,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>,slp-vectorizer,vector-combine,instcombine<max-iterations=1;no-verify-fixpoint>,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,infer-alignment,instcombine<max-iterations=1;no-verify-fixpoint>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;speculate-unpredictables>)" on module "reduced.ll"
2.	Running pass "loop-unroll<O3>" on function "process_deferred_bios"
 #0 0x000055d7ff4ccd38 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (opt+0x1c78d38)
 #1 0x000055d7ff4ca2e5 llvm::sys::RunSignalHandlers() (opt+0x1c762e5)
 #2 0x000055d7ff4cddd1 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x00007ff4fec4def0 (/usr/lib/libc.so.6+0x3def0)
 #4 0x00007ff4feca774c (/usr/lib/libc.so.6+0x9774c)
 #5 0x00007ff4fec4ddc0 raise (/usr/lib/libc.so.6+0x3ddc0)
 #6 0x00007ff4fec3557a abort (/usr/lib/libc.so.6+0x2557a)
 #7 0x00007ff4fec354e3 __assert_perror_fail (/usr/lib/libc.so.6+0x254e3)
 #8 0x000055d7ffc1556d (opt+0x23c156d)
 #9 0x000055d800935935 llvm::UnrollLoop(llvm::Loop*, llvm::UnrollLoopOptions, llvm::LoopInfo*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::TargetTransformInfo const*, llvm::OptimizationRemarkEmitter*, bool, llvm::Loop**, llvm::AAResults*) (opt+0x30e1935)
#10 0x000055d80092aa25 tryToUnrollLoop(llvm::Loop*, llvm::DominatorTree&, llvm::LoopInfo*, llvm::ScalarEvolution&, llvm::TargetTransformInfo const&, llvm::AssumptionCache&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::ProfileSummaryInfo*, bool, int, bool, bool, bool, std::optional<unsigned int>, std::optional<unsigned int>, std::optional<bool>, std::optional<bool>, std::optional<bool>, std::optional<bool>, std::optional<bool>, std::optional<unsigned int>, llvm::AAResults*) LoopUnrollPass.cpp:0:0
#11 0x000055d80092b307 llvm::LoopUnrollPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (opt+0x30d7307)
#12 0x000055d800c9ebbd llvm::detail::PassModel<llvm::Function, llvm::LoopUnrollPass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilderPipelines.cpp:0:0
#13 0x000055d7ff7111c7 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (opt+0x1ebd1c7)
#14 0x000055d80084b1ad llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) X86CodeGenPassBuilder.cpp:0:0
#15 0x000055d7ff715c71 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (opt+0x1ec1c71)
#16 0x000055d80084b31d llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) X86CodeGenPassBuilder.cpp:0:0
#17 0x000055d7ff70ff87 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (opt+0x1ebbf87)
#18 0x000055d800c3f544 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) (opt+0x33eb544)
#19 0x000055d7ff4a3c7d optMain (opt+0x1c4fc7d)
#20 0x00007ff4fec376b5 (/usr/lib/libc.so.6+0x276b5)
#21 0x00007ff4fec37769 __libc_start_main (/usr/lib/libc.so.6+0x27769)
#22 0x000055d7ff49d125 _start (opt+0x1c49125)

mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
…47824)

Generalize the code added in
llvm#147214 to also support
re-using pointer LCSSA phis when expanding SCEVs with AddRecs.

A common source of integer AddRecs with pointer bases are runtime checks
emitted by LV based on the distance between 2 pointer AddRecs.

This improves codegen in some cases when vectorizing and prevents
regressions with llvm#142309, which
turns some phis into single-entry ones, which SCEV will look through
now (and expand the whole AddRec), whereas before it would have to treat
the LCSSA phi as SCEVUnknown.

Compile-time impact neutral:
https://llvm-compile-time-tracker.com/compare.php?from=fd5fc76c91538871771be2c3be2ca3a5f2dcac31&to=ca5fc2b3d8e6efc09f1624a17fdbfbe909f14eb4&stat=instructions:u

PR: llvm#147824
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
Add another test case for
llvm#147824, where the difference
between an existing phi and the target SCEV is an add of a constant.
fhahn added a commit that referenced this pull request Jul 28, 2025
If we insert a new add instruction, it may introduce a new use outside
the loop that contains the phi node we re-use. Use fixupLCSSAFormFor to
fix LCSSA form, if needed.

This fixes a crash reported in
#147824 (comment).
Contributor Author

fhahn commented Jul 28, 2025

@nathanchance thanks for the report! This should be fixed in f9f68af, which fixes up LCSSA form if needed.
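
For readers following along, here is a minimal hand-written sketch of the kind of LCSSA problem the fix addresses. It is not the reduced test case above and not IR from the patch; the function name `@lcssa_sketch` and all value names are hypothetical. The re-used phi lives in the exit block of an inner loop, which is itself still inside an outer loop. If new instructions that use this phi are placed after the outer loop, the phi gains a use outside the loop that contains it, so that use has to be routed through another single-entry phi, roughly what one would expect a fixup such as `fixupLCSSAFormFor` to produce.

```llvm
; Hypothetical example, not taken from the patch or its tests.
define i64 @lcssa_sketch(ptr %start, ptr %end, i1 %again) {
entry:
  br label %outer

outer:                                            ; preds = %inner.exit, %entry
  br label %inner

inner:                                            ; preds = %inner, %outer
  %p = phi ptr [ %start, %outer ], [ %p.next, %inner ]
  %p.next = getelementptr inbounds i8, ptr %p, i64 1
  %done = icmp eq ptr %p.next, %end
  br i1 %done, label %inner.exit, label %inner

inner.exit:                                       ; preds = %inner
  ; LCSSA phi for the inner loop. This block is still part of the outer loop.
  %p.lcssa = phi ptr [ %p.next, %inner ]
  br i1 %again, label %outer, label %exit

exit:                                             ; preds = %inner.exit
  ; Using %p.lcssa directly here would be a use outside the outer loop
  ; and would break LCSSA form. Routing it through another single-entry
  ; phi keeps the function in LCSSA form.
  %p.lcssa.outer = phi ptr [ %p.lcssa, %inner.exit ]
  %int = ptrtoint ptr %p.lcssa.outer to i64
  %res = add i64 %int, 8
  ret i64 %res
}
```

The `ptrtoint` plus `add` of a constant in `%exit` stands in for the instructions the expander may emit when it re-uses an existing pointer phi instead of expanding the whole AddRec again.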

ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jul 28, 2025
…47824)

Generalize the code added in
llvm#147214 to also support
re-using pointer LCSSA phis when expanding SCEVs with AddRecs.

A common source of integer AddRecs with pointer bases are runtime checks
emitted by LV based on the distance between 2 pointer AddRecs.

This improves codegen in some cases when vectorizing and prevents
regressions with llvm#142309, which
turns some phis into single-entry ones, which SCEV will look through
now (and expand the whole AddRec), whereas before it would have to treat
the LCSSA phi as SCEVUnknown.

Compile-time impact neutral:
https://llvm-compile-time-tracker.com/compare.php?from=fd5fc76c91538871771be2c3be2ca3a5f2dcac31&to=ca5fc2b3d8e6efc09f1624a17fdbfbe909f14eb4&stat=instructions:u

PR: llvm#147824
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 28, 2025
…eeded.

If we insert a new add instruction, it may introduce a new use outside
the loop that contains the phi node we re-use. Use fixupLCSSAFormFor to
fix LCSSA form, if needed.

This fixes a crash reported in
llvm/llvm-project#147824 (comment).
fhahn added a commit to fhahn/llvm-project that referenced this pull request Jul 31, 2025
Update the logic added in
llvm#147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).
fhahn added a commit that referenced this pull request Jul 31, 2025
Update the logic added in
#147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).

PR: #150693
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 31, 2025
…0693)

Update the logic added in
llvm/llvm-project#147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).

PR: llvm/llvm-project#150693
krishna2803 pushed a commit to krishna2803/llvm-project that referenced this pull request Aug 12, 2025
Update the logic added in
llvm#147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).

PR: llvm#150693
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms