[SCEV] Rewrite more SCEVAddExpr when applying guards. #159942

fhahn · 2025-09-20T20:25:19Z

When re-writing SCEVAddExprs to apply information from guards, check if
we have information for the expression itself. If so, apply it.

When we have an expression of the form (Const + A), check if we have
have guard info for (Const + 1 + A) and use it. This is needed to avoid
regressions in a few cases, where we have BTCs with a subtracted
constant.

Rewriting expressions could cause regressions, e.g. when comparing 2
SCEV expressions where we are only able to rewrite one side, but I could
not find any cases where this happens more with this patch in practice.

Depends on #160012 (included in
PR)

Proofs for some of the test changes: https://alive2.llvm.org/ce/z/RPX6t_

llvmbot · 2025-09-22T10:39:50Z

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

When re-writing SCEVAddExprs to apply information from guards, check if
we have information for the expression itself. If so, apply it.

When we have an expression of the form (Const + A), check if we have
have guard info for (Const + 1 + A) and use it. This is needed to avoid
regressions in a few cases, where we have BTCs with a subtracted
constant.

Rewriting expressions could cause regressions, e.g. when comparing 2
SCEV expressions where we are only able to rewrite one side, but I could
not find any cases where this happens more with this patch in practice.

Depends on #160012 (included in
PR)

Proofs for some of the test changes: https://alive2.llvm.org/ce/z/RPX6t_

Full diff: https://github.com/llvm/llvm-project/pull/159942.diff

7 Files Affected:

(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+47-11)
(modified) llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll (+3-3)
(modified) llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll (+8-8)
(modified) llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll (+2-4)
(modified) llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll (+3-3)
(modified) llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll (+1-6)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index b08399b381f34..09a31b105e128 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -15850,12 +15850,17 @@ void ScalarEvolution::LoopGuards::collectFromBlock(
         To = SE.getUMaxExpr(FromRewritten, RHS);
         if (auto *UMin = dyn_cast<SCEVUMinExpr>(FromRewritten))
           EnqueueOperands(UMin);
+        if (RHS->isOne())
+          ExprsToRewrite.push_back(From);
         break;
       case CmpInst::ICMP_SGT:
       case CmpInst::ICMP_SGE:
         To = SE.getSMaxExpr(FromRewritten, RHS);
-        if (auto *SMin = dyn_cast<SCEVSMinExpr>(FromRewritten))
+        if (auto *SMin = dyn_cast<SCEVSMinExpr>(FromRewritten)) {
           EnqueueOperands(SMin);
+        }
+        if (RHS->isOne())
+          ExprsToRewrite.push_back(From);
         break;
       case CmpInst::ICMP_EQ:
         if (isa<SCEVConstant>(RHS))
@@ -15986,7 +15991,22 @@ void ScalarEvolution::LoopGuards::collectFromBlock(
     for (const SCEV *Expr : ExprsToRewrite) {
       const SCEV *RewriteTo = Guards.RewriteMap[Expr];
       Guards.RewriteMap.erase(Expr);
-      Guards.RewriteMap.insert({Expr, Guards.rewrite(RewriteTo)});
+      const SCEV *Rewritten = Guards.rewrite(RewriteTo);
+
+      // Try to strengthen divisibility of SMax/UMax expressions coming from >=
+      // 1 conditions.
+      if (auto *SMax = dyn_cast<SCEVSMaxExpr>(Rewritten)) {
+        unsigned MinTrailingZeros = SE.getMinTrailingZeros(SMax->getOperand(1));
+        for (const SCEV *Op : drop_begin(SMax->operands(), 2))
+          MinTrailingZeros =
+              std::min(MinTrailingZeros, SE.getMinTrailingZeros(Op));
+        if (MinTrailingZeros != 0)
+          Rewritten = SE.getSMaxExpr(
+              SE.getConstant(APInt(SMax->getType()->getScalarSizeInBits(), 1)
+                                 .shl(MinTrailingZeros)),
+              SMax);
+      }
+      Guards.RewriteMap.insert({Expr, Rewritten});
     }
   }
 }
@@ -16059,16 +16079,32 @@ const SCEV *ScalarEvolution::LoopGuards::rewrite(const SCEV *Expr) const {
     }
 
     const SCEV *visitAddExpr(const SCEVAddExpr *Expr) {
-      // Trip count expressions sometimes consist of adding 3 operands, i.e.
-      // (Const + A + B). There may be guard info for A + B, and if so, apply
-      // it.
-      // TODO: Could more generally apply guards to Add sub-expressions.
-      if (isa<SCEVConstant>(Expr->getOperand(0)) &&
-          Expr->getNumOperands() == 3) {
-        if (const SCEV *S = Map.lookup(
-                SE.getAddExpr(Expr->getOperand(1), Expr->getOperand(2))))
-          return SE.getAddExpr(Expr->getOperand(0), S);
+      if (const SCEV *S = Map.lookup(Expr))
+        return S;
+      if (isa<SCEVConstant>(Expr->getOperand(0))) {
+        // Trip count expressions sometimes consist of adding 3 operands, i.e.
+        // (Const + A + B). There may be guard info for A + B, and if so, apply
+        // it.
+        // TODO: Could more generally apply guards to Add sub-expressions.
+        if (Expr->getNumOperands() == 3) {
+          if (const SCEV *S = Map.lookup(
+                  SE.getAddExpr(Expr->getOperand(1), Expr->getOperand(2))))
+            return SE.getAddExpr(Expr->getOperand(0), S);
+        }
+
+        // For expressions of the form (Const + A), check if we have guard info
+        // for (Const + 1 + A), and rewrite to ((Const + 1 + A) - 1). This makes
+        // sure we don't loose information when rewriting expressions based on
+        // back-edge taken counts in some cases..
+        if (Expr->getNumOperands() == 2) {
+          auto *NewC =
+              SE.getAddExpr(Expr->getOperand(0), SE.getOne(Expr->getType()));
+          if (const SCEV *S =
+                  Map.lookup(SE.getAddExpr(NewC, Expr->getOperand(1))))
+            return SE.getMinusSCEV(S, SE.getOne(Expr->getType()));
+        }
       }
+
       SmallVector<const SCEV *, 2> Operands;
       bool Changed = false;
       for (const auto *Op : Expr->operands()) {
diff --git a/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll b/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll
index 6b2c78cebc44a..5ea836d3b8067 100644
--- a/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll
+++ b/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll
@@ -33,9 +33,9 @@ declare void @clobber()
 define void @test_add_sub_1_guard(ptr %src, i32 %n) {
 ; CHECK-LABEL: 'test_add_sub_1_guard'
 ; CHECK-NEXT:  Determining loop execution counts for: @test_add_sub_1_guard
-; CHECK-NEXT:  Loop %loop: backedge-taken count is (zext i32 (-1 + (%n /u 2))<nsw> to i64)
-; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 4294967295
-; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is (zext i32 (-1 + (%n /u 2))<nsw> to i64)
+; CHECK-NEXT:  Loop %loop: backedge-taken count is i64 0
+; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 0
+; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is i64 0
 ; CHECK-NEXT:  Loop %loop: Trip multiple is 1
 ;
 entry:
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
index 8d091a00ed4b9..2f0627b7d4476 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
@@ -61,7 +61,7 @@ define void @umin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
 ; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void umin(unsigned a, unsigned b) {
 ;   a *= 2;
@@ -102,12 +102,12 @@ define void @umax(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:    %cond = select i1 %cmp, i32 %mul, i32 %mul1
 ; CHECK-NEXT:    --> ((2 * %a) umax (4 * %b)) U: [0,-1) S: [-2147483648,2147483647)
 ; CHECK-NEXT:    %i.011 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
-; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,-2147483648) Exits: (-1 + ((2 * %a) umax (4 * %b))) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + ((2 * %a) umax (4 * %b))) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %inc = add nuw nsw i32 %i.011, 1
-; CHECK-NEXT:    --> {1,+,1}<nuw><%for.body> U: [1,-1) S: [1,-1) Exits: ((2 * %a) umax (4 * %b)) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: ((2 * %a) umax (4 * %b)) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @umax
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a) umax (4 * %b)))
-; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 -3
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a) umax (4 * %b)))
 ; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
@@ -157,7 +157,7 @@ define void @smin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
 ; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void smin(signed a, signed b) {
 ;   a *= 2;
@@ -197,12 +197,12 @@ define void @smax(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:    %cond = select i1 %cmp, i32 %mul, i32 %mul1
 ; CHECK-NEXT:    --> ((2 * %a)<nsw> smax (4 * %b)<nsw>) U: [0,-1) S: [-2147483648,2147483647)
 ; CHECK-NEXT:    %i.011 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
-; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,-2147483648) Exits: (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>)) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>)) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %inc = add nuw nsw i32 %i.011, 1
-; CHECK-NEXT:    --> {1,+,1}<nuw><%for.body> U: [1,-1) S: [1,-1) Exits: ((2 * %a)<nsw> smax (4 * %b)<nsw>) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: ((2 * %a)<nsw> smax (4 * %b)<nsw>) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @smax
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>))
-; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 -3
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>))
 ; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
diff --git a/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll b/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll
index 4b52479fc6c4d..40e3c63cbe04a 100644
--- a/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll
+++ b/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll
@@ -343,14 +343,13 @@ define void @slt_no_smax_needed(i64 %n, ptr %dst) {
 ; CHECK-NEXT:    [[PRE:%.*]] = icmp ult i32 [[ADD_1]], 8
 ; CHECK-NEXT:    br i1 [[PRE]], label [[EXIT:%.*]], label [[LOOP_PREHEADER:%.*]]
 ; CHECK:       loop.preheader:
-; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[SHR]], i32 1)
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[LOOP_PREHEADER]] ]
 ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr [[DST:%.*]], i32 [[IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP]], align 1
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SMAX]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SHR]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]]
 ; CHECK:       exit.loopexit:
 ; CHECK-NEXT:    br label [[EXIT]]
@@ -385,14 +384,13 @@ define void @ult_no_umax_needed(i64 %n, ptr %dst) {
 ; CHECK-NEXT:    [[PRE:%.*]] = icmp ult i32 [[ADD_1]], 8
 ; CHECK-NEXT:    br i1 [[PRE]], label [[EXIT:%.*]], label [[LOOP_PREHEADER:%.*]]
 ; CHECK:       loop.preheader:
-; CHECK-NEXT:    [[UMAX:%.*]] = call i32 @llvm.umax.i32(i32 [[SHR]], i32 1)
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[LOOP_PREHEADER]] ]
 ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr [[DST:%.*]], i32 [[IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP]], align 1
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[UMAX]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SHR]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]]
 ; CHECK:       exit.loopexit:
 ; CHECK-NEXT:    br label [[EXIT]]
diff --git a/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll b/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll
index bc1543d8361a7..09419c13aaeb0 100644
--- a/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll
+++ b/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll
@@ -61,9 +61,9 @@ define void @test_memset_size_can_use_info_from_guards(i32 %x, ptr %dst) {
 ; CHECK:       [[LOOP1_BACKEDGE]]:
 ; CHECK-NEXT:    br label %[[LOOP1]]
 ; CHECK:       [[LOOP2_PREHEADER]]:
-; CHECK-NEXT:    [[TMP0:%.*]] = zext i32 [[SUB]] to i64
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i64 [[TMP0]], 1
-; CHECK-NEXT:    [[UMAX:%.*]] = call i64 @llvm.umax.i64(i64 [[TMP1]], i64 1)
+; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i32 [[SHR]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
+; CHECK-NEXT:    [[UMAX:%.*]] = add nuw nsw i64 [[TMP1]], 1
 ; CHECK-NEXT:    call void @llvm.memset.p0.i64(ptr align 1 [[DST]], i8 0, i64 [[UMAX]], i1 false)
 ; CHECK-NEXT:    br label %[[LOOP2:.*]]
 ; CHECK:       [[LOOP2]]:
diff --git a/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll b/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll
index 4f5a26e9c89cb..4a9b2bd7cc888 100644
--- a/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll
+++ b/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll
@@ -193,7 +193,7 @@ define dso_local void @cannotProveAlignedTC(ptr noalias nocapture %A, i32 %p, i3
 ; CHECK-NEXT:    store i32 13, ptr [[TMP12]], align 1
 ; CHECK-NEXT:    br label [[PRED_STORE_CONTINUE6]]
 ; CHECK:       pred.store.continue6:
-; CHECK-NEXT:    [[INDEX_NEXT]] = add i32 [[INDEX]], 4
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
 ; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], splat (i32 4)
 ; CHECK-NEXT:    [[TMP13:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
 ; CHECK-NEXT:    br i1 [[TMP13]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
diff --git a/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll b/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
index 648ebc7e6c3a5..a556b15adbefc 100644
--- a/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
+++ b/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
@@ -465,12 +465,7 @@ define void @remove_diff_checks_via_guards(i32 %x, i32 %y, ptr %A) {
 ; CHECK-NEXT:    [[TMP13:%.*]] = icmp ugt i64 [[SMAX]], 4294967295
 ; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP12]], [[TMP13]]
 ; CHECK-NEXT:    [[TMP15:%.*]] = or i1 [[TMP9]], [[TMP14]]
-; CHECK-NEXT:    br i1 [[TMP15]], [[SCALAR_PH]], label %[[VECTOR_MEMCHECK:.*]]
-; CHECK:       [[VECTOR_MEMCHECK]]:
-; CHECK-NEXT:    [[TMP16:%.*]] = sext i32 [[OFFSET]] to i64
-; CHECK-NEXT:    [[TMP17:%.*]] = shl nsw i64 [[TMP16]], 2
-; CHECK-NEXT:    [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP17]], 16
-; CHECK-NEXT:    br i1 [[DIFF_CHECK]], [[SCALAR_PH]], [[VECTOR_PH1:label %.*]]
+; CHECK-NEXT:    br i1 [[TMP15]], [[SCALAR_PH]], [[VECTOR_PH:label %.*]]
 ;
 entry:
   %offset = sub i32 %x, %y

llvmbot · 2025-09-22T10:39:50Z

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

When re-writing SCEVAddExprs to apply information from guards, check if
we have information for the expression itself. If so, apply it.

When we have an expression of the form (Const + A), check if we have
have guard info for (Const + 1 + A) and use it. This is needed to avoid
regressions in a few cases, where we have BTCs with a subtracted
constant.

Rewriting expressions could cause regressions, e.g. when comparing 2
SCEV expressions where we are only able to rewrite one side, but I could
not find any cases where this happens more with this patch in practice.

Depends on #160012 (included in
PR)

Proofs for some of the test changes: https://alive2.llvm.org/ce/z/RPX6t_

Full diff: https://github.com/llvm/llvm-project/pull/159942.diff

7 Files Affected:

(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+47-11)
(modified) llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll (+3-3)
(modified) llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll (+8-8)
(modified) llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll (+2-4)
(modified) llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll (+3-3)
(modified) llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll (+1-1)
(modified) llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll (+1-6)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index b08399b381f34..09a31b105e128 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -15850,12 +15850,17 @@ void ScalarEvolution::LoopGuards::collectFromBlock(
         To = SE.getUMaxExpr(FromRewritten, RHS);
         if (auto *UMin = dyn_cast<SCEVUMinExpr>(FromRewritten))
           EnqueueOperands(UMin);
+        if (RHS->isOne())
+          ExprsToRewrite.push_back(From);
         break;
       case CmpInst::ICMP_SGT:
       case CmpInst::ICMP_SGE:
         To = SE.getSMaxExpr(FromRewritten, RHS);
-        if (auto *SMin = dyn_cast<SCEVSMinExpr>(FromRewritten))
+        if (auto *SMin = dyn_cast<SCEVSMinExpr>(FromRewritten)) {
           EnqueueOperands(SMin);
+        }
+        if (RHS->isOne())
+          ExprsToRewrite.push_back(From);
         break;
       case CmpInst::ICMP_EQ:
         if (isa<SCEVConstant>(RHS))
@@ -15986,7 +15991,22 @@ void ScalarEvolution::LoopGuards::collectFromBlock(
     for (const SCEV *Expr : ExprsToRewrite) {
       const SCEV *RewriteTo = Guards.RewriteMap[Expr];
       Guards.RewriteMap.erase(Expr);
-      Guards.RewriteMap.insert({Expr, Guards.rewrite(RewriteTo)});
+      const SCEV *Rewritten = Guards.rewrite(RewriteTo);
+
+      // Try to strengthen divisibility of SMax/UMax expressions coming from >=
+      // 1 conditions.
+      if (auto *SMax = dyn_cast<SCEVSMaxExpr>(Rewritten)) {
+        unsigned MinTrailingZeros = SE.getMinTrailingZeros(SMax->getOperand(1));
+        for (const SCEV *Op : drop_begin(SMax->operands(), 2))
+          MinTrailingZeros =
+              std::min(MinTrailingZeros, SE.getMinTrailingZeros(Op));
+        if (MinTrailingZeros != 0)
+          Rewritten = SE.getSMaxExpr(
+              SE.getConstant(APInt(SMax->getType()->getScalarSizeInBits(), 1)
+                                 .shl(MinTrailingZeros)),
+              SMax);
+      }
+      Guards.RewriteMap.insert({Expr, Rewritten});
     }
   }
 }
@@ -16059,16 +16079,32 @@ const SCEV *ScalarEvolution::LoopGuards::rewrite(const SCEV *Expr) const {
     }
 
     const SCEV *visitAddExpr(const SCEVAddExpr *Expr) {
-      // Trip count expressions sometimes consist of adding 3 operands, i.e.
-      // (Const + A + B). There may be guard info for A + B, and if so, apply
-      // it.
-      // TODO: Could more generally apply guards to Add sub-expressions.
-      if (isa<SCEVConstant>(Expr->getOperand(0)) &&
-          Expr->getNumOperands() == 3) {
-        if (const SCEV *S = Map.lookup(
-                SE.getAddExpr(Expr->getOperand(1), Expr->getOperand(2))))
-          return SE.getAddExpr(Expr->getOperand(0), S);
+      if (const SCEV *S = Map.lookup(Expr))
+        return S;
+      if (isa<SCEVConstant>(Expr->getOperand(0))) {
+        // Trip count expressions sometimes consist of adding 3 operands, i.e.
+        // (Const + A + B). There may be guard info for A + B, and if so, apply
+        // it.
+        // TODO: Could more generally apply guards to Add sub-expressions.
+        if (Expr->getNumOperands() == 3) {
+          if (const SCEV *S = Map.lookup(
+                  SE.getAddExpr(Expr->getOperand(1), Expr->getOperand(2))))
+            return SE.getAddExpr(Expr->getOperand(0), S);
+        }
+
+        // For expressions of the form (Const + A), check if we have guard info
+        // for (Const + 1 + A), and rewrite to ((Const + 1 + A) - 1). This makes
+        // sure we don't loose information when rewriting expressions based on
+        // back-edge taken counts in some cases..
+        if (Expr->getNumOperands() == 2) {
+          auto *NewC =
+              SE.getAddExpr(Expr->getOperand(0), SE.getOne(Expr->getType()));
+          if (const SCEV *S =
+                  Map.lookup(SE.getAddExpr(NewC, Expr->getOperand(1))))
+            return SE.getMinusSCEV(S, SE.getOne(Expr->getType()));
+        }
       }
+
       SmallVector<const SCEV *, 2> Operands;
       bool Changed = false;
       for (const auto *Op : Expr->operands()) {
diff --git a/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll b/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll
index 6b2c78cebc44a..5ea836d3b8067 100644
--- a/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll
+++ b/llvm/test/Analysis/ScalarEvolution/backedge-taken-count-guard-info-apply-to-adds.ll
@@ -33,9 +33,9 @@ declare void @clobber()
 define void @test_add_sub_1_guard(ptr %src, i32 %n) {
 ; CHECK-LABEL: 'test_add_sub_1_guard'
 ; CHECK-NEXT:  Determining loop execution counts for: @test_add_sub_1_guard
-; CHECK-NEXT:  Loop %loop: backedge-taken count is (zext i32 (-1 + (%n /u 2))<nsw> to i64)
-; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 4294967295
-; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is (zext i32 (-1 + (%n /u 2))<nsw> to i64)
+; CHECK-NEXT:  Loop %loop: backedge-taken count is i64 0
+; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 0
+; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is i64 0
 ; CHECK-NEXT:  Loop %loop: Trip multiple is 1
 ;
 entry:
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
index 8d091a00ed4b9..2f0627b7d4476 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
@@ -61,7 +61,7 @@ define void @umin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
 ; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void umin(unsigned a, unsigned b) {
 ;   a *= 2;
@@ -102,12 +102,12 @@ define void @umax(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:    %cond = select i1 %cmp, i32 %mul, i32 %mul1
 ; CHECK-NEXT:    --> ((2 * %a) umax (4 * %b)) U: [0,-1) S: [-2147483648,2147483647)
 ; CHECK-NEXT:    %i.011 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
-; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,-2147483648) Exits: (-1 + ((2 * %a) umax (4 * %b))) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + ((2 * %a) umax (4 * %b))) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %inc = add nuw nsw i32 %i.011, 1
-; CHECK-NEXT:    --> {1,+,1}<nuw><%for.body> U: [1,-1) S: [1,-1) Exits: ((2 * %a) umax (4 * %b)) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: ((2 * %a) umax (4 * %b)) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @umax
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a) umax (4 * %b)))
-; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 -3
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a) umax (4 * %b)))
 ; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
@@ -157,7 +157,7 @@ define void @smin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
 ; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void smin(signed a, signed b) {
 ;   a *= 2;
@@ -197,12 +197,12 @@ define void @smax(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:    %cond = select i1 %cmp, i32 %mul, i32 %mul1
 ; CHECK-NEXT:    --> ((2 * %a)<nsw> smax (4 * %b)<nsw>) U: [0,-1) S: [-2147483648,2147483647)
 ; CHECK-NEXT:    %i.011 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
-; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,-2147483648) Exits: (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>)) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>)) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %inc = add nuw nsw i32 %i.011, 1
-; CHECK-NEXT:    --> {1,+,1}<nuw><%for.body> U: [1,-1) S: [1,-1) Exits: ((2 * %a)<nsw> smax (4 * %b)<nsw>) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: ((2 * %a)<nsw> smax (4 * %b)<nsw>) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @smax
 ; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>))
-; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 -3
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483646
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a)<nsw> smax (4 * %b)<nsw>))
 ; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
diff --git a/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll b/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll
index 4b52479fc6c4d..40e3c63cbe04a 100644
--- a/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll
+++ b/llvm/test/Transforms/IndVarSimplify/canonicalize-cmp.ll
@@ -343,14 +343,13 @@ define void @slt_no_smax_needed(i64 %n, ptr %dst) {
 ; CHECK-NEXT:    [[PRE:%.*]] = icmp ult i32 [[ADD_1]], 8
 ; CHECK-NEXT:    br i1 [[PRE]], label [[EXIT:%.*]], label [[LOOP_PREHEADER:%.*]]
 ; CHECK:       loop.preheader:
-; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[SHR]], i32 1)
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[LOOP_PREHEADER]] ]
 ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr [[DST:%.*]], i32 [[IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP]], align 1
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SMAX]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SHR]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]]
 ; CHECK:       exit.loopexit:
 ; CHECK-NEXT:    br label [[EXIT]]
@@ -385,14 +384,13 @@ define void @ult_no_umax_needed(i64 %n, ptr %dst) {
 ; CHECK-NEXT:    [[PRE:%.*]] = icmp ult i32 [[ADD_1]], 8
 ; CHECK-NEXT:    br i1 [[PRE]], label [[EXIT:%.*]], label [[LOOP_PREHEADER:%.*]]
 ; CHECK:       loop.preheader:
-; CHECK-NEXT:    [[UMAX:%.*]] = call i32 @llvm.umax.i32(i32 [[SHR]], i32 1)
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[LOOP_PREHEADER]] ]
 ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i8, ptr [[DST:%.*]], i32 [[IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP]], align 1
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[UMAX]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SHR]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT_LOOPEXIT:%.*]]
 ; CHECK:       exit.loopexit:
 ; CHECK-NEXT:    br label [[EXIT]]
diff --git a/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll b/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll
index bc1543d8361a7..09419c13aaeb0 100644
--- a/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll
+++ b/llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll
@@ -61,9 +61,9 @@ define void @test_memset_size_can_use_info_from_guards(i32 %x, ptr %dst) {
 ; CHECK:       [[LOOP1_BACKEDGE]]:
 ; CHECK-NEXT:    br label %[[LOOP1]]
 ; CHECK:       [[LOOP2_PREHEADER]]:
-; CHECK-NEXT:    [[TMP0:%.*]] = zext i32 [[SUB]] to i64
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i64 [[TMP0]], 1
-; CHECK-NEXT:    [[UMAX:%.*]] = call i64 @llvm.umax.i64(i64 [[TMP1]], i64 1)
+; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i32 [[SHR]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
+; CHECK-NEXT:    [[UMAX:%.*]] = add nuw nsw i64 [[TMP1]], 1
 ; CHECK-NEXT:    call void @llvm.memset.p0.i64(ptr align 1 [[DST]], i8 0, i64 [[UMAX]], i1 false)
 ; CHECK-NEXT:    br label %[[LOOP2:.*]]
 ; CHECK:       [[LOOP2]]:
diff --git a/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll b/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll
index 4f5a26e9c89cb..4a9b2bd7cc888 100644
--- a/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll
+++ b/llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-divisible-TC.ll
@@ -193,7 +193,7 @@ define dso_local void @cannotProveAlignedTC(ptr noalias nocapture %A, i32 %p, i3
 ; CHECK-NEXT:    store i32 13, ptr [[TMP12]], align 1
 ; CHECK-NEXT:    br label [[PRED_STORE_CONTINUE6]]
 ; CHECK:       pred.store.continue6:
-; CHECK-NEXT:    [[INDEX_NEXT]] = add i32 [[INDEX]], 4
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
 ; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], splat (i32 4)
 ; CHECK-NEXT:    [[TMP13:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
 ; CHECK-NEXT:    br i1 [[TMP13]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
diff --git a/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll b/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
index 648ebc7e6c3a5..a556b15adbefc 100644
--- a/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
+++ b/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
@@ -465,12 +465,7 @@ define void @remove_diff_checks_via_guards(i32 %x, i32 %y, ptr %A) {
 ; CHECK-NEXT:    [[TMP13:%.*]] = icmp ugt i64 [[SMAX]], 4294967295
 ; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP12]], [[TMP13]]
 ; CHECK-NEXT:    [[TMP15:%.*]] = or i1 [[TMP9]], [[TMP14]]
-; CHECK-NEXT:    br i1 [[TMP15]], [[SCALAR_PH]], label %[[VECTOR_MEMCHECK:.*]]
-; CHECK:       [[VECTOR_MEMCHECK]]:
-; CHECK-NEXT:    [[TMP16:%.*]] = sext i32 [[OFFSET]] to i64
-; CHECK-NEXT:    [[TMP17:%.*]] = shl nsw i64 [[TMP16]], 2
-; CHECK-NEXT:    [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP17]], 16
-; CHECK-NEXT:    br i1 [[DIFF_CHECK]], [[SCALAR_PH]], [[VECTOR_PH1:label %.*]]
+; CHECK-NEXT:    br i1 [[TMP15]], [[SCALAR_PH]], [[VECTOR_PH:label %.*]]
 ;
 entry:
   %offset = sub i32 %x, %y

fhahn

This depends on #160012 to prevent a regression where we loose divisibility info by the rewrite

fhahn · 2025-09-22T10:41:47Z

llvm/test/Transforms/LoopIdiom/add-nsw-zext-fold.ll

+; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i32 [[SHR]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = zext i32 [[TMP0]] to i64
+; CHECK-NEXT:    [[UMAX:%.*]] = add nuw nsw i64 [[TMP1]], 1


This is missing a fold end-to-end from https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2845/files/ed058c0c689f0dadf267033c74e0960c3593baff#r2367574398

https://alive2.llvm.org/ce/z/BuHtNm

When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * llvm#160012 * llvm#159942 https://alive2.llvm.org/ce/z/YyBvoT

nikic · 2025-10-07T15:27:41Z

Can you please add some more detail on why this specific pattern is relevant? Based on tests it clearly is, I'm just not super clear on why.

nikic · 2025-10-07T15:28:12Z

llvm/lib/Analysis/ScalarEvolution.cpp

+              SE.getAddExpr(Expr->getOperand(0), SE.getOne(Expr->getType()));
+          if (const SCEV *S =
+                  Map.lookup(SE.getAddExpr(NewC, Expr->getOperand(1))))
+            return SE.getMinusSCEV(S, SE.getOne(Expr->getType()));


Can directly use add with -1 here and avoid an extra canonicalization step.

Done thanks

When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * llvm#160012 * llvm#159942 https://alive2.llvm.org/ce/z/YyBvoT

fhahn

Can you please add some more detail on why this specific pattern is relevant? Based on tests it clearly is, I'm just not super clear on why.

The general case where this can help is if we have a guard for an SCEVAddExpr (e.g x - y > 0). Previously we would only re-write the operands of SCEVAddExprs, and would not apply the guard info.

The logic with (Const + 1 + A) -> ((Const + 1 + A) - 1) is needed to avoid regressions when comparing expressions with guards applied (like BTC/exit count expressions or the value at the last iteration vs value at 2n-to-last iteration). With the current changes, we may apply info to only one of the those expressions, which can lead to pessimizations. One example is if we have (n - 2 != n - 1) and we have {n - 1: umin(1, n-1)}, then we would only rewrite (n -1) -> umin(1, n-1) and fail to simplify the compare to true.

nikic · 2025-10-09T12:41:24Z

llvm/lib/Analysis/ScalarEvolution.cpp

+        // For expressions of the form (Const + A), check if we have guard info
+        // for (Const + 1 + A), and rewrite to ((Const + 1 + A) - 1). This makes
+        // sure we don't loose information when rewriting expressions based on
+        // back-edge taken counts in some cases..


Suggested change

// back-edge taken counts in some cases..

// back-edge taken counts in some cases.

Done thanks

nikic · 2025-10-09T12:45:12Z

llvm/test/Transforms/LoopUnroll/runtime-unroll-assume-no-remainder.ll

 ; RUN: opt < %s -S -passes=loop-unroll -unroll-runtime=true -unroll-runtime-epilog=true -unroll-count=2 | FileCheck %s

 ; Make sure the loop is unrolled without a remainder loop based on an assumption
 ; that the least significant bit is known to be zero.


Aren't we failing the intent of this test now? We now produce an epilogue loop.

The changes here actually was a regression, as we failed to determine the expected multple of the trip count from the guards. To fix that, I think we need to improve the divisibility handling in general: #163021 (included in this PR now)

github-actions · 2025-10-11T21:28:04Z

✅ With the latest revision this PR passed the C/C++ code formatter.

When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * llvm#160012 * llvm#159942 https://alive2.llvm.org/ce/z/YyBvoT

When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * #160012 * #159942 https://alive2.llvm.org/ce/z/YyBvoT PR: #160500

…500) When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * llvm/llvm-project#160012 * llvm/llvm-project#159942 https://alive2.llvm.org/ce/z/YyBvoT PR: llvm/llvm-project#160500

When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * llvm#160012 * llvm#159942 https://alive2.llvm.org/ce/z/YyBvoT PR: llvm#160500

fhahn · 2025-11-03T20:49:27Z

ping :)

rebased on top of main after landing #163250

When collecting information from loop guards, use UMax(1, %b - %a) for ICMP NE %a, %b, if neither are constant. This improves results in some cases, and will be even more useful together with * llvm#160012 * llvm#159942 https://alive2.llvm.org/ce/z/YyBvoT PR: llvm#160500 (cherry picked from commit 2d02726)

nikic

LGTM assuming no interesting compile-time impact.

When re-writing SCEVAddExprs to apply information from guards, check if we have information for the expression itself. If so, apply it. When we have an expression of the form (Const + A), check if we have have guard info for (Const + 1 + A) and use it. This is needed to avoid regressions in a few cases, where we have BTCs with a subtracted constant. Rewriting expressions could cause regressions, e.g. when comparing 2 SCEV expressions where we are only able to rewrite one side, but I could not find any cases where this happens more with this patch in practice. Depends on llvm#160012 (included in PR) Proofs for some of the test changes: https://alive2.llvm.org/ce/z/RPX6t_

fhahn mentioned this pull request Sep 20, 2025

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

zyw-bot mentioned this pull request Sep 20, 2025

pre-commit: PR159942 dtcxzyw/llvm-opt-benchmark#2843

Closed

fhahn force-pushed the scev-rewrite-add-expr branch 2 times, most recently from 74447bb to 237861f Compare September 20, 2025 20:27

zyw-bot mentioned this pull request Sep 20, 2025

pre-commit: PR159942 dtcxzyw/llvm-opt-benchmark#2844

Closed

fhahn force-pushed the scev-rewrite-add-expr branch from 237861f to fd7152b Compare September 21, 2025 20:33

zyw-bot mentioned this pull request Sep 21, 2025

pre-commit: PR159942 dtcxzyw/llvm-opt-benchmark#2845

Open

fhahn force-pushed the scev-rewrite-add-expr branch from fd7152b to 8d4a5e6 Compare September 22, 2025 10:34

fhahn marked this pull request as ready for review September 22, 2025 10:39

fhahn requested a review from nikic as a code owner September 22, 2025 10:39

llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Sep 22, 2025

fhahn requested review from efriedma-quic and preames September 22, 2025 10:39

fhahn commented Sep 22, 2025

View reviewed changes

fhahn mentioned this pull request Sep 24, 2025

[SCEV] Collect guard info for ICMP NE w/o constants. #160500

Merged

nikic reviewed Oct 7, 2025

View reviewed changes

fhahn force-pushed the scev-rewrite-add-expr branch from 8d4a5e6 to 5180e68 Compare October 9, 2025 10:27

fhahn commented Oct 9, 2025

View reviewed changes

nikic reviewed Oct 9, 2025

View reviewed changes

fhahn force-pushed the scev-rewrite-add-expr branch from 5180e68 to 0710344 Compare October 11, 2025 21:25

zyw-bot mentioned this pull request Oct 11, 2025

pre-commit: PR159942 dtcxzyw/llvm-opt-benchmark#2922

Open

fhahn mentioned this pull request Oct 13, 2025

[SCEV] Preserve divisor info when adding guard info for ICMP_NE via Sub. #163250

Merged

fhahn force-pushed the scev-rewrite-add-expr branch from 0710344 to 3a98db6 Compare November 2, 2025 15:26

nikic approved these changes Dec 1, 2025

View reviewed changes

fhahn added 4 commits December 1, 2025 16:46

!fixup getMinsSCEV -> getAddExpr + getMinusOne.

5a10df7

!fixup undo test changes, remove stray period.

6e051e6

!fixup update after rebase

9a80348

fhahn force-pushed the scev-rewrite-add-expr branch from 3a98db6 to 9a80348 Compare December 1, 2025 16:56

zyw-bot mentioned this pull request Dec 1, 2025

pre-commit: PR159942 dtcxzyw/llvm-opt-benchmark#3098

Open

	// back-edge taken counts in some cases..
	// back-edge taken counts in some cases.

[SCEV] Rewrite more SCEVAddExpr when applying guards. #159942

Are you sure you want to change the base?

[SCEV] Rewrite more SCEVAddExpr when applying guards. #159942

Conversation

fhahn commented Sep 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 22, 2025

Uh oh!

llvmbot commented Sep 22, 2025

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

fhahn Sep 22, 2025

Choose a reason for hiding this comment

Uh oh!

nikic commented Oct 7, 2025

Uh oh!

nikic Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

fhahn Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

nikic Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

fhahn Oct 11, 2025

Choose a reason for hiding this comment

Uh oh!

nikic Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

fhahn Oct 11, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Oct 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fhahn commented Nov 3, 2025

Uh oh!

nikic left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fhahn commented Sep 20, 2025 •

edited

Loading

github-actions bot commented Oct 11, 2025 •

edited

Loading