[SimplifyIndVar] ICMP predicate conversion to EQ/NE #144945

SergeyShch01 · 2025-06-19T19:04:15Z

Changes to convert ICMP predicate to EQ/NE when possible to facilitate better work of OSR (which makes limited changes to the IVUse formula for other predicates). Regression tests are updated accordingly.

github-actions · 2025-06-19T19:04:32Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-06-19T19:04:58Z

@llvm/pr-subscribers-pgo
@llvm/pr-subscribers-backend-systemz

@llvm/pr-subscribers-llvm-transforms

Author: Sergey Shcherbinin (SergeyShch01)

Changes

Changes to convert ICMP predicate to EQ/NE when possible to facilitate better work of OSR (which makes limited changes to the IVUse formula for other predicates). Regression tests are updated accordingly.

Patch is 27.44 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144945.diff

10 Files Affected:

(modified) llvm/lib/Transforms/Utils/SimplifyIndVar.cpp (+148-15)
(modified) llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll (+1-1)
(modified) llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll (+4-5)
(modified) llvm/test/Transforms/IndVarSimplify/X86/pr24356.ll (+33-4)
(modified) llvm/test/Transforms/IndVarSimplify/ada-loops.ll (+1-1)
(modified) llvm/test/Transforms/PGOProfile/Inputs/thinlto_cs.proftext (+12-40)
(modified) llvm/test/Transforms/PGOProfile/thinlto_cspgo_use.ll (+2-2)
(modified) llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll (+1-1)
(modified) llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll (+6-6)
(modified) llvm/test/Transforms/PhaseOrdering/X86/hoist-load-of-baseptr.ll (+14-14)

diff --git a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
index 43264cce73719..8fb310a71fa78 100644
--- a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
@@ -97,6 +97,7 @@ namespace {
     bool eliminateIVUser(Instruction *UseInst, Instruction *IVOperand);
     bool makeIVComparisonInvariant(ICmpInst *ICmp, Instruction *IVOperand);
     void eliminateIVComparison(ICmpInst *ICmp, Instruction *IVOperand);
+    bool forceEqualityForICmp(ICmpInst *ICmp, Instruction *IVOperand);
     void simplifyIVRemainder(BinaryOperator *Rem, Instruction *IVOperand,
                              bool IsSigned);
     void replaceRemWithNumerator(BinaryOperator *Rem);
@@ -244,6 +245,128 @@ bool SimplifyIndvar::makeIVComparisonInvariant(ICmpInst *ICmp,
   return true;
 }
 
+/// Try to change predicate of ICmp to EQ/NE to facilitate better work of OSR.
+/// This can be done only if all possible IV values but one lead to the same
+/// produced comparison result, while the 'chosen one' value gives the opposite
+/// result.
+bool SimplifyIndvar::forceEqualityForICmp(ICmpInst *ICmp,
+                                          Instruction *IVOperand) {
+  if (ICmp->isEquality()) {
+    // nothing to do
+    return false;
+  }
+
+  unsigned BoundOperandIdx = IVOperand == ICmp->getOperand(0) ? 1 : 0;
+  const SCEV *BoundSCEV = SE->getSCEV(ICmp->getOperand(BoundOperandIdx));
+  const SCEVConstant *BoundC = dyn_cast<SCEVConstant>(BoundSCEV);
+  CmpInst::Predicate OrigPredicate = ICmp->getPredicate();
+  CmpInst::Predicate NewPredicate = CmpInst::BAD_ICMP_PREDICATE;
+  Type *Ty = IVOperand->getType();
+  APInt NewBoundA;
+
+  if (BoundC) {
+    // Try to find the 'chosen one' value basing on predicate type and bound
+    const APInt &BoundA = BoundC->getAPInt();
+    ConstantRange ExactCR =
+        ConstantRange::makeExactICmpRegion(OrigPredicate, BoundA);
+    if (!ExactCR.getEquivalentICmp(NewPredicate, NewBoundA)) {
+      NewPredicate = CmpInst::BAD_ICMP_PREDICATE;
+    }
+  }
+
+  if (!ICmpInst::isEquality(NewPredicate)) {
+    const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SE->getSCEV(IVOperand));
+    if (!AR) {
+      return false;
+    }
+    const SCEVConstant *IVStart = dyn_cast<SCEVConstant>(AR->getStart());
+    const SCEVConstant *IVStep =
+        dyn_cast<SCEVConstant>(AR->getStepRecurrence(*SE));
+    if (!IVStart || !IVStep || !IVStep->getValue()->getValue()) {
+      return false;
+    }
+
+    if (BoundC) {
+      // Check to see the 'chosen one' value is the IV start value
+      bool HasNoWrap = ICmpInst::isSigned(OrigPredicate)
+                           ? AR->hasNoSignedWrap()
+                           : AR->hasNoUnsignedWrap();
+      if (HasNoWrap) {
+        const DataLayout &DL = ICmp->getParent()->getDataLayout();
+        Constant *SecondIterIV =
+            ConstantInt::get(Ty, IVStart->getAPInt() + IVStep->getAPInt());
+        Constant *FirstIterResult = ConstantFoldCompareInstOperands(
+            OrigPredicate, IVStart->getValue(), BoundC->getValue(), DL);
+        Constant *SecondIterResult = ConstantFoldCompareInstOperands(
+            OrigPredicate, SecondIterIV, BoundC->getValue(), DL);
+        if (FirstIterResult != SecondIterResult) {
+          NewBoundA = IVStart->getAPInt();
+          NewPredicate = FirstIterResult->isAllOnesValue() ? CmpInst::ICMP_EQ
+                                                           : CmpInst::ICMP_NE;
+        }
+      }
+    }
+
+    if (!ICmpInst::isEquality(NewPredicate)) {
+      // Check to see the 'chosen one' value is the very last IV value.
+      // To put it differently, check to see if ICmp directly or indirectly
+      // defines maximum loop trip count (or simply has aligned behavior by
+      // accident). This is different from loop exit condition rewriting as here
+      // not only ICmp instructions directly writing to exiting branch are
+      // considered.
+
+      // check to see if max trip count and IV parameters are constant
+      const SCEVConstant *MaxBackCount =
+          dyn_cast<SCEVConstant>(SE->getConstantMaxBackedgeTakenCount(L));
+      if (!MaxBackCount) {
+        return false;
+      }
+
+      // compute the number of consecutive iterations in which produced
+      // predicate value will be the same
+      bool ExitIfTrue = false;
+      auto EL = SE->computeExitLimitFromCond(L, ICmp, ExitIfTrue, false);
+      const SCEVConstant *SameIterCount =
+          dyn_cast<SCEVConstant>(EL.ExactNotTaken);
+      if (!SameIterCount || SameIterCount->getValue()->isZero()) {
+        ExitIfTrue = !ExitIfTrue;
+        EL = SE->computeExitLimitFromCond(L, ICmp, ExitIfTrue, false);
+        SameIterCount = dyn_cast<SCEVConstant>(EL.ExactNotTaken);
+      }
+
+      if (SameIterCount != MaxBackCount) {
+        // ICmp isn't aligned with maximum trip count
+        return false;
+      }
+
+      unsigned IVBitWigth = IVStep->getAPInt().getBitWidth();
+      unsigned CountBitWigth = SameIterCount->getAPInt().getBitWidth();
+      APInt SameIterCountA = SameIterCount->getAPInt();
+      if (IVBitWigth < CountBitWigth) {
+        SameIterCountA = SameIterCountA.trunc(IVBitWigth);
+      } else if (IVBitWigth > CountBitWigth) {
+        SameIterCountA = SameIterCountA.zext(IVBitWigth);
+      }
+      NewBoundA = IVStart->getAPInt() + (IVStep->getAPInt() * SameIterCountA);
+      NewPredicate = ExitIfTrue ? ICmpInst::ICMP_EQ : ICmpInst::ICMP_NE;
+    }
+  }
+
+  if (!TTI->isLegalICmpImmediate(NewBoundA.getSExtValue())) {
+    return false;
+  }
+
+  LLVM_DEBUG(dbgs() << "INDVARS: Force EQ/NE predicate for max trip count: "
+                    << *ICmp << '\n');
+
+  assert(Ty->getPrimitiveSizeInBits() == NewBoundA.getBitWidth() &&
+         "bit widths should be aligned");
+  ICmp->setOperand(BoundOperandIdx, ConstantInt::get(Ty, NewBoundA));
+  ICmp->setPredicate(NewPredicate);
+
+  return true;
+}
+
 /// SimplifyIVUsers helper for eliminating useless
 /// comparisons against an induction variable.
 void SimplifyIndvar::eliminateIVComparison(ICmpInst *ICmp,
@@ -267,6 +390,7 @@ void SimplifyIndvar::eliminateIVComparison(ICmpInst *ICmp,
   // If the condition is always true or always false in the given context,
   // replace it with a constant value.
   SmallVector<Instruction *, 4> Users;
+  bool IsDead = false;
   for (auto *U : ICmp->users())
     Users.push_back(cast<Instruction>(U));
   const Instruction *CtxI = findCommonDominator(Users, *DT);
@@ -274,26 +398,35 @@ void SimplifyIndvar::eliminateIVComparison(ICmpInst *ICmp,
     SE->forgetValue(ICmp);
     ICmp->replaceAllUsesWith(ConstantInt::getBool(ICmp->getContext(), *Ev));
     DeadInsts.emplace_back(ICmp);
+    IsDead = true;
     LLVM_DEBUG(dbgs() << "INDVARS: Eliminated comparison: " << *ICmp << '\n');
   } else if (makeIVComparisonInvariant(ICmp, IVOperand)) {
-    // fallthrough to end of function
-  } else if (ICmpInst::isSigned(OriginalPred) &&
-             SE->isKnownNonNegative(S) && SE->isKnownNonNegative(X)) {
+    IsDead = true;
+  } else {
     // If we were unable to make anything above, all we can is to canonicalize
     // the comparison hoping that it will open the doors for other
-    // optimizations. If we find out that we compare two non-negative values,
-    // we turn the instruction's predicate to its unsigned version. Note that
-    // we cannot rely on Pred here unless we check if we have swapped it.
-    assert(ICmp->getPredicate() == OriginalPred && "Predicate changed?");
-    LLVM_DEBUG(dbgs() << "INDVARS: Turn to unsigned comparison: " << *ICmp
-                      << '\n');
-    ICmp->setPredicate(ICmpInst::getUnsignedPredicate(OriginalPred));
-    ICmp->setSameSign();
-  } else
-    return;
+    // optimizations.
+    if (ICmpInst::isSigned(OriginalPred) && SE->isKnownNonNegative(S) &&
+        SE->isKnownNonNegative(X)) {
+      // If we find out that we compare two non-negative values,
+      // we turn the instruction's predicate to its unsigned version. Note that
+      // we cannot rely on Pred here unless we check if we have swapped it.
+      assert(ICmp->getPredicate() == OriginalPred && "Predicate changed?");
+      LLVM_DEBUG(dbgs() << "INDVARS: Turn to unsigned comparison: " << *ICmp
+                        << '\n');
+      ICmp->setPredicate(ICmpInst::getUnsignedPredicate(OriginalPred));
+      ICmp->setSameSign();
+      Changed = true;
+    }
+    if (forceEqualityForICmp(ICmp, IVOperand)) {
+      Changed = true;
+    }
+  }
 
-  ++NumElimCmp;
-  Changed = true;
+  if (IsDead) {
+    NumElimCmp++;
+    Changed = true;
+  }
 }
 
 bool SimplifyIndvar::eliminateSDiv(BinaryOperator *SDiv) {
diff --git a/llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll b/llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll
index 409622c255ea0..af0128af5b0c8 100644
--- a/llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll
+++ b/llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll
@@ -13,7 +13,7 @@ define i32 @guards_applied_to_add_rec(ptr %dst) {
 ; CHECK-NEXT:    [[OUTER_IV_0:%.*]] = phi i32 [ 2, %[[ENTRY]] ], [ [[OUTER_IV_0_NEXT:%.*]], %[[OUTER_LATCH:.*]] ]
 ; CHECK-NEXT:    [[OUTER_IV_1:%.*]] = phi i32 [ 1, %[[ENTRY]] ], [ [[OUTER_IV_0]], %[[OUTER_LATCH]] ]
 ; CHECK-NEXT:    [[SHR28:%.*]] = lshr i32 [[OUTER_IV_1]], 1
-; CHECK-NEXT:    [[PRE:%.*]] = icmp samesign ult i32 [[OUTER_IV_1]], 2
+; CHECK-NEXT:    [[PRE:%.*]] = icmp samesign eq i32 [[OUTER_IV_1]], 1
 ; CHECK-NEXT:    br i1 [[PRE]], label %[[OUTER_LATCH]], label %[[INNER_PREHEADER:.*]]
 ; CHECK:       [[INNER_PREHEADER]]:
 ; CHECK-NEXT:    [[TMP0:%.*]] = zext i32 [[SHR28]] to i64
diff --git a/llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll b/llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll
index 257816650017a..4e280c628a6a5 100644
--- a/llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll
+++ b/llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll
@@ -97,7 +97,7 @@ define void @test2(ptr %a, ptr %b, i8 %limit, i1 %arg) {
 ; CHECK-LABEL: @test2(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[CONV:%.*]] = zext i8 [[LIMIT:%.*]] to i32
-; CHECK-NEXT:    br i1 %arg, label [[FOR_COND1_PREHEADER_PREHEADER:%.*]], label [[FOR_COND1_PREHEADER_US_PREHEADER:%.*]]
+; CHECK-NEXT:    br i1 [[ARG:%.*]], label [[FOR_COND1_PREHEADER_PREHEADER:%.*]], label [[FOR_COND1_PREHEADER_US_PREHEADER:%.*]]
 ; CHECK:       for.cond1.preheader.us.preheader:
 ; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[CONV]], i32 1)
 ; CHECK-NEXT:    br label [[FOR_COND1_PREHEADER_US:%.*]]
@@ -110,7 +110,7 @@ define void @test2(ptr %a, ptr %b, i8 %limit, i1 %arg) {
 ; CHECK-NEXT:    br label [[FOR_INC13_US]]
 ; CHECK:       for.inc13.us:
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT4]] = add nuw nsw i64 [[INDVARS_IV3]], 1
-; CHECK-NEXT:    [[EXITCOND6:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT4]], 4
+; CHECK-NEXT:    [[EXITCOND6:%.*]] = icmp samesign ne i64 [[INDVARS_IV_NEXT4]], 4
 ; CHECK-NEXT:    br i1 [[EXITCOND6]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_END_LOOPEXIT1:%.*]]
 ; CHECK:       for.body4.us:
 ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, [[FOR_BODY4_LR_PH_US]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY4_US:%.*]] ]
@@ -237,8 +237,7 @@ define i32 @test4(i32 %a) {
 ; CHECK-NEXT:    [[CONV3:%.*]] = trunc i32 [[OR]] to i8
 ; CHECK-NEXT:    [[CALL:%.*]] = call i32 @fn1(i8 signext [[CONV3]])
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i32 [[INDVARS_IV]], -1
-; CHECK-NEXT:    [[TMP0:%.*]] = trunc nuw i32 [[INDVARS_IV_NEXT]] to i8
-; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[TMP0]], -14
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i32 [[INDVARS_IV_NEXT]], 242
 ; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK:       for.end:
 ; CHECK-NEXT:    ret i32 0
@@ -517,8 +516,8 @@ define i32 @test10(i32 %v) {
 ; CHECK-NEXT:    [[TMP0:%.*]] = mul nsw i64 [[INDVARS_IV]], -1
 ; CHECK-NEXT:    [[TMP1:%.*]] = icmp eq i64 [[TMP0]], [[SEXT]]
 ; CHECK-NEXT:    call void @consume.i1(i1 [[TMP1]])
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp samesign ne i64 [[INDVARS_IV_NEXT]], 11
 ; CHECK-NEXT:    call void @consume.i64(i64 [[TMP0]])
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], 11
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[LEAVE:%.*]]
 ; CHECK:       leave:
 ; CHECK-NEXT:    ret i32 22
diff --git a/llvm/test/Transforms/IndVarSimplify/X86/pr24356.ll b/llvm/test/Transforms/IndVarSimplify/X86/pr24356.ll
index f2d938f6452d3..56b3695ba16b8 100644
--- a/llvm/test/Transforms/IndVarSimplify/X86/pr24356.ll
+++ b/llvm/test/Transforms/IndVarSimplify/X86/pr24356.ll
@@ -1,3 +1,4 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt -S -passes=indvars -indvars-predicate-loops=0  < %s | FileCheck %s
 
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
@@ -7,15 +8,43 @@ target triple = "x86_64-apple-macosx10.10.0"
 
 ; Function Attrs: nounwind ssp uwtable
 define void @fn1() {
-; CHECK-LABEL: @fn1(
+; CHECK-LABEL: define void @fn1() {
+; CHECK-NEXT:  [[BB:.*]]:
+; CHECK-NEXT:    br label %[[BB4_PREHEADER:.*]]
+; CHECK:       [[BB4_PREHEADER]]:
+; CHECK-NEXT:    [[B_03:%.*]] = phi i8 [ 0, %[[BB]] ], [ [[TMP17:%.*]], %[[BB16:.*]] ]
+; CHECK-NEXT:    [[TMP9:%.*]] = icmp ne i8 [[B_03]], 0
+; CHECK-NEXT:    br i1 [[TMP9]], label %[[BB4_PREHEADER_BB18_LOOPEXIT_SPLIT_CRIT_EDGE:.*]], label %[[BB4_PREHEADER_BB4_PREHEADER_SPLIT_CRIT_EDGE:.*]]
+; CHECK:       [[BB4_PREHEADER_BB4_PREHEADER_SPLIT_CRIT_EDGE]]:
+; CHECK-NEXT:    br label %[[BB4_PREHEADER_SPLIT:.*]]
+; CHECK:       [[BB4_PREHEADER_BB18_LOOPEXIT_SPLIT_CRIT_EDGE]]:
+; CHECK-NEXT:    store i32 0, ptr @a, align 4
+; CHECK-NEXT:    br label %[[BB18_LOOPEXIT_SPLIT:.*]]
+; CHECK:       [[BB4_PREHEADER_SPLIT]]:
+; CHECK-NEXT:    br label %[[BB7:.*]]
+; CHECK:       [[BB4:.*]]:
+; CHECK-NEXT:    br i1 false, label %[[BB7]], label %[[BB16]]
+; CHECK:       [[BB7]]:
+; CHECK-NEXT:    br i1 false, label %[[BB18_LOOPEXIT:.*]], label %[[BB4]]
+; CHECK:       [[BB16]]:
+; CHECK-NEXT:    [[TMP17]] = add nuw nsw i8 [[B_03]], -1
+; CHECK-NEXT:    br i1 false, label %[[BB18_LOOPEXIT1:.*]], label %[[BB4_PREHEADER]]
+; CHECK:       [[BB18_LOOPEXIT]]:
+; CHECK-NEXT:    br label %[[BB18_LOOPEXIT_SPLIT]]
+; CHECK:       [[BB18_LOOPEXIT_SPLIT]]:
+; CHECK-NEXT:    br label %[[BB18:.*]]
+; CHECK:       [[BB18_LOOPEXIT1]]:
+; CHECK-NEXT:    [[TMP14_LCSSA5_LCSSA:%.*]] = phi i32 [ 1, %[[BB16]] ]
+; CHECK-NEXT:    store i32 [[TMP14_LCSSA5_LCSSA]], ptr @a, align 4
+; CHECK-NEXT:    br label %[[BB18]]
+; CHECK:       [[BB18]]:
+; CHECK-NEXT:    ret void
+;
 bb:
   br label %bb4.preheader
 
 bb4.preheader:                                    ; preds = %bb, %bb16
-; CHECK-LABEL:  bb4.preheader:
   %b.03 = phi i8 [ 0, %bb ], [ %tmp17, %bb16 ]
-; CHECK: %tmp9 = icmp ugt i8 %b.03, 1
-; CHECK-NOT: %tmp9 = icmp ugt i8 0, 1
 
   %tmp9 = icmp ugt i8 %b.03, 1
   br i1 %tmp9, label %bb4.preheader.bb18.loopexit.split_crit_edge, label %bb4.preheader.bb4.preheader.split_crit_edge
diff --git a/llvm/test/Transforms/IndVarSimplify/ada-loops.ll b/llvm/test/Transforms/IndVarSimplify/ada-loops.ll
index 83f2f212a4385..3111b884d686c 100644
--- a/llvm/test/Transforms/IndVarSimplify/ada-loops.ll
+++ b/llvm/test/Transforms/IndVarSimplify/ada-loops.ll
@@ -133,7 +133,7 @@ define void @kinds__urangezero(ptr nocapture %a) nounwind {
 ; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr [21 x i32], ptr [[A]], i32 0, i32 [[TMP4]]
 ; CHECK-NEXT:    store i32 0, ptr [[TMP5]], align 4
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i32 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i32 [[INDVARS_IV_NEXT]], 31
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp samesign eq i32 [[INDVARS_IV_NEXT]], 31
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[RETURN:%.*]], label [[BB]]
 ; CHECK:       return:
 ; CHECK-NEXT:    ret void
diff --git a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_cs.proftext b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_cs.proftext
index 1b9f19e7f7fa5..f9ad9a0e2fba4 100644
--- a/llvm/test/Transforms/PGOProfile/Inputs/thinlto_cs.proftext
+++ b/llvm/test/Transforms/PGOProfile/Inputs/thinlto_cs.proftext
@@ -1,72 +1,44 @@
 # CSIR level Instrumentation Flag
 :csir
-cond.llvm.11253644763537639171
-# Func Hash:
-1152921517491748863
-# Num Counters:
-1
-# Counter Values:
-200000
-
-foo
-# Func Hash:
-1720106746050921044
-# Num Counters:
-2
-# Counter Values:
-100000
-1
-
 bar
 # Func Hash:
 1299757151682747028
 # Num Counters:
 2
 # Counter Values:
-0
-0
-
-bar
-# Func Hash:
-29667547796
-# Num Counters:
-2
-# Counter Values:
 100000
 100000
 
 main
 # Func Hash:
-1152921517491748863
+1895182923573755903
 # Num Counters:
 1
 # Counter Values:
 1
 
-main
+cspgo_bar.c;clobber
 # Func Hash:
 1895182923573755903
 # Num Counters:
 1
 # Counter Values:
-1
+200000
 
-cspgo.c:foo
+cspgo_bar.c;cond
 # Func Hash:
-1720106746050921044
+1895182923573755903
 # Num Counters:
-4
-# Counter Values:
-100000
-100000
-0
 1
+# Counter Values:
+200000
 
-cspgo_bar.c:cond
+cspgo.c;foo
 # Func Hash:
-12884901887
+2216626667076672412
 # Num Counters:
-1
+2
 # Counter Values:
-200000
+100000
+1
 
diff --git a/llvm/test/Transforms/PGOProfile/thinlto_cspgo_use.ll b/llvm/test/Transforms/PGOProfile/thinlto_cspgo_use.ll
index 6d35946a28ff3..3dcad34205d4b 100644
--- a/llvm/test/Transforms/PGOProfile/thinlto_cspgo_use.ll
+++ b/llvm/test/Transforms/PGOProfile/thinlto_cspgo_use.ll
@@ -15,8 +15,8 @@
 
 ; CSUSE: {{![0-9]+}} = !{i32 1, !"ProfileSummary", {{![0-9]+}}}
 ; CSUSE: {{![0-9]+}} = !{i32 1, !"CSProfileSummary", {{![0-9]+}}}
-; CSUSE-DAG: {{![0-9]+}} = !{!"branch_weights", i32 100000, i32 0}
-; CSUSE-DAG: {{![0-9]+}} = !{!"branch_weights", i32 0, i32 100000}
+; CSUSE-DAG: {{![0-9]+}} = !{!"branch_weights", i32 100000, i32 100000}
+; CSUSE-DAG: {{![0-9]+}} = !{!"branch_weights", i32 1, i32 100000}
 
 source_filename = "cspgo.c"
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
diff --git a/llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll b/llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll
index eda54d999a79f..3340862ca4cfb 100644
--- a/llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll
+++ b/llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll
@@ -156,7 +156,7 @@ define void @cond_select_loop(ptr noalias nocapture noundef readonly %a, ptr noa
 ; CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds nuw float, ptr [[C]], i64 [[I_07]]
 ; CHECK-NEXT:    store float [[COND]], ptr [[ARRAYIDX4]], align 4
 ; CHECK-NEXT:    [[INC]] = add nuw nsw i64 [[I_07]], 1
-; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[INC]], 1000
+; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp samesign eq i64 [[I_07]], 999
 ; CHECK-NEXT:    br i1 [[EXITCOND_NOT]], label [[FOR_END:%.*]], label [[FOR_BODY]]
 ; CHECK:       for.end:
 ; CHECK-NEXT:    ret void
diff --git a/llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll b/llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll
index 5386bf939918a..2fc739f2a492b 100644
--- a/llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll
+++ b/llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll
@@ -52,8 +52,8 @@ define dso_local zeroext i32 @foo(ptr noundef %a) #0 {
 ; CHECK-NEXT:    [[TMP7:%.*]] = load i32, ptr [[ADD_PTR_7]], align 4, !tbaa [[TBAA3]]
 ; CHECK-NEXT:    [[ADD_7]] = add i32 [[TMP7]], [[ADD_6]]
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT_7]] = add nuw nsw i64 [[INDVARS_IV]], 8
-; CHECK-NEXT:    [[EXITCOND_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_7]], 32
-; CHECK-NEXT:    br i1 [[EXITCOND_NOT_7]], label [[FOR_BODY4_1:%.*]], label [[FOR_BODY4]], !llvm.loop [[LOOP7:![0-9]+]]
+; CHECK-NEXT:    [[CMP2_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV]], 24
+; CHECK-NEXT:    br i1 [[CMP2_NOT_7]], label [[FOR_BODY4_1:%.*]], label [[FOR_BODY4]], !llvm.loop [[LOOP7:![0-9]+]]
 ; CHECK:       for.body4.1:
 ; CHECK-NEXT:    [[INDVARS_IV_1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_1_7:%.*]], [[FOR_BODY4_1]] ], [ 0, [[FOR_BODY4]] ]
 ; CHECK-NEXT:    [[SUM_11_1:%.*]] = phi i32 [ [[ADD_1_7:%.*]], [[FOR_BODY4_1]] ], [ [[ADD_7]], [[FOR_BODY4]] ]
@@ -91,8 +91,8 @@ define dso_local zeroext i32 @foo(ptr noundef %a) #0 {
 ; CHECK-NEXT:    [[TMP23:%.*]] = shl i32 [[TMP22]], 1
 ; CHECK-NEXT:    [[ADD_1_7]] = add i32 [[TMP23]], [[SUM_11_1]]
 ; ...
[truncated]

nikic · 2025-06-19T19:56:55Z

What's OSR?

SergeyShch01 · 2025-06-20T04:09:40Z

What's OSR?

OSR stands for Operation Strength Reduction (classical optimization reducing number of induction variables in a loop and optimizing consumers of induction variables) done by LoopStrengthReduce pass

…ible to facilitate better work of OSR (which makes limited changes to the IVUse formula for other predicates). Regression tests are updated accordingly.

SergeyShch01 · 2025-06-24T03:41:10Z

@preames Could you please review this small patch?

preames

This looks like this is duplicates large parts of LFTR without the safety checks. Please explain why LFTR doesn't handle the cases you care about.

SergeyShch01 · 2025-06-24T15:10:59Z

This looks like this is duplicates large parts of LFTR without the safety checks. Please explain why LFTR doesn't handle the cases you care about.

This patch and LFTR have intersection - but it's not a duplicate. They together cover all reasonable target cases, while each of them alone addresses only some part.

LFTR:

works w/ loop exits only
Supports a single case when a 'unique condition' is produced by icmp: at last loop iteration
works w/ any kind of loop exit for which trip count can be calculated w/o high cost expansion
reuses existing IV suitable for loop counter - if it's missing then LFTR isn't applied

This patch:

works w/ any icmp (not necessarily writing to exit branch) - so they can be unaligned w/ loop exits
Supports three cases when a 'unique condition' is produced by icmp: at first iteration, at last iteration, when IV value is equal to second argument of icmp
works only w/ icmp reading IV
keep icmp using its original IV

The original motivation for this patch was the following loop where two loop exits were if-converted by CFGSimplify and then LFTR failed to calculate loop trip count. And also I generalized the idea and supported two other cases when EQ/NE predicate can be used.

// motivation example
void test(int* data) {
 for (unsigned i = 1, j = 2; (data[i] & 8) && (i < 300); i++, j+=2) 
     data[j] = data[j - 1];
}

SergeyShch01 · 2025-06-27T04:16:27Z

Ping

SergeyShch01 · 2025-07-07T05:48:59Z

Ping

SergeyShch01 · 2025-07-14T09:21:29Z

Ping

…ible to facilitate better work of OSR (which makes limited changes to the IVUse formula for other predicates). Regression tests are updated accordingly.

…oject into force_eq_ne

SergeyShch01 · 2025-07-28T13:57:19Z

@preames Ping

SergeyShch01 · 2025-08-11T16:27:35Z

@dtcxzyw could you please review this change?

SergeyShch01 · 2025-08-19T04:32:01Z

@nikic , @dtcxzyw Dear maintainers, could you please review this change?

SergeyShch01 · 2025-11-06T17:02:27Z

@topperc, @arsenm, @dtcxzyw, @artagnon

Hello, sorry to bother you, but I have a change to SimplifyIndVar that hasn't been reviewed for the last four months. I just selected you as the latest contributor to this file.

Could you please review this change? Thanks!

preames · 2025-12-02T19:11:57Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

+      // compute the number of consecutive iterations in which produced
+      // predicate value will be the same
+      bool ExitIfTrue = false;
+      auto EL = SE->computeExitLimitFromCond(L, ICmp, ExitIfTrue, false);


From the docs: "Compute the number of times the backedge of the specified loop will execute if its exit condition were a conditional branch of ExitCond.". Nothing in this code ensures that the condition is branched on by a loop exiting branch. As such, the use here is invalid.

The major case which comes up here is when the condition produces poison on some iteration, but is only possible branched on before that iteration. (i.e. the condition is used down some conditional path)

Please note that the statement from the docs doesn't describe a pre-condition or a requirement. It actually describes how the interface works: it calculates the number of iterations, assuming a hypothetical situation where a specified condition controls the exit from the loop.

Actually the interface works w/ the methmatical abstraction which looks for the first iteration at which the condition evaluation gives a different result. This is exactly what is needed for this particular use case, so to my mind it's correct to utilize this interface here.

The major case which comes up here is when the condition produces poison on some iteration, but is only possible branched on before that iteration. (i.e. the condition is used down some conditional path)

If the condition produces poison then it can do so only after the iteration at which hypothetical branch controlled by it exits the loop (as in opposite case SE->computeExitLimitFromCond doesn't return correct value). But this is totally ok as the value returned by this interface is only compared w/ the loop trip count and cases when they aren't equal are dropped (so we shouldn't care about iterations after loop exit).

preames · 2025-12-02T19:14:45Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp


-  ++NumElimCmp;
-  Changed = true;
+  if (IsDead) {


The choice to have the same sign case no longer increment NumElimCmp is an unrelated change, and should be done separately if desired.

See #170514

Thanks for pushing this

preames · 2025-12-02T19:16:26Z

llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll

 ; CHECK-NEXT:    [[OUTER_IV_1:%.*]] = phi i32 [ 1, %[[ENTRY]] ], [ [[OUTER_IV_0]], %[[OUTER_LATCH]] ]
 ; CHECK-NEXT:    [[SHR28:%.*]] = lshr i32 [[OUTER_IV_1]], 1
-; CHECK-NEXT:    [[PRE:%.*]] = icmp samesign ult i32 [[OUTER_IV_1]], 2
+; CHECK-NEXT:    [[PRE:%.*]] = icmp samesign eq i32 [[OUTER_IV_1]], 1


This change is highly suspicious. 1 was previously not a value which evaluated to true, and now is?

Here the AR is {1; +; 1}.

Before: AR < 2
After: AR == 1

This looks like a correct transformation

preames · 2025-12-02T19:27:39Z

llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll

 ; CHECK-NEXT:    [[TMP0:%.*]] = mul nsw i64 [[INDVARS_IV]], -1
 ; CHECK-NEXT:    [[TMP1:%.*]] = icmp eq i64 [[TMP0]], [[SEXT]]
 ; CHECK-NEXT:    call void @consume.i1(i1 [[TMP1]])
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp samesign ne i64 [[INDVARS_IV_NEXT]], 11


Something is highly suspicious here - nowhere in your change should we be setting samesign on the ne predate?

LangRef doesn't disallow samesign on ne/eq, but in other spots we seem to be deliberately setting it on equality comparisons in instcombine and CVP. We should probably figure out why.

As an aside, it does look like we have a small improvement which could be separated out here. We're not setting samesign on an predicate which is already unsigned, but both operands are known positive. We do this in e.g. instcombine and CVP. Doing it here seems straight forward except for a possible compile time concern?

The small improvement is up for review here: #170363. I've got a couple NFCs planned, but that's the functional change.

The samesign on ne issue is likely just needing to drop samesign when changing the predicate in your new routine. I'd missed in previous comment that we could first do same sign, then fall into your new routine.

This change has gone after your commit

preames · 2025-12-02T19:33:32Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

+
+  if (!ICmpInst::isEquality(NewPredicate)) {
+    const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SE->getSCEV(IVOperand));
+    if (!AR) {


AddRec could be in a different loop than ICmp. Either a parent, or a sibling.

Thanks, fixed

preames · 2025-12-02T19:44:50Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

+            OrigPredicate, IVStart->getValue(), BoundC->getValue(), DL);
+        Constant *SecondIterResult = ConstantFoldCompareInstOperands(
+            OrigPredicate, SecondIterIV, BoundC->getValue(), DL);
+        if (FirstIterResult != SecondIterResult) {


This is an interesting little bit of logic. If I'm following correctly, you're claiming that for an AR which is nuw/nsw, if the non-equality condition changes between iteration 0/1 that the value of that condition on all iterations 2+ must be either a) the value at iteration 1, or poison? I believe that holds.

Though, shouldn't this be a subset of computeExitCountExhaustively? (Ignore the bit about not controlling an exit for the moment.)

Do you actually see this case come up much? I don't think the "first iteration is special" case is one we've optimized much for beyond basic peeling.

If I'm following correctly, you're claiming that for an AR which is nuw/nsw, if the non-equality condition changes between iteration 0/1 that the value of that condition on all iterations 2+ must be either a) the value at iteration 1, or poison? I believe that holds.

yes

Though, shouldn't this be a subset of computeExitCountExhaustively? (Ignore the bit about not controlling an exit for the moment.)

In context of forceEqualityForICmp() this can be ICmp not controlling loop exit.

Do you actually see this case come up much? I don't think the "first iteration is special" case is one we've optimized much for beyond basic peeling.

I don't know the statistics, but for example the change in llvm/test/Transforms/IndVarSimplify/AArch64/loop-guards.ll your asked about above is due to this rule (and there a loop-local branch reads the condition). Please note that the motivation behind this change is not to optimize the first iteration but to optimize the loop body itself (as the comparison will be executed on all loop iterations).

preames · 2025-12-02T23:09:27Z

For future reference, after applying a local fix to avoid setting samesign on equalities, and manually filtering out the improvements from setting samesign on strictly negative operands, I've manually gone through the remainder of the test differences included in the review. It's not clear there's much value demonstrated, in particular, the majority of the tests do involve loop exits, and are thus candidates for LFTR. Several of the test updates appear unlikely to be correct.

@SergeyShch01 - I would suggest rebasing after #170363 has landed. I want to be clear that I am skeptical that this patch will ever land. This is not simply a case of fixing a few small things; the problem is with the approach and the fact that you're proposing duplicating some extremely complicated and risky code. We're not going to do that without compelling reason, and I don't see the examples in tests to be motivating.

On your motivating C example, can you explain what you actual goal is? I can confirm that your patch converts one of signed compare to an equality, but what are you actually hoping to unblock here? You mention computing a trip count, but the loop seems to be unanalyzable both with and without your patch?

…e_eq_ne

github-actions · 2025-12-04T09:55:05Z

🐧 Linux x64 Test Results

186985 tests passed
4924 tests skipped

✅ The build succeeded and all tests passed.

…oject into force_eq_ne

SergeyShch01 · 2025-12-04T16:32:18Z

I would suggest rebasing after #170363 has landed.

ok, done

This is not simply a case of fixing a few small things; the problem is with the approach and the fact that you're proposing duplicating some extremely complicated and risky code

Please note that all introduced logic is actually high-level and can be clearly understood/verified logically. Regular LLVM IR interfaces are used for low-level analysis. Assuming that the regular interfaces are correct then the only potentially dangerous code is that high-level logic which should be simple to verify.

On your motivating C example, can you explain what you actual goal is?

LoopStrengthReducePass is able to optimize only ICmp w/ EQ condition, otherwise it behaves badly. So this change is to facilitate better work of this pass. In my C example this will lead to instruction count reduction.

You mention computing a trip count, but the loop seems to be unanalyzable both with and without your patch?

Yes, this is a reason of why LFTR fails and why we need forceEqualityForICmp()

llvmbot added backend:SystemZ PGO Profile Guided Optimizations llvm:transforms labels Jun 19, 2025

SergeyShch01 added 2 commits June 23, 2025 13:17

[SimplifyIndVar] Changes to convert ICMP predicate to EQ/NE when poss…

836f2d2

…ible to facilitate better work of OSR (which makes limited changes to the IVUse formula for other predicates). Regression tests are updated accordingly.

Typo fixed

4a3233f

SergeyShch01 force-pushed the force_eq_ne branch from 6f4f1b9 to 4a3233f Compare June 23, 2025 09:18

Regression test update to account for rebase

ad728d6

preames requested changes Jun 24, 2025

View reviewed changes

SergeyShch01 requested a review from preames July 1, 2025 19:00

SergeyShch01 and others added 7 commits July 24, 2025 12:21

[SimplifyIndVar] Changes to convert ICMP predicate to EQ/NE when poss…

09e7244

…ible to facilitate better work of OSR (which makes limited changes to the IVUse formula for other predicates). Regression tests are updated accordingly.

Typo fixed

96eda2e

Regression test update to account for rebase

6dfe0c0

Account for rebase

9a3dae9

origin force_eq_ne origin force_eq_origin force_eq_ne into force_eq_ne

b2b5838

Merge branch 'main' into force_eq_ne

9b9a224

Merge branch 'force_eq_ne' of https://github.com/SergeyShch01/llvm-pr…

a1ecebb

…oject into force_eq_ne

SergeyShch01 added 2 commits August 11, 2025 07:22

Merge remote-tracking branch 'upstream/main' into force_eq_ne

9be6ea9

test update after merge

9812774

Merge remote-tracking branch 'upstream/main' into force_eq_ne

0971035

merge w/ upstream

faa06b4

dtcxzyw requested review from antoniofrighetto, arsenm, artagnon, fhahn and topperc November 8, 2025 10:41

preames requested changes Dec 2, 2025

View reviewed changes

Merge branch 'main' of https://github.com/llvm/llvm-project into forc…

4e35257

…e_eq_ne

SergeyShch01 added 3 commits December 4, 2025 04:42

tests updated after merge

d4d7ddd

AddRec loop check is added

616c380

Merge branch 'force_eq_ne' of https://github.com/SergeyShch01/llvm-pr…

ef63b9a

…oject into force_eq_ne

SergeyShch01 requested a review from preames December 4, 2025 16:32

[SimplifyIndVar] ICMP predicate conversion to EQ/NE #144945

Are you sure you want to change the base?

[SimplifyIndVar] ICMP predicate conversion to EQ/NE #144945

Uh oh!

Conversation

SergeyShch01 commented Jun 19, 2025

Uh oh!

github-actions bot commented Jun 19, 2025

Uh oh!

llvmbot commented Jun 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikic commented Jun 19, 2025

Uh oh!

SergeyShch01 commented Jun 20, 2025

Uh oh!

SergeyShch01 commented Jun 24, 2025

Uh oh!

preames left a comment

Choose a reason for hiding this comment

Uh oh!

SergeyShch01 commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SergeyShch01 commented Jun 27, 2025

Uh oh!

SergeyShch01 commented Jul 7, 2025

Uh oh!

SergeyShch01 commented Jul 14, 2025

Uh oh!

SergeyShch01 commented Jul 28, 2025

Uh oh!

SergeyShch01 commented Aug 11, 2025

Uh oh!

SergeyShch01 commented Aug 19, 2025

Uh oh!

SergeyShch01 commented Nov 6, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SergeyShch01 Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

preames Dec 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

preames commented Dec 2, 2025

Uh oh!

github-actions bot commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🐧 Linux x64 Test Results

Uh oh!

SergeyShch01 commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

llvmbot commented Jun 19, 2025 •

edited

Loading

SergeyShch01 commented Jun 24, 2025 •

edited

Loading

SergeyShch01 Dec 4, 2025 •

edited

Loading

preames Dec 2, 2025 •

edited

Loading

github-actions bot commented Dec 4, 2025 •

edited

Loading

SergeyShch01 commented Dec 4, 2025 •

edited

Loading