Conversation

@dtcxzyw (Member) commented Nov 24, 2025

In ConstraintElimination, shr exact cannot be decomposed since we only support integer factors. However, when both sides of a comparison have at least N trailing zeros, we can create an equivalent comparison with shifted operands. This allows us to eliminate the shr exact and provide a more expressive constraint.

This patch handles the following cases as a start:

icmp pred (shr exact X, ShAmtC), (shr exact Y, ShAmtC) --> icmp pred X, Y
icmp pred (shr exact X, ShAmtC), C --> icmp pred X, (C << ShAmtC)

Alive2:
lshr: https://alive2.llvm.org/ce/z/Uf5u-y
ashr: https://alive2.llvm.org/ce/z/W-ESk5

We may further extend this fold to handle divisions/non-constant shamts in the future.

The motivating case is std::vector. Both libstdc++'s and libc++'s implementations use three pointers to represent a vector and compute size and capacity via pointer differences. Since these values have multiple uses in real-world programs, InstCombine cannot eliminate the two patterns above (see a66051c and #122266). In ConstraintElimination, the division (or right shift) prevents it from inferring the relationship between the pointers.

Related issue: #107000. With this patch and nsw flags on pointer differences (See also #161698 (comment)), the reallocation logic can be fully eliminated.

llvm-opt-benchmark: dtcxzyw/llvm-opt-benchmark#3087
Compile-time impact looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=045af21501bee71d36a37e1abcefba1cf28842b9&to=ae3d759820d2051329e6e6a8eec2250a7158fe2f&stat=instructions:u
NOTE: Regressions are caused by losing the range information of the original ashr. Should I put this logic in addFact?

@dtcxzyw dtcxzyw requested a review from fhahn November 24, 2025 15:39
@dtcxzyw dtcxzyw marked this pull request as ready for review November 24, 2025 17:23
@llvmbot (Member) commented Nov 24, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)



Full diff: https://github.com/llvm/llvm-project/pull/169343.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/ConstraintElimination.cpp (+34)
  • (added) llvm/test/Transforms/ConstraintElimination/shr-exact.ll (+198)
diff --git a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
index d347cedb42988..455d1f79d317b 100644
--- a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
@@ -718,6 +718,40 @@ ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
       Pred != CmpInst::ICMP_SLE && Pred != CmpInst::ICMP_SLT)
     return {};
 
+  // Check if we can eliminate right shifts.
+  auto ShiftOperands = [Pred](Value *&Op0, Value *&Op1) {
+    uint64_t ShAmtC;
+    Value *X;
+    if (!match(Op0, m_Exact(m_Shr(m_Value(X), m_ConstantInt(ShAmtC)))))
+      return false;
+    bool IsLShr = cast<BinaryOperator>(Op0)->getOpcode() == Instruction::LShr;
+    if (IsLShr && ICmpInst::isSigned(Pred))
+      return false;
+    // icmp pred (shr exact X, ShAmtC), (shr exact Y, ShAmtC) --> icmp pred X, Y
+    Value *Y;
+    if (match(Op1, m_Exact(m_Shr(m_Value(Y), m_SpecificInt(ShAmtC)))) &&
+        cast<BinaryOperator>(Op0)->getOpcode() ==
+            cast<BinaryOperator>(Op1)->getOpcode()) {
+      Op0 = X;
+      Op1 = Y;
+      return true;
+    }
+    // icmp pred (shr exact X, ShAmtC), C --> icmp pred X, (C << ShAmtC)
+    const APInt *RHSC;
+    if (!match(Op1, m_APInt(RHSC)))
+      return false;
+    bool Overflow = false;
+    APInt NewC = IsLShr ? RHSC->ushl_ov(ShAmtC, Overflow)
+                        : RHSC->sshl_ov(ShAmtC, Overflow);
+    if (Overflow)
+      return false;
+    Op0 = X;
+    Op1 = ConstantInt::get(Op1->getType(), NewC);
+    return true;
+  };
+  if (!ShiftOperands(Op0, Op1))
+    ShiftOperands(Op1, Op0);
+
   SmallVector<ConditionTy, 4> Preconditions;
   bool IsSigned = ForceSignedSystem || CmpInst::isSigned(Pred);
   auto &Value2Index = getValue2Index(IsSigned);
diff --git a/llvm/test/Transforms/ConstraintElimination/shr-exact.ll b/llvm/test/Transforms/ConstraintElimination/shr-exact.ll
new file mode 100644
index 0000000000000..3a5a900d060f7
--- /dev/null
+++ b/llvm/test/Transforms/ConstraintElimination/shr-exact.ll
@@ -0,0 +1,198 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -passes=constraint-elimination -S %s | FileCheck %s
+
+define i1 @precond_icmp_ashr_and_rhsc(i64 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_rhsc(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHR:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[SHR]], 200
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %shr = ashr exact i64 %x, 3
+  %cond = icmp ult i64 %shr, 200
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i64 %x, 9223372036854775800
+  ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_ashr(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_ashr(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHRX:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT:    [[SHRY:%.*]] = ashr exact i64 [[Y]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %shrx = ashr exact i64 %x, 3
+  %shry = ashr exact i64 %y, 3
+  %cond = icmp ult i64 %shrx, %shry
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i64 %x, %y
+  ret i1 %cmp
+}
+
+define i1 @precond_icmp_lshr_and_lshr(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_and_lshr(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHRX:%.*]] = lshr exact i64 [[X]], 3
+; CHECK-NEXT:    [[SHRY:%.*]] = lshr exact i64 [[Y]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %shrx = lshr exact i64 %x, 3
+  %shry = lshr exact i64 %y, 3
+  %cond = icmp ult i64 %shrx, %shry
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i64 %x, %y
+  ret i1 %cmp
+}
+
+; Negative tests
+
+define i1 @precond_icmp_lshr_and_rhsc_overflow(i8 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_and_rhsc_overflow(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHR:%.*]] = lshr exact i8 [[X]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i8 [[SHR]], 60
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], 8
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %shr = lshr exact i8 %x, 3
+  %cond = icmp ult i8 %shr, 60
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i8 %x, 8
+  ret i1 %cmp
+}
+
+
+define i1 @precond_icmp_lshr_unknown_shamt_and_rhsc(i8 %x, i8 %shamt) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_unknown_shamt_and_rhsc(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[SHAMT:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHR:%.*]] = lshr exact i8 [[X]], [[SHAMT]]
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i8 [[SHR]], 8
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], 8
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %shr = lshr exact i8 %x, %shamt
+  %cond = icmp ult i8 %shr, 8
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i8 %x, 8
+  ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_rhsc_overflow(i8 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_rhsc_overflow(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHR:%.*]] = ashr exact i8 [[X]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i8 [[SHR]], 60
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], 8
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %shr = ashr exact i8 %x, 3
+  %cond = icmp ult i8 %shr, 60
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i8 %x, 8
+  ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_lshr(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_lshr(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHRX:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT:    [[SHRY:%.*]] = lshr exact i64 [[Y]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[X]], [[Y]]
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %shrx = ashr exact i64 %x, 3
+  %shry = lshr exact i64 %y, 3
+  %cond = icmp ult i64 %shrx, %shry
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i64 %x, %y
+  ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_ashr_mismatched_shamt(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_ashr_mismatched_shamt(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHRX:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT:    [[SHRY:%.*]] = ashr exact i64 [[Y]], 4
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[X]], [[Y]]
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %shrx = ashr exact i64 %x, 3
+  %shry = ashr exact i64 %y, 4
+  %cond = icmp ult i64 %shrx, %shry
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp eq i64 %x, %y
+  ret i1 %cmp
+}
+
+; LShr doesn't preserve the sign of the value, so we cannot perform
+; the transformation for signed comparisons.
+define i1 @precond_icmp_lshr_and_lshr_signed_pred(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_and_lshr_signed_pred(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SHRX:%.*]] = lshr exact i64 [[X]], 3
+; CHECK-NEXT:    [[SHRY:%.*]] = lshr exact i64 [[Y]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp slt i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i64 [[X]], [[Y]]
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %shrx = lshr exact i64 %x, 3
+  %shry = lshr exact i64 %y, 3
+  %cond = icmp slt i64 %shrx, %shry
+  call void @llvm.assume(i1 %cond)
+  %cmp = icmp slt i64 %x, %y
+  ret i1 %cmp
+}
+
+; Regression: After converting (ashr X, 3) u> 4 to X u> 32, we should still
+; be able to use the original condition to simplify icmp with ashr.
+define i1 @precond_icmp_ashr_and_constant_regression(i64 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_constant_regression(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[ASHR:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT:    [[COND:%.*]] = icmp ugt i64 [[ASHR]], 4
+; CHECK-NEXT:    call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i64 [[ASHR]], -1
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[ADD]], 0
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+entry:
+  %ashr = ashr exact i64 %x, 3
+  %cond = icmp ugt i64 %ashr, 4
+  call void @llvm.assume(i1 %cond)
+  %add = add nsw i64 %ashr, -1
+  %cmp = icmp eq i64 %add, 0
+  ret i1 %cmp
+}
