[ConstraintElim] Eliminate exact right shifts in getConstraint #169343
@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

In ConstraintElimination, `shr exact` cannot be decomposed, since we only support integer factors. However, when both sides of a comparison have at least N trailing zeros, we can create an equivalent comparison with shifted operands. This allows us to eliminate `shr exact` and provide a more expressive constraint. This patch handles the following cases as a start:

Alive2:
lshr: https://alive2.llvm.org/ce/z/Uf5u-y
ashr: https://alive2.llvm.org/ce/z/W-ESk5

We may further extend this fold to handle divisions/non-constant shift amounts in the future.

The motivating case is std::vector. Both libstdc++'s and libc++'s implementations use three pointers to represent a vector and compute size and capacity as pointer differences. Since the pointers have multiple uses in real-world programs, InstCombine cannot eliminate the above two patterns (see a66051c and #122266). In ConstraintElimination, due to the division (or right shift), it fails to infer the relationship between the pointers.

Related issue: #107000.

With this patch and nsw flags on pointer differences (see also #161698 (comment)), the reallocation logic can be fully eliminated.

llvm-opt-benchmark: dtcxzyw/llvm-opt-benchmark#3087

Full diff: https://github.com/llvm/llvm-project/pull/169343.diff

2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
index d347cedb42988..455d1f79d317b 100644
--- a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
@@ -718,6 +718,40 @@ ConstraintInfo::getConstraint(CmpInst::Predicate Pred, Value *Op0, Value *Op1,
Pred != CmpInst::ICMP_SLE && Pred != CmpInst::ICMP_SLT)
return {};
+ // Check if we can eliminate right shifts.
+ auto ShiftOperands = [Pred](Value *&Op0, Value *&Op1) {
+ uint64_t ShAmtC;
+ Value *X;
+ if (!match(Op0, m_Exact(m_Shr(m_Value(X), m_ConstantInt(ShAmtC)))))
+ return false;
+ bool IsLShr = cast<BinaryOperator>(Op0)->getOpcode() == Instruction::LShr;
+ if (IsLShr && ICmpInst::isSigned(Pred))
+ return false;
+ // icmp pred (shr exact X, ShAmtC), (shr exact Y, ShAmtC) --> icmp pred X, Y
+ Value *Y;
+ if (match(Op1, m_Exact(m_Shr(m_Value(Y), m_SpecificInt(ShAmtC)))) &&
+ cast<BinaryOperator>(Op0)->getOpcode() ==
+ cast<BinaryOperator>(Op1)->getOpcode()) {
+ Op0 = X;
+ Op1 = Y;
+ return true;
+ }
+ // icmp pred (shr exact X, ShAmtC), C --> icmp pred X, (C << ShAmtC)
+ const APInt *RHSC;
+ if (!match(Op1, m_APInt(RHSC)))
+ return false;
+ bool Overflow = false;
+ APInt NewC = IsLShr ? RHSC->ushl_ov(ShAmtC, Overflow)
+ : RHSC->sshl_ov(ShAmtC, Overflow);
+ if (Overflow)
+ return false;
+ Op0 = X;
+ Op1 = ConstantInt::get(Op1->getType(), NewC);
+ return true;
+ };
+ if (!ShiftOperands(Op0, Op1))
+ ShiftOperands(Op1, Op0);
+
SmallVector<ConditionTy, 4> Preconditions;
bool IsSigned = ForceSignedSystem || CmpInst::isSigned(Pred);
auto &Value2Index = getValue2Index(IsSigned);
diff --git a/llvm/test/Transforms/ConstraintElimination/shr-exact.ll b/llvm/test/Transforms/ConstraintElimination/shr-exact.ll
new file mode 100644
index 0000000000000..3a5a900d060f7
--- /dev/null
+++ b/llvm/test/Transforms/ConstraintElimination/shr-exact.ll
@@ -0,0 +1,198 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -passes=constraint-elimination -S %s | FileCheck %s
+
+define i1 @precond_icmp_ashr_and_rhsc(i64 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_rhsc(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHR:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i64 [[SHR]], 200
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: ret i1 false
+;
+entry:
+ %shr = ashr exact i64 %x, 3
+ %cond = icmp ult i64 %shr, 200
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i64 %x, 9223372036854775800
+ ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_ashr(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_ashr(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHRX:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT: [[SHRY:%.*]] = ashr exact i64 [[Y]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: ret i1 false
+;
+entry:
+ %shrx = ashr exact i64 %x, 3
+ %shry = ashr exact i64 %y, 3
+ %cond = icmp ult i64 %shrx, %shry
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i64 %x, %y
+ ret i1 %cmp
+}
+
+define i1 @precond_icmp_lshr_and_lshr(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_and_lshr(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHRX:%.*]] = lshr exact i64 [[X]], 3
+; CHECK-NEXT: [[SHRY:%.*]] = lshr exact i64 [[Y]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: ret i1 false
+;
+entry:
+ %shrx = lshr exact i64 %x, 3
+ %shry = lshr exact i64 %y, 3
+ %cond = icmp ult i64 %shrx, %shry
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i64 %x, %y
+ ret i1 %cmp
+}
+
+; Negative tests
+
+define i1 @precond_icmp_lshr_and_rhsc_overflow(i8 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_and_rhsc_overflow(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHR:%.*]] = lshr exact i8 [[X]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i8 [[SHR]], 60
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[X]], 8
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %shr = lshr exact i8 %x, 3
+ %cond = icmp ult i8 %shr, 60
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i8 %x, 8
+ ret i1 %cmp
+}
+
+
+define i1 @precond_icmp_lshr_unknown_shamt_and_rhsc(i8 %x, i8 %shamt) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_unknown_shamt_and_rhsc(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[SHAMT:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHR:%.*]] = lshr exact i8 [[X]], [[SHAMT]]
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i8 [[SHR]], 8
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[X]], 8
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %shr = lshr exact i8 %x, %shamt
+ %cond = icmp ult i8 %shr, 8
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i8 %x, 8
+ ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_rhsc_overflow(i8 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_rhsc_overflow(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHR:%.*]] = ashr exact i8 [[X]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i8 [[SHR]], 60
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[X]], 8
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %shr = ashr exact i8 %x, 3
+ %cond = icmp ult i8 %shr, 60
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i8 %x, 8
+ ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_lshr(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_lshr(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHRX:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT: [[SHRY:%.*]] = lshr exact i64 [[Y]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[X]], [[Y]]
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %shrx = ashr exact i64 %x, 3
+ %shry = lshr exact i64 %y, 3
+ %cond = icmp ult i64 %shrx, %shry
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i64 %x, %y
+ ret i1 %cmp
+}
+
+define i1 @precond_icmp_ashr_and_ashr_mismatched_shamt(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_ashr_mismatched_shamt(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHRX:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT: [[SHRY:%.*]] = ashr exact i64 [[Y]], 4
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[X]], [[Y]]
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %shrx = ashr exact i64 %x, 3
+ %shry = ashr exact i64 %y, 4
+ %cond = icmp ult i64 %shrx, %shry
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp eq i64 %x, %y
+ ret i1 %cmp
+}
+
+; LShr doesn't preserve the sign of the value, so we cannot perform
+; the transformation for signed comparisons.
+define i1 @precond_icmp_lshr_and_lshr_signed_pred(i64 %x, i64 %y) {
+; CHECK-LABEL: define i1 @precond_icmp_lshr_and_lshr_signed_pred(
+; CHECK-SAME: i64 [[X:%.*]], i64 [[Y:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHRX:%.*]] = lshr exact i64 [[X]], 3
+; CHECK-NEXT: [[SHRY:%.*]] = lshr exact i64 [[Y]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp slt i64 [[SHRX]], [[SHRY]]
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[CMP:%.*]] = icmp slt i64 [[X]], [[Y]]
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %shrx = lshr exact i64 %x, 3
+ %shry = lshr exact i64 %y, 3
+ %cond = icmp slt i64 %shrx, %shry
+ call void @llvm.assume(i1 %cond)
+ %cmp = icmp slt i64 %x, %y
+ ret i1 %cmp
+}
+
+; Regression: After converting (ashr X, 3) u> 4 to X u> 32, we should still
+; be able to use the original condition to simplify icmp with ashr.
+define i1 @precond_icmp_ashr_and_constant_regression(i64 %x) {
+; CHECK-LABEL: define i1 @precond_icmp_ashr_and_constant_regression(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[ASHR:%.*]] = ashr exact i64 [[X]], 3
+; CHECK-NEXT: [[COND:%.*]] = icmp ugt i64 [[ASHR]], 4
+; CHECK-NEXT: call void @llvm.assume(i1 [[COND]])
+; CHECK-NEXT: [[ADD:%.*]] = add nsw i64 [[ASHR]], -1
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[ADD]], 0
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+entry:
+ %ashr = ashr exact i64 %x, 3
+ %cond = icmp ugt i64 %ashr, 4
+ call void @llvm.assume(i1 %cond)
+ %add = add nsw i64 %ashr, -1
+ %cmp = icmp eq i64 %add, 0
+ ret i1 %cmp
+}
Compile-time impact looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=045af21501bee71d36a37e1abcefba1cf28842b9&to=ae3d759820d2051329e6e6a8eec2250a7158fe2f&stat=instructions:u
NOTE: Regressions are caused by losing the range information of the original ashr. Should I put this logic in addFact?