-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[SimplifyCFG] Fold the contiguous wrapping cases into ICmp. #161000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-coroutines @llvm/pr-subscribers-llvm-transforms Author: dianqk (dianqk) ChangesFixes #157113. Take the following IR as an example; we know the destination of the define i32 @<!-- -->src(i8 range(i8 0, 6) %arg) {
switch i8 %arg, label %else [
i8 0, label %if
i8 4, label %if
i8 5, label %if
]
if:
ret i32 0
else:
ret i32 1
} We can first try the non-wrapping range for both destinations, but I don't see how that would be any better. Proof: https://alive2.llvm.org/ce/z/acdWD4. Full diff: https://github.com/llvm/llvm-project/pull/161000.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 5e719c6c8cbb7..b02aed3f8f5ea 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -5761,15 +5761,49 @@ bool SimplifyCFGOpt::simplifyUnreachable(UnreachableInst *UI) {
return Changed;
}
-static bool casesAreContiguous(SmallVectorImpl<ConstantInt *> &Cases) {
+static bool casesAreContiguous(Value *Condition,
+ SmallVectorImpl<ConstantInt *> &Cases,
+ ConstantInt *&ContiguousCasesMin,
+ ConstantInt *&ContiguousCasesMax,
+ bool &IsWrapping) {
assert(Cases.size() >= 1);
array_pod_sort(Cases.begin(), Cases.end(), constantIntSortPredicate);
- for (size_t I = 1, E = Cases.size(); I != E; ++I) {
- if (Cases[I - 1]->getValue() != Cases[I]->getValue() + 1)
+ auto Min = Cases.back()->getValue();
+ auto Max = Cases.front()->getValue();
+ auto Offset = Max - Min;
+ auto ContiguousOffset = Cases.size() - 1;
+ if (Offset == ContiguousOffset) {
+ ContiguousCasesMin = Cases.back();
+ ContiguousCasesMax = Cases.front();
+ return true;
+ }
+ ConstantRange CR = computeConstantRange(Condition, /*ForSigned=*/false);
+ // If this is a wrapping contiguous range, that is, [Min, OtherMin] +
+ // [OtherMax, Max] (also [OtherMax, OtherMin]), [OtherMin+1, OtherMax-1] is a
+ // contiguous range for the other destination. N.B. If CR is not a full range,
+ // Max+1 is not equal to Min. It's not continuous in arithmetic.
+ if (Max == CR.getUnsignedMax() && Min == CR.getUnsignedMin()) {
+ assert(Cases.size() >= 2);
+ auto *It =
+ std::adjacent_find(Cases.begin(), Cases.end(), [](auto L, auto R) {
+ return L->getValue() != R->getValue() + 1;
+ });
+ if (It == Cases.end())
return false;
+ auto *OtherMax = *It;
+ auto *OtherMin = *(It + 1);
+ if ((Max - OtherMax->getValue()) + (OtherMin->getValue() - Min) ==
+ Cases.size() - 2) {
+ ContiguousCasesMin = cast<ConstantInt>(
+ ConstantInt::get(OtherMin->getType(), OtherMin->getValue() + 1));
+ ContiguousCasesMax = cast<ConstantInt>(
+ ConstantInt::get(OtherMax->getType(), OtherMax->getValue() - 1));
+ IsWrapping = true;
+ return true;
+ }
}
- return true;
+ return false;
}
static void createUnreachableSwitchDefault(SwitchInst *Switch,
@@ -5840,25 +5874,34 @@ bool SimplifyCFGOpt::turnSwitchRangeIntoICmp(SwitchInst *SI,
assert(!CasesA.empty() || HasDefault);
// Figure out if one of the sets of cases form a contiguous range.
- SmallVectorImpl<ConstantInt *> *ContiguousCases = nullptr;
+ ConstantInt *ContiguousCasesMin = nullptr;
+ ConstantInt *ContiguousCasesMax = nullptr;
BasicBlock *ContiguousDest = nullptr;
BasicBlock *OtherDest = nullptr;
- if (!CasesA.empty() && casesAreContiguous(CasesA)) {
- ContiguousCases = &CasesA;
+ bool IsWrapping = false;
+ if (!CasesA.empty() &&
+ casesAreContiguous(SI->getCondition(), CasesA, ContiguousCasesMin,
+ ContiguousCasesMax, IsWrapping)) {
ContiguousDest = DestA;
OtherDest = DestB;
- } else if (casesAreContiguous(CasesB)) {
- ContiguousCases = &CasesB;
+ } else if (casesAreContiguous(SI->getCondition(), CasesB, ContiguousCasesMin,
+ ContiguousCasesMax, IsWrapping)) {
ContiguousDest = DestB;
OtherDest = DestA;
} else
return false;
+ if (IsWrapping)
+ std::swap(ContiguousDest, OtherDest);
+
// Start building the compare and branch.
- Constant *Offset = ConstantExpr::getNeg(ContiguousCases->back());
- Constant *NumCases =
- ConstantInt::get(Offset->getType(), ContiguousCases->size());
+ auto ContiguousCasesSize =
+ (ContiguousCasesMax->getValue() - ContiguousCasesMin->getValue())
+ .getZExtValue() +
+ 1;
+ Constant *Offset = ConstantExpr::getNeg(ContiguousCasesMin);
+ Constant *NumCases = ConstantInt::get(Offset->getType(), ContiguousCasesSize);
Value *Sub = SI->getCondition();
if (!Offset->isNullValue())
@@ -5866,7 +5909,7 @@ bool SimplifyCFGOpt::turnSwitchRangeIntoICmp(SwitchInst *SI,
Value *Cmp;
// If NumCases overflowed, then all possible values jump to the successor.
- if (NumCases->isNullValue() && !ContiguousCases->empty())
+ if (NumCases->isNullValue() && ContiguousCasesSize != 0)
Cmp = ConstantInt::getTrue(SI->getContext());
else
Cmp = Builder.CreateICmpULT(Sub, NumCases, "switch");
@@ -5895,14 +5938,14 @@ bool SimplifyCFGOpt::turnSwitchRangeIntoICmp(SwitchInst *SI,
// Prune obsolete incoming values off the successors' PHI nodes.
for (auto BBI = ContiguousDest->begin(); isa<PHINode>(BBI); ++BBI) {
- unsigned PreviousEdges = ContiguousCases->size();
+ unsigned PreviousEdges = ContiguousCasesSize;
if (ContiguousDest == SI->getDefaultDest())
++PreviousEdges;
for (unsigned I = 0, E = PreviousEdges - 1; I != E; ++I)
cast<PHINode>(BBI)->removeIncomingValue(SI->getParent());
}
for (auto BBI = OtherDest->begin(); isa<PHINode>(BBI); ++BBI) {
- unsigned PreviousEdges = SI->getNumCases() - ContiguousCases->size();
+ unsigned PreviousEdges = SI->getNumCases() - ContiguousCasesSize;
if (OtherDest == SI->getDefaultDest())
++PreviousEdges;
for (unsigned I = 0, E = PreviousEdges - 1; I != E; ++I)
diff --git a/llvm/test/Transforms/SimplifyCFG/switch-range-to-icmp.ll b/llvm/test/Transforms/SimplifyCFG/switch-range-to-icmp.ll
index 8f2ae2d054f1e..c6904fe273f09 100644
--- a/llvm/test/Transforms/SimplifyCFG/switch-range-to-icmp.ll
+++ b/llvm/test/Transforms/SimplifyCFG/switch-range-to-icmp.ll
@@ -188,4 +188,97 @@ exit:
ret void
}
+define i32 @wrapping_known_range(i8 range(i8 0, 6) %arg) {
+; CHECK-LABEL: @wrapping_known_range(
+; CHECK-NEXT: [[ARG_OFF:%.*]] = add i8 [[ARG:%.*]], -1
+; CHECK-NEXT: [[SWITCH:%.*]] = icmp ult i8 [[ARG_OFF]], 3
+; CHECK-NEXT: br i1 [[SWITCH]], label [[ELSE:%.*]], label [[IF:%.*]]
+; CHECK: common.ret:
+; CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi i32 [ [[I0:%.*]], [[IF]] ], [ [[I1:%.*]], [[ELSE]] ]
+; CHECK-NEXT: ret i32 [[COMMON_RET_OP]]
+; CHECK: if:
+; CHECK-NEXT: [[I0]] = call i32 @f(i32 0)
+; CHECK-NEXT: br label [[COMMON_RET:%.*]]
+; CHECK: else:
+; CHECK-NEXT: [[I1]] = call i32 @f(i32 1)
+; CHECK-NEXT: br label [[COMMON_RET]]
+;
+ switch i8 %arg, label %else [
+ i8 0, label %if
+ i8 4, label %if
+ i8 5, label %if
+ ]
+
+if:
+ %i0 = call i32 @f(i32 0)
+ ret i32 %i0
+
+else:
+ %i1 = call i32 @f(i32 1)
+ ret i32 %i1
+}
+
+define i32 @wrapping_range(i8 %arg) {
+; CHECK-LABEL: @wrapping_range(
+; CHECK-NEXT: [[ARG_OFF:%.*]] = add i8 [[ARG:%.*]], -1
+; CHECK-NEXT: [[SWITCH:%.*]] = icmp ult i8 [[ARG_OFF]], -4
+; CHECK-NEXT: br i1 [[SWITCH]], label [[ELSE:%.*]], label [[IF:%.*]]
+; CHECK: common.ret:
+; CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi i32 [ [[I0:%.*]], [[IF]] ], [ [[I1:%.*]], [[ELSE]] ]
+; CHECK-NEXT: ret i32 [[COMMON_RET_OP]]
+; CHECK: if:
+; CHECK-NEXT: [[I0]] = call i32 @f(i32 0)
+; CHECK-NEXT: br label [[COMMON_RET:%.*]]
+; CHECK: else:
+; CHECK-NEXT: [[I1]] = call i32 @f(i32 1)
+; CHECK-NEXT: br label [[COMMON_RET]]
+;
+ switch i8 %arg, label %else [
+ i8 0, label %if
+ i8 -3, label %if
+ i8 -2, label %if
+ i8 -1, label %if
+ ]
+
+if:
+ %i0 = call i32 @f(i32 0)
+ ret i32 %i0
+
+else:
+ %i1 = call i32 @f(i32 1)
+ ret i32 %i1
+}
+
+define i32 @no_continuous_wrapping_range(i8 %arg) {
+; CHECK-LABEL: @no_continuous_wrapping_range(
+; CHECK-NEXT: switch i8 [[ARG:%.*]], label [[ELSE:%.*]] [
+; CHECK-NEXT: i8 0, label [[IF:%.*]]
+; CHECK-NEXT: i8 -3, label [[IF]]
+; CHECK-NEXT: i8 -1, label [[IF]]
+; CHECK-NEXT: ]
+; CHECK: common.ret:
+; CHECK-NEXT: [[COMMON_RET_OP:%.*]] = phi i32 [ [[I0:%.*]], [[IF]] ], [ [[I1:%.*]], [[ELSE]] ]
+; CHECK-NEXT: ret i32 [[COMMON_RET_OP]]
+; CHECK: if:
+; CHECK-NEXT: [[I0]] = call i32 @f(i32 0)
+; CHECK-NEXT: br label [[COMMON_RET:%.*]]
+; CHECK: else:
+; CHECK-NEXT: [[I1]] = call i32 @f(i32 1)
+; CHECK-NEXT: br label [[COMMON_RET]]
+;
+ switch i8 %arg, label %else [
+ i8 0, label %if
+ i8 -3, label %if
+ i8 -1, label %if
+ ]
+
+if:
+ %i0 = call i32 @f(i32 0)
+ ret i32 %i0
+
+else:
+ %i1 = call i32 @f(i32 1)
+ ret i32 %i1
+}
+
declare void @bar(ptr nonnull dereferenceable(4))
|
@zyw-bot csmith-quick-fuzz |
Maybe limit this to >2 cases to start to avoid regressions? |
bb9720c
to
9864e17
Compare
This is still needed, some 2 cases have not been transformed: https://llvm.godbolt.org/z/Y71EovY4M. I think it's fine because it runs later in the passes pipeline. |
I improved this with only one case also. |
// Max+1 is not equal to Min. It's not continuous in arithmetic. | ||
if (Max == CR.getUnsignedMax() && Min == CR.getUnsignedMin()) { | ||
assert(Cases.size() >= 2); | ||
auto *It = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something as follows before the logic here?
// Find the first non-consecutive pair, and ensure this pair
// happens to be unique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I meant this before the std::adjacent_find, which is not appearing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is what you want. std::adjacent_find
finds one non-consecutive pair and checks their distance to check if this pair is unique.
auto Min = Cases.back()->getValue(); | ||
auto Max = Cases.front()->getValue(); | ||
auto Offset = Max - Min; | ||
auto ContiguousOffset = Cases.size() - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for auto here, having the type explicit (const APInt&, unsigned) would make the code more readable.
auto *OtherMax = *It; | ||
auto *OtherMin = *(It + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto *OtherMax = *It; | |
auto *OtherMin = *(It + 1); | |
auto [OtherMax, OtherMin] = std::make_pair(*It, *std::next(It)); |
|
||
static bool casesAreContiguous(SmallVectorImpl<ConstantInt *> &Cases) { | ||
static bool casesAreContiguous(Value *Condition, | ||
SmallVectorImpl<ConstantInt *> &Cases, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the function name would deserve an update, e.g., findContiguousCases
(and maybe return a std::optional struct instead of the three values as out parameters)?
if (auto Result = findContiguousCases(SI->getCondition(), CasesA, CasesB, | ||
DestA, DestB)) | ||
ContiguousCases = *Result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (auto Result = findContiguousCases(SI->getCondition(), CasesA, CasesB, | |
DestA, DestB)) | |
ContiguousCases = *Result; | |
ContiguousCases = findContiguousCases(SI->getCondition(), CasesA, CasesB, | |
DestA, DestB); |
if (auto Result = findContiguousCases(SI->getCondition(), CasesB, CasesA, | ||
DestB, DestA)) | ||
ContiguousCases = *Result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (auto Result = findContiguousCases(SI->getCondition(), CasesB, CasesA, | |
DestB, DestA)) | |
ContiguousCases = *Result; | |
ContiguousCases = findContiguousCases(SI->getCondition(), CasesB, CasesA, | |
DestB, DestA); |
|
||
// Start building the compare and branch. | ||
// Correctness: Cases to the default destination cannot be contiguous cases. | ||
if (!ContiguousCases && !HasDefault && !CasesA.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!ContiguousCases && !HasDefault && !CasesA.empty()) | |
else if (!HasDefault) |
!HasDefault
implies !CasesA.empty()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. TBH the cases are unnecessary to be contiguous. There can be holes between them (unreachable cases).
@zyw-bot csmith-fuzz |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/15667 Here is the relevant piece of the build log for the reference
|
This patch implements the todo discussed in #158495 (comment). It also fixes a regression introduced by #161000. See also dtcxzyw/llvm-opt-benchmark#2890 (comment). IR diff: dtcxzyw/llvm-opt-benchmark#2892
This patch implements the todo discussed in llvm/llvm-project#158495 (comment). It also fixes a regression introduced by llvm/llvm-project#161000. See also dtcxzyw/llvm-opt-benchmark#2890 (comment). IR diff: dtcxzyw/llvm-opt-benchmark#2892
…1000) Fixes llvm#157113. Take the following IR as an example; we know the destination of the `[1, 3]` cases is `%else`. ```llvm define i32 @src(i8 range(i8 0, 6) %arg) { switch i8 %arg, label %else [ i8 0, label %if i8 4, label %if i8 5, label %if ] if: ret i32 0 else: ret i32 1 } ``` We can first try the non-wrapping range for both destinations, but I don't see how that would be any better. Proof: https://alive2.llvm.org/ce/z/acdWD4.
This patch implements the todo discussed in llvm#158495 (comment). It also fixes a regression introduced by llvm#161000. See also dtcxzyw/llvm-opt-benchmark#2890 (comment). IR diff: dtcxzyw/llvm-opt-benchmark#2892
I am seeing a "Broken module" error when building the Linux kernel with full LTO (it shows up for AArch64 /
I was able to find two
If there is any other information I can provide, I am more than happy to do so. |
Reproduced. I will provide a smaller IR reproducer. |
Fixes #157113.
Take the following IR as an example; we know the destination of the
[1, 3]
cases is%else
.We can first try the non-wrapping range for both destinations, but I don't see how that would be any better.
Proof: https://alive2.llvm.org/ce/z/acdWD4.