[InstCombine] Added optimisation for trunc (Pow2 >> x) to i1 #157030
@llvm/pr-subscribers-llvm-transforms
Author: None (kper)
Changes
Closes #156898
I have added two cases. The first one matches when the constant is exactly a power of 2. The second case was meant to address the general case mentioned in the linked issue. However, I did not really solve the general case. Here are a few examples which won't work with the two cases:
9: https://alive2.llvm.org/ce/z/S5FLJZ
56: https://alive2.llvm.org/ce/z/yn_ZNG
I wonder whether I should still implement the general case, since it increases the number of instructions?
cc @nikic @andjo403
Full diff: https://github.com/llvm/llvm-project/pull/157030.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index fdef49e310f81..a3e9969503f02 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -11,11 +11,13 @@
//===----------------------------------------------------------------------===//
#include "InstCombineInternal.h"
+#include "llvm/ADT/APInt.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/Value.h"
#include "llvm/Support/KnownBits.h"
#include "llvm/Transforms/InstCombine/InstCombiner.h"
#include <optional>
@@ -969,6 +971,27 @@ Instruction *InstCombinerImpl::visitTrunc(TruncInst &Trunc) {
     Changed = true;
   }
+  const APInt *C1;
+  Value *V1;
+  // trunc (lshr i8 C1, V1) to i1 -> icmp eq V1, log2(C1) iff C1 is power of 2
+  if (DestWidth == 1 &&
+      match(Src, m_OneUse(m_Shr(m_Power2(C1), m_Value(V1))))) {
+    // C1 has a single set bit, so its trailing-zero count is log2(C1).
+    Value *Right = ConstantInt::get(V1->getType(), C1->countr_zero());
+    Value *Icmp = Builder.CreateICmpEQ(V1, Right);
+    return replaceInstUsesWith(Trunc, Icmp);
+  }
+
+  // trunc (lshr i8 C1, V1) to i1 -> icmp ult V1, log2(C1 + 1) iff (C1 + 1) is
+  // power of 2
+  if (DestWidth == 1 && match(Src, m_OneUse(m_Shr(m_APInt(C1), m_Value(V1)))) &&
+      (*C1 + 1).isPowerOf2()) {
+    // (C1 + 1) has a single set bit, so its trailing-zero count is log2(C1 + 1).
+    Value *Right = ConstantInt::get(V1->getType(), (*C1 + 1).countr_zero());
+    Value *Icmp = Builder.CreateICmpULT(V1, Right);
+    return replaceInstUsesWith(Trunc, Icmp);
+  }
+
   return Changed ? &Trunc : nullptr;
 }
diff --git a/llvm/test/Transforms/InstCombine/trunc-lshr.ll b/llvm/test/Transforms/InstCombine/trunc-lshr.ll
index 4364b09cfa709..84daba3d13b9a 100644
--- a/llvm/test/Transforms/InstCombine/trunc-lshr.ll
+++ b/llvm/test/Transforms/InstCombine/trunc-lshr.ll
@@ -93,3 +93,24 @@ define i1 @test5(i32 %i, ptr %p) {
ret i1 %op
}
+define i1 @test6(i8 %x) {
+; CHECK-LABEL: define i1 @test6(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[TRUNC:%.*]] = icmp eq i8 [[X]], 2
+; CHECK-NEXT: ret i1 [[TRUNC]]
+;
+ %lshr = lshr i8 4, %x
+ %trunc = trunc i8 %lshr to i1
+ ret i1 %trunc
+}
+
+define i1 @test7(i8 %x) {
+; CHECK-LABEL: define i1 @test7(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[TRUNC:%.*]] = icmp ult i8 [[X]], 4
+; CHECK-NEXT: ret i1 [[TRUNC]]
+;
+ %lshr = lshr i8 15, %x
+ %trunc = trunc i8 %lshr to i1
+ ret i1 %trunc
+}
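The two folds exercised by `test6` and `test7` can be sanity-checked exhaustively for i8. The sketch below is a standalone model, not LLVM code: `truncLshrToI1`, `eqFoldHolds`, `ultFoldHolds`, and `checkFolds` are hypothetical helper names, and shift amounts of 8 or more (poison in LLVM IR) are deliberately not modeled. It assumes the right-hand constant of each compare is a bit index, that is log2(C) or log2(C + 1).

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc (lshr i8 C, X) to i1` for in-range shift amounts (X < 8).
// Shift amounts >= 8 yield poison in LLVM IR and are not modeled here.
bool truncLshrToI1(std::uint8_t C, unsigned X) {
  return ((C >> X) & 1u) != 0;
}

// True if `icmp eq X, Rhs` matches the original pattern for every X < 8.
bool eqFoldHolds(std::uint8_t C, unsigned Rhs) {
  for (unsigned X = 0; X < 8; ++X)
    if (truncLshrToI1(C, X) != (X == Rhs))
      return false;
  return true;
}

// True if `icmp ult X, Rhs` matches the original pattern for every X < 8.
bool ultFoldHolds(std::uint8_t C, unsigned Rhs) {
  for (unsigned X = 0; X < 8; ++X)
    if (truncLshrToI1(C, X) != (X < Rhs))
      return false;
  return true;
}

// Checks every i8 constant covered by the two cases.
void checkFolds() {
  for (unsigned K = 0; K < 8; ++K) // C = 2^K      ->  icmp eq X, K
    assert(eqFoldHolds(static_cast<std::uint8_t>(1u << K), K));
  for (unsigned K = 1; K < 8; ++K) // C = 2^K - 1  ->  icmp ult X, K
    assert(ultFoldHolds(static_cast<std::uint8_t>((1u << K) - 1u), K));
  // 8 is the integer square root of 64, not its log2 (6), so it must fail.
  assert(!eqFoldHolds(64, 8));
}
```

Here `eqFoldHolds(4, 2)` corresponds to `test6` and `ultFoldHolds(15, 4)` to `test7`. Those two constants happen to have equal square root and log2, so a constant such as 64 is a more discriminating check.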
Force-pushed from 3aadd9b to 766e5f7.
The folds also seem to hold for ashr; see https://alive2.llvm.org/ce/z/Dm5HEp
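The ashr observation can also be checked by brute force on i8, as a complement to the Alive2 proof. This is a minimal sketch under the same assumptions as the lshr case (only shift amounts below 8 are modeled); `ashr8`, `truncAshrToI1`, `ashrEqFoldHolds`, and `ashrUltFoldHolds` are hypothetical names, and the arithmetic shift is built by hand rather than relying on signed right shifts.

```cpp
#include <cassert>
#include <cstdint>

// Models `ashr i8 C, X` for X < 8 without relying on signed right shifts.
std::uint8_t ashr8(std::uint8_t C, unsigned X) {
  std::uint8_t Logical = static_cast<std::uint8_t>(C >> X);
  // If the sign bit is set, fill the X vacated high bits with ones.
  std::uint8_t Fill = (C & 0x80u)
                          ? static_cast<std::uint8_t>(0xFFu << (8 - X))
                          : std::uint8_t{0};
  return static_cast<std::uint8_t>(Logical | Fill);
}

// Models `trunc (ashr i8 C, X) to i1`.
bool truncAshrToI1(std::uint8_t C, unsigned X) {
  return (ashr8(C, X) & 1u) != 0;
}

// True if the ashr pattern equals `icmp eq X, Rhs` for every X < 8.
bool ashrEqFoldHolds(std::uint8_t C, unsigned Rhs) {
  for (unsigned X = 0; X < 8; ++X)
    if (truncAshrToI1(C, X) != (X == Rhs))
      return false;
  return true;
}

// True if the ashr pattern equals `icmp ult X, Rhs` for every X < 8.
bool ashrUltFoldHolds(std::uint8_t C, unsigned Rhs) {
  for (unsigned X = 0; X < 8; ++X)
    if (truncAshrToI1(C, X) != (X < Rhs))
      return false;
  return true;
}
```

The only power-of-2 constant where ashr and lshr differ on i8 is 0x80 (the sign bit); `ashrEqFoldHolds(0x80, 7)` covers that case.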
Force-pushed from 766e5f7 to c51f8fb.
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from 9a22aa8 to 96b964e.
andjo403 left a comment:
Looks good to me, but wait for another review.
@zyw-bot mfuzz
Force-pushed from 96b964e to 7c9465f.
Force-pushed from 7c9465f to 1f7e7d4.
dtcxzyw left a comment:
LGTM. Thanks.
As a follow-up, are you interested in folding `trunc (shr C=0b11111...0000, %x) to i1 -> icmp ugt %x, cttz(C) - 1` as well? Not sure if it is profitable in real-world programs.
Suggested change:
- Value *Right = ConstantInt::get(V1->getType(), (*C1 + 1).countr_zero());
+ Value *Right = ConstantInt::get(V1->getType(), C1->countr_one());
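The suggestion relies on the two expressions producing the same constant whenever `C1 + 1` is a power of 2. A small standalone check for i8 is sketched below; `countrZero8` and `countrOne8` are hypothetical stand-ins that mimic the 8-bit behaviour of `APInt::countr_zero` and `APInt::countr_one`, not the LLVM implementations.

```cpp
#include <cassert>
#include <cstdint>

// Number of trailing zero bits of an 8-bit value (8 if the value is zero).
unsigned countrZero8(std::uint8_t V) {
  unsigned N = 0;
  while (N < 8 && ((V >> N) & 1u) == 0)
    ++N;
  return N;
}

// Number of trailing one bits of an 8-bit value (8 if all bits are set).
unsigned countrOne8(std::uint8_t V) {
  unsigned N = 0;
  while (N < 8 && ((V >> N) & 1u) != 0)
    ++N;
  return N;
}

// True if countr_zero(C + 1) == countr_one(C) for every i8 low-bit mask
// C = 2^K - 1 with 1 <= K <= 7, i.e. every C where C + 1 is a power of 2.
bool suggestionIsEquivalent() {
  for (unsigned K = 1; K < 8; ++K) {
    std::uint8_t C = static_cast<std::uint8_t>((1u << K) - 1u);
    std::uint8_t CPlus1 = static_cast<std::uint8_t>(C + 1u);
    if (countrZero8(CPlus1) != countrOne8(C))
      return false;
  }
  return true;
}
```

Counting trailing ones directly avoids materializing the temporary `C1 + 1`, which is presumably the motivation for the suggested spelling.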
addressed
Transformation is done iff PowerOf2(C) || PowerOf2(C + 1)
@kper Please avoid force-pushing your branch if unnecessary.
Force-pushed from 1f7e7d4 to 50fa45b.
ah ok, sry
Yeah sure :)
…> x) to i1 (#157998)
Follow up of #157030
```
trunc ( lshr i8 C1, V1) to i1 -> icmp ugt V1, cttz(C1) - 1 iff (C1) is negative power of 2
trunc ( ashr i8 C1, V1) to i1 -> icmp ugt V1, cttz(C1) - 1 iff (C1) is negative power of 2
```
General proof:
lshr: https://alive2.llvm.org/ce/z/vVfaJc
ashr: https://alive2.llvm.org/ce/z/8aAcgD
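The follow-up fold for negated powers of 2 can be checked on i8 in the same brute-force style. This is a sketch with hypothetical helper names (`bitAfterLshr`, `bitAfterAshr`, `negatedPowerOfTwoFoldHolds`); it assumes shift amounts below 8 and only covers constants with at least one trailing zero, so that `cttz(C) - 1` does not wrap.

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc (lshr i8 C, X) to i1` for X < 8.
bool bitAfterLshr(std::uint8_t C, unsigned X) {
  return ((C >> X) & 1u) != 0;
}

// Models `trunc (ashr i8 C, X) to i1` for X < 8 by sign-extending C to
// 16 bits first, so the vacated high bits copy the sign bit.
bool bitAfterAshr(std::uint8_t C, unsigned X) {
  unsigned Wide = (C & 0x80u) ? (0xFF00u | C) : C;
  return ((Wide >> X) & 1u) != 0;
}

// True if `icmp ugt X, cttz(C) - 1` matches both shift patterns for every
// i8 constant of the form 0b1...10...0 with K >= 1 trailing zeros.
bool negatedPowerOfTwoFoldHolds() {
  for (unsigned K = 1; K < 8; ++K) {
    std::uint8_t C = static_cast<std::uint8_t>(0xFFu << K); // cttz(C) == K
    for (unsigned X = 0; X < 8; ++X) {
      bool Folded = X > K - 1;
      if (bitAfterLshr(C, X) != Folded || bitAfterAshr(C, X) != Folded)
        return false;
    }
  }
  return true;
}
```

For example, with C = 0xF0 the low bit of the shifted value is set exactly when the shift amount is greater than 3.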
Closes #156898
I have added two cases. The first one matches when the constant is exactly a power of 2. The second case was meant to address the general case mentioned in the linked issue. However, I did not really solve the general case.
We can only emit an `icmp ult` if all the low bits are one (the constant is a low-bit mask), and that's only the case when the constant + 1 is a power of 2. Otherwise, we need to create an `icmp eq` for every bit that is one.
Here are a few examples which won't work with the two cases:
9: https://alive2.llvm.org/ce/z/S5FLJZ
56: https://alive2.llvm.org/ce/z/yn_ZNG
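To illustrate why the general case costs more instructions, the sketch below lists the shift amounts for which the truncated result is true, which are exactly the set bit positions of the constant. The helper names `shiftAmountsYieldingTrue` and `orOfEqCompares` are hypothetical, and only shift amounts below 8 are modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Shift amounts X < 8 for which `trunc (lshr i8 C, X) to i1` is true.
// These are the positions of the set bits of C.
std::vector<unsigned> shiftAmountsYieldingTrue(std::uint8_t C) {
  std::vector<unsigned> Amounts;
  for (unsigned X = 0; X < 8; ++X)
    if ((C >> X) & 1u)
      Amounts.push_back(X);
  return Amounts;
}

// Models the general-case rewrite: an `or` of one `icmp eq X, Bit` per
// set bit of C.
bool orOfEqCompares(std::uint8_t C, unsigned X) {
  bool Result = false;
  for (unsigned Bit : shiftAmountsYieldingTrue(C))
    Result = Result || (X == Bit);
  return Result;
}
```

For 9 (0b1001) this yields the amounts 0 and 3, and for 56 (0b111000) the amounts 3, 4, and 5, so a rewrite in this form needs two and three compares respectively, plus the instructions that combine them.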