[WebAssembly] Optimize away mask of 63 for shl ( zext (and i32 63))) #152397
@llvm/pr-subscribers-backend-webassembly
Author: Jasmine Tang (badumbatish)
Changes
Fixes #71844
Full diff: https://github.com/llvm/llvm-project/pull/152397.diff
2 Files Affected:
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index 3f80b2ab2bd6d..325b01eccf67d 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -216,6 +216,7 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
setTargetDAGCombine(ISD::TRUNCATE);
+ setTargetDAGCombine(ISD::SHL);
// Support saturating add/sub for i8x16 and i16x8
for (auto Op : {ISD::SADDSAT, ISD::UADDSAT, ISD::SSUBSAT, ISD::USUBSAT})
for (auto T : {MVT::v16i8, MVT::v8i16})
@@ -3562,6 +3563,21 @@ static SDValue performMulCombine(SDNode *N,
{0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30});
}
+static SDValue performSHLCombine(SDNode *N, SelectionDAG &DAG) {
+ assert(N->getOpcode() == ISD::SHL);
+ if (N->getValueType(0) != MVT::i64)
+ return SDValue();
+ using namespace llvm::SDPatternMatch;
+ SDValue A, B;
+ APInt I;
+ if (sd_match(N,
+ m_Shl(m_Value(A), m_ZExt(m_And(m_Value(B), m_ConstInt(I)))))) {
+ if (I.getSExtValue() == 63)
+ return DAG.getNode(ISD::SHL, SDLoc(N), MVT::i64, {A, B});
+ }
+ return SDValue();
+}
+
SDValue
WebAssemblyTargetLowering::PerformDAGCombine(SDNode *N,
DAGCombinerInfo &DCI) const {
@@ -3597,5 +3613,7 @@ WebAssemblyTargetLowering::PerformDAGCombine(SDNode *N,
}
case ISD::MUL:
return performMulCombine(N, DCI);
+ case ISD::SHL:
+ return performSHLCombine(N, DCI.DAG);
}
}
diff --git a/llvm/test/CodeGen/WebAssembly/masked-shifts.ll b/llvm/test/CodeGen/WebAssembly/masked-shifts.ll
index 5bcb023e546b5..45c79df5f3f2b 100644
--- a/llvm/test/CodeGen/WebAssembly/masked-shifts.ll
+++ b/llvm/test/CodeGen/WebAssembly/masked-shifts.ll
@@ -18,6 +18,21 @@ define i32 @shl_i32(i32 %v, i32 %x) {
ret i32 %a
}
+define i64 @shl_i64_i32(i64 %v, i32 %x) {
+; CHECK-LABEL: shl_i64_i32:
+; CHECK: .functype shl_i64_i32 (i64, i32) -> (i64)
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: local.get 1
+; CHECK-NEXT: i64.extend_i32_u
+; CHECK-NEXT: i64.shl
+; CHECK-NEXT: # fallthrough-return
+ %m = and i32 %x, 63
+ %z = zext i32 %m to i64
+ %a = shl i64 %v, %z
+ ret i64 %a
+}
+
define i32 @sra_i32(i32 %v, i32 %x) {
; CHECK-LABEL: sra_i32:
; CHECK: .functype sra_i32 (i32, i32) -> (i32)
This can't be done in DAGCombine. If the shift amount for an ISD::SHL is larger than 63, the result is considered poison. The mask has to stay to keep the shift amount valid. Other targets that remove the mask do it during isel. RISC-V, for example, uses a complex pattern that calls
I can move this to tablegen, but I don't quite understand this point -
If the input to the AND is larger than 63, the mask will clear the upper bits to turn it into a value that is between 0 and 63. If you remove the mask without proving the upper bits are 0, then you might allow a value larger than 63 through to the shift.
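To make the point above concrete, here is a minimal C++ sketch. It is only a model, not LLVM code: `ShiftResult`, `genericShl64`, `shlWithMask` and `shlWithoutMask` are hypothetical names, and poison is represented by a plain flag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the result of a generic 64-bit shl node.
// `poison` is set when the shift amount is out of range.
struct ShiftResult {
  bool poison;
  uint64_t value;
};

// Models the generic node: a shift amount of 64 or more yields poison.
ShiftResult genericShl64(uint64_t value, uint64_t amount) {
  if (amount >= 64)
    return {true, 0};
  return {false, value << amount};
}

// The pattern from the test: shl i64 %v, (zext (and i32 %x, 63)).
// The mask keeps the amount in [0, 63], so the result is never poison.
ShiftResult shlWithMask(uint64_t value, uint32_t x) {
  return genericShl64(value, static_cast<uint64_t>(x & 63u));
}

// The same shift with the mask dropped at the generic level.
// Any x >= 64 now reaches the shift unchanged and produces poison.
ShiftResult shlWithoutMask(uint64_t value, uint32_t x) {
  return genericShl64(value, static_cast<uint64_t>(x));
}
```

With `x = 64`, `shlWithMask(1, 64)` shifts by 0 and returns a defined value, while `shlWithoutMask(1, 64)` is poison. For any `x` below 64 the two agree, which is why the mask can only be dropped once the upper bits are known to be 0, or at a stage where the out-of-range case is no longer poison.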
Ohh I see, I'll redo the PR.
    Ret = DAG.getNOT(DL, Ret, MVT::i1);
    return DAG.getZExtOrTrunc(Ret, DL, N->getValueType(0));
  };
Not sure what changed in the clang formatting tools, but I ran the following command and this popped up:
clang/tools/clang-format/git-clang-format --binary ./build/bin/clang-format HEAD~1
This reverts commit 40e9092 and replaces it with a simple constant 63 mask
lukel97
left a comment
LGTM, thanks! If you're looking for a follow up PR, I think the i64 sra/srl patterns can get the same optimisation too right?
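As a rough illustration of why the right shifts are candidates too, the sketch below models the three 64-bit WebAssembly shift instructions in C++. It assumes the instructions interpret the shift count modulo 64, as the WebAssembly specification describes; the function names are hypothetical and this is not backend code.

```cpp
#include <cassert>
#include <cstdint>

// Model of i64.shl: the count is taken modulo 64 by the instruction itself.
uint64_t wasmI64Shl(uint64_t v, uint64_t k) { return v << (k % 64); }

// Model of i64.shr_u (logical right shift), count modulo 64.
uint64_t wasmI64ShrU(uint64_t v, uint64_t k) { return v >> (k % 64); }

// Model of i64.shr_s (arithmetic right shift), count modulo 64.
// The sign fill is done by hand so the model does not rely on how the
// host compiler shifts negative values.
int64_t wasmI64ShrS(int64_t v, uint64_t k) {
  uint64_t s = k % 64;
  uint64_t u = static_cast<uint64_t>(v) >> s;
  if (v < 0 && s != 0)
    u |= ~uint64_t{0} << (64 - s); // replicate the sign bit into the top s bits
  return static_cast<int64_t>(u);
}
```

In this model an explicit `& 63` on the count changes nothing: `wasmI64ShrU(v, x & 63)` equals `wasmI64ShrU(v, x)` for every `x`, and likewise for the other two. That equivalence holds at the instruction level only, which is why the mask is dropped during instruction selection rather than on the generic node.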
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/34089
Fixes #71844