[AArch64] Add ComputeNumSignBits for VASHR. #113957
Conversation
As with a normal ISD::SRA node, VASHR nodes take the number of sign bits of the incoming value and increase it by the shift amount.
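To illustrate the rule outside SelectionDAG, here is a minimal standalone sketch; `numSignBits16` and `signBitsAfterAshr` are hypothetical helper names for this example, not LLVM APIs:

```cpp
#include <algorithm>
#include <cstdint>

// Count the leading bits of a 16-bit lane that are copies of the sign bit
// (the sign bit itself always counts, so the result is in [1, 16]).
unsigned numSignBits16(int16_t x) {
  uint16_t Bits = static_cast<uint16_t>(x);
  unsigned SignBit = (Bits >> 15) & 1;
  unsigned N = 1;
  while (N < 16 && ((Bits >> (15 - N)) & 1) == SignBit)
    ++N;
  return N;
}

// The rule this patch applies to AArch64ISD::VASHR: an arithmetic right
// shift by a constant C adds C known sign bits, clamped at the bit width.
unsigned signBitsAfterAshr(unsigned InputSignBits, unsigned ShiftAmt,
                           unsigned VTBits) {
  return std::min<unsigned>(InputSignBits + ShiftAmt, VTBits);
}
```

For example, `numSignBits16(1)` is 15, and `signBitsAfterAshr(15, 4, 16)` is 16: after the shift, every bit of the lane is a copy of the sign bit.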
@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Full diff: https://github.com/llvm/llvm-project/pull/113957.diff

3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 32ba2866ac8180..31a720ed7b5c77 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2536,6 +2536,11 @@ unsigned AArch64TargetLowering::ComputeNumSignBitsForTargetNode(
case AArch64ISD::FCMLTz:
// Compares return either 0 or all-ones
return VTBits;
+ case AArch64ISD::VASHR: {
+ unsigned Tmp =
+ DAG.ComputeNumSignBits(Op.getOperand(0), DemandedElts, Depth + 1);
+ return std::min<uint64_t>(Tmp + Op.getConstantOperandVal(1), VTBits);
+ }
}
return 1;
diff --git a/llvm/test/CodeGen/AArch64/arm64-vshift.ll b/llvm/test/CodeGen/AArch64/arm64-vshift.ll
index 1dfd977186b0e7..7af7c235f9ac16 100644
--- a/llvm/test/CodeGen/AArch64/arm64-vshift.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-vshift.ll
@@ -3560,4 +3560,16 @@ entry:
ret <4 x i16> %vrshrn_n1
}
+define <8 x i16> @signbits_vashr(<8 x i16> %a) {
+; CHECK-LABEL: signbits_vashr:
+; CHECK: // %bb.0:
+; CHECK-NEXT: sshr.8h v0, v0, #8
+; CHECK-NEXT: sshr.8h v0, v0, #9
+; CHECK-NEXT: ret
+ %b = call <8 x i16> @llvm.aarch64.neon.sshl.v8i16(<8 x i16> %a, <8 x i16> <i16 -8, i16 -8, i16 -8, i16 -8, i16 -8, i16 -8, i16 -8, i16 -8>)
+ %c = call <8 x i16> @llvm.aarch64.neon.sshl.v8i16(<8 x i16> %b, <8 x i16> <i16 -9, i16 -9, i16 -9, i16 -9, i16 -9, i16 -9, i16 -9, i16 -9>)
+ %d = ashr <8 x i16> %c, <i16 7, i16 7, i16 7, i16 7, i16 7, i16 7, i16 7, i16 7>
+ ret <8 x i16> %d
+}
+
declare <2 x i64> @llvm.aarch64.neon.addp.v2i64(<2 x i64>, <2 x i64>)
diff --git a/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp b/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
index 3df72ec8115b6a..ffedb2c74220f0 100644
--- a/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
+++ b/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
@@ -6,6 +6,7 @@
//
//===----------------------------------------------------------------------===//
+#include "../lib/Target/AArch64/AArch64ISelLowering.h"
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/AsmParser/Parser.h"
@@ -167,6 +168,18 @@ TEST_F(AArch64SelectionDAGTest, ComputeNumSignBits_EXTRACT_SUBVECTOR) {
EXPECT_EQ(DAG->ComputeNumSignBits(Op, DemandedElts), 7u);
}
+TEST_F(AArch64SelectionDAGTest, ComputeNumSignBits_VASHR) {
+ SDLoc Loc;
+ auto VecVT = MVT::v8i8;
+ auto Shift = DAG->getConstant(4, Loc, MVT::i32);
+ auto Vec0 = DAG->getConstant(1, Loc, VecVT);
+ auto Op1 = DAG->getNode(AArch64ISD::VASHR, Loc, VecVT, Vec0, Shift);
+ EXPECT_EQ(DAG->ComputeNumSignBits(Op1), 8u);
+ auto VecA = DAG->getConstant(0xaa, Loc, VecVT);
+ auto Op2 = DAG->getNode(AArch64ISD::VASHR, Loc, VecVT, VecA, Shift);
+ EXPECT_EQ(DAG->ComputeNumSignBits(Op2), 5u);
+}
+
TEST_F(AArch64SelectionDAGTest, SimplifyDemandedVectorElts_EXTRACT_SUBVECTOR) {
TargetLowering TL(*TM);
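For reference, the expected values in the new unit test follow from the same min formula (my arithmetic, not part of the PR text): a splatted i8 constant 1 has 7 sign bits, and min(7 + 4, 8) = 8; a splatted 0xaa has only the sign bit itself, and min(1 + 4, 8) = 5.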
; CHECK-LABEL: signbits_vashr:
; CHECK: // %bb.0:
; CHECK-NEXT: sshr.8h v0, v0, #8
; CHECK-NEXT: sshr.8h v0, v0, #9
Maybe you can quickly help me with this: I was expecting 3 shifts here. I guess that's the optimisation, but I don't see it.
Yep, without the patch it would be 3 shifts. With it we realize that if you shift by a total of more than the bit width (or the number of known sign bits reaches the bit width), then the value is all-zeros or all-ones and shifting further doesn't alter the result.
I was trying to come up with a test - this one is a little artificial. This should help with some improvements to smull I was trying to make.
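Concretely, a worked trace of the `signbits_vashr` test under that rule (reusing the hypothetical `signBitsAfterAshr` sketch from above):

```cpp
// Each <8 x i16> lane starts with at least 1 known sign bit.
unsigned Bits = 1;
Bits = signBitsAfterAshr(Bits, 8, 16); // sshr #8: min(1 + 8, 16) = 9
Bits = signBitsAfterAshr(Bits, 9, 16); // sshr #9: min(9 + 9, 16) = 16
// All 16 bits are now known sign bits, so each lane is all-zeros or
// all-ones; the trailing `ashr <8 x i16> %c, 7` can't change that and
// is folded away, leaving two sshr instructions instead of three.
```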
Ah, of course, that's it, cheers.
sjoerdmeijer left a comment:
LGTM