Conversation

@davemgreen
Collaborator

As with a normal ISD::SRA node, AArch64ISD::VASHR nodes take the number of sign bits of the incoming value and increase it by the shift amount.
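
For illustration only, here is a minimal scalar sketch of that counting rule (plain standalone C++, not the LLVM API; the helper numSignBits16 is made up for this example): an arithmetic shift right by C turns a value with N sign bits into one with min(N + C, bitwidth) sign bits.

#include <algorithm>
#include <cassert>
#include <cstdint>

// Leading bits equal to the sign bit, counting the sign bit itself
// (the same quantity ComputeNumSignBits tracks, here for an i16).
static unsigned numSignBits16(int16_t V) {
  unsigned N = 1;
  for (int Bit = 14; Bit >= 0 && ((V >> Bit) & 1) == ((V >> 15) & 1); --Bit)
    ++N;
  return N;
}

int main() {
  int16_t V = 0x00F0; // 8 leading zero (sign) bits
  unsigned C = 5;
  // The rule the patch encodes: shifting right arithmetically by C adds
  // C sign bits, clamped to the bit width.
  unsigned Predicted = std::min(numSignBits16(V) + C, 16u);
  assert(numSignBits16((int16_t)(V >> C)) == Predicted); // 13
  return 0;
}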

@llvmbot
Member

llvmbot commented Oct 28, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

As with a normal ISD::SRA node, AArch64ISD::VASHR nodes take the number of sign bits of the incoming value and increase it by the shift amount.


Full diff: https://github.com/llvm/llvm-project/pull/113957.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+5)
  • (modified) llvm/test/CodeGen/AArch64/arm64-vshift.ll (+12)
  • (modified) llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp (+13)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 32ba2866ac8180..31a720ed7b5c77 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2536,6 +2536,11 @@ unsigned AArch64TargetLowering::ComputeNumSignBitsForTargetNode(
     case AArch64ISD::FCMLTz:
       // Compares return either 0 or all-ones
       return VTBits;
+    case AArch64ISD::VASHR: {
+      unsigned Tmp =
+          DAG.ComputeNumSignBits(Op.getOperand(0), DemandedElts, Depth + 1);
+      return std::min<uint64_t>(Tmp + Op.getConstantOperandVal(1), VTBits);
+    }
   }
 
   return 1;
diff --git a/llvm/test/CodeGen/AArch64/arm64-vshift.ll b/llvm/test/CodeGen/AArch64/arm64-vshift.ll
index 1dfd977186b0e7..7af7c235f9ac16 100644
--- a/llvm/test/CodeGen/AArch64/arm64-vshift.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-vshift.ll
@@ -3560,4 +3560,16 @@ entry:
   ret <4 x i16> %vrshrn_n1
 }
 
+define <8 x i16> @signbits_vashr(<8 x i16> %a)  {
+; CHECK-LABEL: signbits_vashr:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    sshr.8h v0, v0, #8
+; CHECK-NEXT:    sshr.8h v0, v0, #9
+; CHECK-NEXT:    ret
+  %b = call <8 x i16> @llvm.aarch64.neon.sshl.v8i16(<8 x i16> %a, <8 x i16> <i16 -8, i16 -8, i16 -8, i16 -8, i16 -8, i16 -8, i16 -8, i16 -8>)
+  %c = call <8 x i16> @llvm.aarch64.neon.sshl.v8i16(<8 x i16> %b, <8 x i16> <i16 -9, i16 -9, i16 -9, i16 -9, i16 -9, i16 -9, i16 -9, i16 -9>)
+  %d = ashr <8 x i16> %c, <i16 7, i16 7, i16 7, i16 7, i16 7, i16 7, i16 7, i16 7>
+  ret <8 x i16> %d
+}
+
 declare <2 x i64> @llvm.aarch64.neon.addp.v2i64(<2 x i64>, <2 x i64>)
diff --git a/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp b/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
index 3df72ec8115b6a..ffedb2c74220f0 100644
--- a/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
+++ b/llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
@@ -6,6 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "../lib/Target/AArch64/AArch64ISelLowering.h"
 #include "llvm/Analysis/MemoryLocation.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
 #include "llvm/AsmParser/Parser.h"
@@ -167,6 +168,18 @@ TEST_F(AArch64SelectionDAGTest, ComputeNumSignBits_EXTRACT_SUBVECTOR) {
   EXPECT_EQ(DAG->ComputeNumSignBits(Op, DemandedElts), 7u);
 }
 
+TEST_F(AArch64SelectionDAGTest, ComputeNumSignBits_VASHR) {
+  SDLoc Loc;
+  auto VecVT = MVT::v8i8;
+  auto Shift = DAG->getConstant(4, Loc, MVT::i32);
+  auto Vec0 = DAG->getConstant(1, Loc, VecVT);
+  auto Op1 = DAG->getNode(AArch64ISD::VASHR, Loc, VecVT, Vec0, Shift);
+  EXPECT_EQ(DAG->ComputeNumSignBits(Op1), 8u);
+  auto VecA = DAG->getConstant(0xaa, Loc, VecVT);
+  auto Op2 = DAG->getNode(AArch64ISD::VASHR, Loc, VecVT, VecA, Shift);
+  EXPECT_EQ(DAG->ComputeNumSignBits(Op2), 5u);
+}
+
 TEST_F(AArch64SelectionDAGTest, SimplifyDemandedVectorElts_EXTRACT_SUBVECTOR) {
   TargetLowering TL(*TM);
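
As a quick cross-check of the unit-test expectations above: a splat of 1 in v8i8 has 7 sign bits per i8 lane (0b00000001), so a VASHR by 4 gives min(7 + 4, 8) = 8; a splat of 0xaa (0b10101010) has a single sign bit, so the same shift gives min(1 + 4, 8) = 5.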
 

; CHECK-LABEL: signbits_vashr:
; CHECK: // %bb.0:
; CHECK-NEXT: sshr.8h v0, v0, #8
; CHECK-NEXT: sshr.8h v0, v0, #9
Collaborator

Maybe you can quickly help me with this: I was expecting 3 shifts here. I guess that's the optimisation, but I don't see it.

Collaborator Author

Yep, without the patch it would be 3 shifts. With it we realize that if you shift by a total of more than the bitwidth (or the known sign bits already reach the bitwidth), then the value is all-zeros or all-ones and shifting further doesn't alter the result.
I was trying to come up with a test - this one is a little artificial. This should help with some improvements to smull I was trying to make.
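
To make that concrete, here is a small scalar model of one i16 lane of the test (plain C++, purely illustrative, not part of the patch):

#include <cassert>
#include <cstdint>

// One v8i16 lane of the test above: sshr #8, sshr #9, then ashr #7.
int main() {
  for (int32_t X = INT16_MIN; X <= INT16_MAX; ++X) {
    int16_t A = (int16_t)((int16_t)X >> 8); // 8 + original sign bits
    int16_t B = (int16_t)(A >> 9);          // total 17 >= 16: all-zeros or all-ones
    assert(B == 0 || B == -1);
    assert((int16_t)(B >> 7) == B);         // the final ashr changes nothing
  }
  return 0;
}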

Collaborator

Ah, of course, that's it, cheers.

Collaborator

@sjoerdmeijer left a comment

LGTM

@davemgreen merged commit 83ae171 into llvm:main Oct 29, 2024
8 of 10 checks passed
@davemgreen deleted the gh-a64-vashrsignbits branch October 29, 2024 21:02
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024