[AArch64] Guard for 128bit vectors in mull combine. #169839
The test case generates an extract_subvector(index) feeding into a mul. Make sure we don't try to treat the scalable vector extract as a 128-bit vector in the mull combine. Fixes llvm#168912
@llvm/pr-subscribers-backend-aarch64
Author: David Green (davemgreen)
Changes
The test case generates an extract_subvector(index) feeding into a mul. Make sure we don't try to treat the scalable vector extract as a 128-bit vector in the mull combine. Fixes #168912
Full diff: https://github.com/llvm/llvm-project/pull/169839.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index dd70d729ffc91..548cca33e9c40 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -5795,8 +5795,10 @@ SDValue AArch64TargetLowering::LowerMUL(SDValue Op, SelectionDAG &DAG) const {
if (VT.is64BitVector()) {
if (N0.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
isNullConstant(N0.getOperand(1)) &&
+ N0.getOperand(0).getValueType().is128BitVector() &&
N1.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
- isNullConstant(N1.getOperand(1))) {
+ isNullConstant(N1.getOperand(1)) &&
+ N1.getOperand(0).getValueType().is128BitVector()) {
N0 = N0.getOperand(0);
N1 = N1.getOperand(0);
VT = N0.getValueType();
diff --git a/llvm/test/CodeGen/AArch64/neon-extadd-extract.ll b/llvm/test/CodeGen/AArch64/neon-extadd-extract.ll
index 64cb3603f53a1..5753798e87512 100644
--- a/llvm/test/CodeGen/AArch64/neon-extadd-extract.ll
+++ b/llvm/test/CodeGen/AArch64/neon-extadd-extract.ll
@@ -771,3 +771,31 @@ entry:
%m = mul <1 x i64> %s0, %t1
ret <1 x i64> %m
}
+
+define <2 x i8> @extract_scalable_vec() vscale_range(1,16) "target-features"="+sve" {
+; CHECK-SD-LABEL: extract_scalable_vec:
+; CHECK-SD: // %bb.0: // %entry
+; CHECK-SD-NEXT: mov x8, xzr
+; CHECK-SD-NEXT: index z1.s, #2, #3
+; CHECK-SD-NEXT: ldr h0, [x8]
+; CHECK-SD-NEXT: ushll v0.8h, v0.8b, #0
+; CHECK-SD-NEXT: ushll v0.4s, v0.4h, #0
+; CHECK-SD-NEXT: mul v0.2s, v0.2s, v1.2s
+; CHECK-SD-NEXT: ret
+;
+; CHECK-GI-LABEL: extract_scalable_vec:
+; CHECK-GI: // %bb.0: // %entry
+; CHECK-GI-NEXT: mov x8, xzr
+; CHECK-GI-NEXT: mov x9, #1 // =0x1
+; CHECK-GI-NEXT: ld1 { v0.b }[0], [x8]
+; CHECK-GI-NEXT: ldr b1, [x9]
+; CHECK-GI-NEXT: adrp x8, .LCPI36_0
+; CHECK-GI-NEXT: mov v0.s[1], v1.s[0]
+; CHECK-GI-NEXT: ldr d1, [x8, :lo12:.LCPI36_0]
+; CHECK-GI-NEXT: mul v0.2s, v0.2s, v1.2s
+; CHECK-GI-NEXT: ret
+entry:
+ %0 = load <2 x i8>, ptr null, align 2
+ %mul = mul <2 x i8> %0, <i8 2, i8 5>
+ ret <2 x i8> %mul
+}
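The guard in the diff above can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `ToyVT`, `ToyExtract`, and `canWidenMulOperands` are hypothetical names invented for illustration, and the sketch assumes that `is128BitVector()` is true only for fixed-length vectors of exactly 128 bits, so that a scalable type with a 128-bit minimum size is rejected. The scalable type used here is an illustrative choice, not taken from the PR.

```cpp
#include <cstdint>

// Hypothetical stand-in for a vector value type; this is not LLVM's EVT.
struct ToyVT {
  unsigned MinElts; // element count (minimum count when scalable)
  unsigned EltBits; // bits per element
  bool Scalable;    // true for a <vscale x N x iM>-style type

  // Assumed semantics: fixed-length and exactly 128 bits wide.
  constexpr bool is128BitVector() const {
    return !Scalable && MinElts * EltBits == 128;
  }
};

// Hypothetical model of one mul operand: extract_subvector(Src, Index).
struct ToyExtract {
  ToyVT SrcVT;
  uint64_t Index;
};

// Both operands must be index-0 extracts from fixed 128-bit sources before
// the mul may be rewritten to operate on the wider source vectors.
constexpr bool canWidenMulOperands(const ToyExtract &N0, const ToyExtract &N1) {
  return N0.Index == 0 && N0.SrcVT.is128BitVector() &&
         N1.Index == 0 && N1.SrcVT.is128BitVector();
}

// Illustrative types: a fixed 4 x i32 vector and a scalable one whose
// minimum size is also 128 bits.
constexpr ToyVT V4I32{4, 32, false};
constexpr ToyVT NxV4I32{4, 32, true};

constexpr ToyExtract FixedLo{V4I32, 0};
constexpr ToyExtract FixedHi{V4I32, 2};
constexpr ToyExtract ScalableLo{NxV4I32, 0};

static_assert(canWidenMulOperands(FixedLo, FixedLo),
              "two low-half extracts of fixed 128-bit vectors qualify");
static_assert(!canWidenMulOperands(FixedLo, ScalableLo),
              "a scalable source must not be treated as a 128-bit vector");
static_assert(!canWidenMulOperands(FixedHi, FixedLo),
              "a non-zero extract index does not qualify");
```

Without the two `is128BitVector()` checks, the model would accept `ScalableLo`, which corresponds to the case the new test exercises.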
cofibrant
left a comment
LGTM
N1.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
- isNullConstant(N1.getOperand(1))) {
+ isNullConstant(N1.getOperand(1)) &&
+ N1.getOperand(0).getValueType().is128BitVector()) {
Ahah--makes sense!
Thanks.