[AArch64] Improve operand sinking for mul instructions #116604
Changes from 2 commits
f3a58f2
343168a
265694a
9c82902
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -10,14 +10,18 @@ target triple = "aarch64-unknown-linux-gnu" | |
| define dso_local i32 @dupext_crashtest(i32 %e) local_unnamed_addr { | ||
| ; CHECK-LABEL: dupext_crashtest: | ||
| ; CHECK: // %bb.0: // %for.body.lr.ph | ||
| ; CHECK-NEXT: mov w8, w0 | ||
| ; CHECK-NEXT: dup v0.2s, w8 | ||
| ; CHECK-NEXT: .LBB0_1: // %vector.body | ||
| ; CHECK-NEXT: // =>This Inner Loop Header: Depth=1 | ||
| ; CHECK-NEXT: ldr d1, [x8] | ||
| ; CHECK-NEXT: smull v1.2d, v0.2s, v1.2s | ||
| ; CHECK-NEXT: xtn v1.2s, v1.2d | ||
| ; CHECK-NEXT: str d1, [x8] | ||
| ; CHECK-NEXT: ldr d0, [x8] | ||
| ; CHECK-NEXT: ushll v0.2d, v0.2s, #0 | ||
| ; CHECK-NEXT: fmov x9, d0 | ||
| ; CHECK-NEXT: mov x8, v0.d[1] | ||
| ; CHECK-NEXT: mul w9, w0, w9 | ||
| ; CHECK-NEXT: mul w8, w0, w8 | ||
| ; CHECK-NEXT: fmov d0, x9 | ||
| ; CHECK-NEXT: mov v0.d[1], x8 | ||
| ; CHECK-NEXT: xtn v0.2s, v0.2d | ||
| ; CHECK-NEXT: str d0, [x8] | ||
|
Comment on lines -17 to +24
Contributor (Author): This is a regression, but I have a patch that fixes it by teaching to handle
Collaborator: Sounds good. Is it possible to write a separate test for it too, with the anyext already in place?
Contributor (Author): It seemed to make sense to put this into a separate, follow-up PR; see #118308. I've added the test. |
||
| ; CHECK-NEXT: b .LBB0_1 | ||
| for.body.lr.ph: | ||
| %conv314 = zext i32 %e to i64 | ||
||
Should we add handling for v16i16 and similar larger types too?
Good point. I've refactored it to sink any (non-scalable) vector type with i16 or i32 elements, rather than listing every possible element count, because that seemed to make more sense. Is there a reason not to do it this way?
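For readers following the thread, here is a rough, standalone sketch of the element-width check described above. It does not use LLVM's classes and is not the code from this patch: `VecTy` and `shouldSinkMulOperand` are hypothetical names invented for illustration, and the real change may apply further conditions.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR vector type; this is NOT an LLVM class.
struct VecTy {
  unsigned ElemBits; // width of each integer element, e.g. 16 for i16
  unsigned NumElems; // element count (minimum count if scalable)
  bool Scalable;     // true for types such as <vscale x 4 x i32>
};

// Assumed predicate: accept any fixed-width vector whose elements are
// i16 or i32. The element count is deliberately not inspected, so
// v4i16, v8i16, v16i16, v2i32, v4i32, ... are all handled by one rule.
bool shouldSinkMulOperand(const VecTy &Ty) {
  if (Ty.Scalable)
    return false; // scalable vectors are excluded, as stated in the comment
  return Ty.ElemBits == 16 || Ty.ElemBits == 32;
}

// Small self-check covering the cases raised in the review.
void selfCheck() {
  const VecTy V16I16{16, 16, false};  // the larger type asked about
  const VecTy V2I32{32, 2, false};    // type used by the smull in the test
  const VecTy V2I64{64, 2, false};    // element width not covered
  const VecTy NxV4I32{32, 4, true};   // scalable, so rejected
  assert(shouldSinkMulOperand(V16I16));
  assert(shouldSinkMulOperand(V2I32));
  assert(!shouldSinkMulOperand(V2I64));
  assert(!shouldSinkMulOperand(NxV4I32));
}
```

Keying the check on element width instead of a list of concrete types means wider-than-legal vectors such as v16i16 need no special case.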