[AArch64] Replace expensive move from wzr by two moves via floating point immediate #146538
Changes from 1 commit
44db4f1
7be1083
dd61ed6
131c488
4b5089a
@@ -7356,16 +7356,10 @@ def : Pat<(v4f16 (vector_insert (v4f16 V64:$Rn),
            (i64 0)),
          dsub)>;

def : Pat<(vector_insert (v8f16 V128:$Rn), (f16 fpimm0), (i64 VectorIndexH:$imm)),
          (INSvi16gpr V128:$Rn, VectorIndexH:$imm, WZR)>;
def : Pat<(vector_insert (v4f16 V64:$Rn), (f16 fpimm0), (i64 VectorIndexH:$imm)),
          (EXTRACT_SUBREG (INSvi16gpr (v8f16 (INSERT_SUBREG (v8f16 (IMPLICIT_DEF)), V64:$Rn, dsub)), VectorIndexH:$imm, WZR), dsub)>;
def : Pat<(vector_insert (v4f32 V128:$Rn), (f32 fpimm0), (i64 VectorIndexS:$imm)),
          (INSvi32gpr V128:$Rn, VectorIndexS:$imm, WZR)>;
def : Pat<(vector_insert (v2f32 V64:$Rn), (f32 fpimm0), (i64 VectorIndexS:$imm)),
          (EXTRACT_SUBREG (INSvi32gpr (v4f32 (INSERT_SUBREG (v4f32 (IMPLICIT_DEF)), V64:$Rn, dsub)), VectorIndexS:$imm, WZR), dsub)>;
def : Pat<(vector_insert v2f64:$Rn, (f64 fpimm0), (i64 VectorIndexD:$imm)),
          (INSvi64gpr V128:$Rn, VectorIndexS:$imm, XZR)>;

def : Pat<(v8f16 (vector_insert (v8f16 V128:$Rn),
            (f16 FPR16:$Rm), (i64 VectorIndexH:$imm))),

@@ -8035,6 +8029,18 @@ def MOVIv2d_ns : SIMDModifiedImmVectorNoShift<1, 1, 0, 0b1110, V128,
                   "movi", ".2d",
                   [(set (v2i64 V128:$Rd), (AArch64movi_edit imm0_255:$imm8))]>;

def : Pat<(vector_insert (v8f16 V128:$Rn), (f16 fpimm0), (i64 VectorIndexH:$imm)),
          (INSvi16lane V128:$Rn, VectorIndexH:$imm,
            (v8f16 (MOVIv2d_ns (i32 0))), (i64 0))>;

def : Pat<(vector_insert (v4f32 V128:$Rn), (f32 fpimm0), (i64 VectorIndexS:$imm)),
          (INSvi32lane V128:$Rn, VectorIndexS:$imm,
            (v4f32 (MOVIv2d_ns (i32 0))), (i64 0))>;

def : Pat<(vector_insert (v2f64 V128:$Rn), (f64 fpimm0), (i64 VectorIndexD:$imm)),
          (INSvi64lane V128:$Rn, VectorIndexD:$imm,
            (v2f64 (MOVIv2d_ns (i32 0))), (i64 0))>;

Review comments on the new patterns:

Does it need these new patterns? Or is the codegen without any pattern already OK?

Hm, sorry, not sure I understand. Without those patterns we get the move from wzr (which we want to avoid).

I mean: if we remove the INS wzr patterns above, do we need the new patterns, or is that already handled by the existing "insert" and "zero a register" patterns?

Ohh, I see, it is, yes :)

let Predicates = [HasNEON] in {
def : Pat<(v2i64 immAllZerosV), (MOVIv2d_ns (i32 0))>;
def : Pat<(v4i32 immAllZerosV), (MOVIv2d_ns (i32 0))>;
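
For readers who want to try the affected case, the following is a minimal C sketch of the kind of source that inserts a floating-point zero into one vector lane. It is not taken from the PR. The function name `zero_lane1` is hypothetical, and it is an assumption that Clang lowers `vsetq_lane_f32` with a `0.0f` operand to the `vector_insert ... fpimm0` node matched by the patterns above. The non-AArch64 branch is only a portable stand-in so the sketch builds on other hosts.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical helper: overwrite lane 1 of a 4 x f32 vector with 0.0f.
 * The lane index passed to vsetq_lane_f32 must be a compile-time constant.
 * Assumption: this is the source-level shape that reaches the
 * vector_insert/fpimm0 patterns discussed in the diff. */
static inline void zero_lane1(float v[4]) {
    assert(v != NULL);
    float32x4_t x = vld1q_f32(v);      /* load four floats into a Q register */
    x = vsetq_lane_f32(0.0f, x, 1);    /* insert +0.0 into lane 1 */
    vst1q_f32(v, x);                   /* store the result back */
}
#else
/* Portable stand-in with the same observable behavior. */
static inline void zero_lane1(float v[4]) {
    assert(v != NULL);
    v[1] = 0.0f;
}
#endif
```

Compiling this for an AArch64 target at `-O2` before and after the change and comparing the assembly for the lane insert should show whether the move from `wzr` is still emitted; the exact instruction sequence depends on the compiler revision, so none is quoted here.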