[AArch64][CostModel] Add constraints on which partial reductions are #163728
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -5661,6 +5661,9 @@ InstructionCost AArch64TTIImpl::getPartialReductionCost(` |
| | | `AccumType->getScalarSizeInBits() / InputTypeA->getScalarSizeInBits();` |
| | | `if (VF.getKnownMinValue() <= Ratio)` |
| | | `  return Invalid;` |
| | | `+ // i32 -> i64 or i16 -> i32 is not natively supported on Neon and SVE.` |
| | | `+ if (Ratio < 4)` |
| | | `+   return Invalid;` |
| | | `VectorType *InputVectorType = VectorType::get(InputTypeA, VF);` |
| | | `VectorType *AccumVectorType =` |
This isn't true, because SVE2.1 has support for `udot z0.s, z1.h, z2.h`.
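For reference, here is a minimal scalar sketch of what a 2-way unsigned dot product of that shape computes: each 32-bit accumulator lane adds the products of one pair of adjacent 16-bit inputs. The function name `udot2WayModel` and its vector-based interface are made up for illustration and are not LLVM code. It assumes both inputs hold exactly twice as many elements as the accumulator.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative scalar model only (hypothetical helper, not an LLVM API).
// Assumes A.size() == B.size() == 2 * Acc.size().
std::vector<uint32_t> udot2WayModel(std::vector<uint32_t> Acc,
                                    const std::vector<uint16_t> &A,
                                    const std::vector<uint16_t> &B) {
  for (std::size_t Lane = 0; Lane < Acc.size(); ++Lane) {
    // Each 32-bit lane consumes one pair of adjacent 16-bit elements.
    for (std::size_t I = 0; I < 2; ++I) {
      std::size_t Idx = 2 * Lane + I;
      // Widen to 32 bits before multiplying; the add wraps modulo 2^32.
      Acc[Lane] += static_cast<uint32_t>(A[Idx]) * static_cast<uint32_t>(B[Idx]);
    }
  }
  return Acc;
}
```

This is the i16 -> i32 case, i.e. a ratio of 2 between accumulator and input widths, which the added `Ratio < 4` check would reject.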
Also, we can in theory lower code for a ratio of 2 even for NEON or SVE. It just might not be optimal codegen. Perhaps the cost modelling below needs modifying to handle the case you are worried about?
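One possible shape for that suggestion, as a standalone sketch rather than a patch: keep the ratio-2 case valid and give it a higher cost unless the target has a native instruction for it. Everything here is hypothetical, including `SketchQuery`, `SketchCost`, `sketchPartialReductionCost`, the `HasTwoWayDot` flag, and the placeholder cost values 1 and 8. The remaining checks in the real `getPartialReductionCost` are omitted.

```cpp
// Hypothetical stand-ins; none of these names or values come from LLVM.
struct SketchQuery {
  unsigned AccumBits;  // scalar width of the accumulator type
  unsigned InputBits;  // scalar width of the input type (assumed non-zero)
  unsigned MinVF;      // known minimum vectorization factor
  bool HasTwoWayDot;   // assumption: target has a native 16 -> 32 bit dot product
};

struct SketchCost {
  bool Valid;      // false plays the role of "return Invalid"
  unsigned Value;  // placeholder cost, meaningful only when Valid is true
};

SketchCost sketchPartialReductionCost(const SketchQuery &Q) {
  unsigned Ratio = Q.AccumBits / Q.InputBits;
  // Same guard as the existing code in the hunk: too few lanes to reduce.
  if (Q.MinVF <= Ratio)
    return SketchCost{false, 0};

  const unsigned CheapCost = 1;      // placeholder: native dot product
  const unsigned ExpensiveCost = 8;  // placeholder: legal but suboptimal codegen

  if (Ratio >= 4)
    return SketchCost{true, CheapCost};
  // Ratio below 4: still lowerable, so cost it instead of rejecting it.
  if (Q.AccumBits == 32 && Q.InputBits == 16 && Q.HasTwoWayDot)
    return SketchCost{true, CheapCost};
  return SketchCost{true, ExpensiveCost};
}
```

With this shape the vectorizer can still choose a ratio-2 partial reduction when nothing cheaper is available, instead of the option being ruled out entirely.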
Ah, I missed this. I will see whether I can cost-model the codegen here.
Done.