[VPlan] Implement VPWidenLoad/StoreEVLRecipe::computeCost(). #109644
Changes from 2 commits
@@ -2464,6 +2464,43 @@ void VPWidenStoreEVLRecipe::execute(VPTransformState &State) {
  State.addMetadata(NewSI, SI);
}

InstructionCost VPWidenStoreEVLRecipe::computeCost(ElementCount VF,
                                                   VPCostContext &Ctx) const {
  Type *Ty = ToVectorTy(getLoadStoreType(&Ingredient), VF);
  const Align Alignment =
      getLoadStoreAlignment(const_cast<Instruction *>(&Ingredient));
  unsigned AS =
      getLoadStoreAddressSpace(const_cast<Instruction *>(&Ingredient));
  TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;

  if (!Consecutive) {
    // TODO: Using the original IR may not be accurate.
    // Currently, ARM will use the underlying IR to calculate gather/scatter
    // instruction cost.
    const Value *Ptr = getLoadStorePointerOperand(&Ingredient);
    assert(!Reverse &&
           "Inconsecutive memory access should not have the order.");
    return Ctx.TTI.getAddressComputationCost(Ty) +
           Ctx.TTI.getGatherScatterOpCost(Ingredient.getOpcode(), Ty, Ptr,
                                          IsMasked, Alignment, CostKind,
                                          &Ingredient);
  }

  InstructionCost Cost = 0;
  // We need to use the getMaskedMemoryOpCost() instead of getMemoryOpCost()
  // here because the EVL recipes using EVL to replace the tail mask. But in the
  // legacy model, it will always calculate the cost of mask.
  // TODO: Using getMemoryOpCost() instead of getMaskedMemoryOpCost when we
  // don't need to care the legacy cost model.
  Cost += Ctx.TTI.getMaskedMemoryOpCost(Ingredient.getOpcode(), Ty, Alignment,
Suggested change:
-  InstructionCost Cost = 0;
-  // We need to use the getMaskedMemoryOpCost() instead of getMemoryOpCost()
-  // here because the EVL recipes using EVL to replace the tail mask. But in the
-  // legacy model, it will always calculate the cost of mask.
-  // TODO: Using getMemoryOpCost() instead of getMaskedMemoryOpCost when we
-  // don't need to care the legacy cost model.
-  Cost += Ctx.TTI.getMaskedMemoryOpCost(Ingredient.getOpcode(), Ty, Alignment,
+  // We need to use the getMaskedMemoryOpCost() instead of getMemoryOpCost()
+  // here because the EVL recipes using EVL to replace the tail mask. But in the
+  // legacy model, it will always calculate the cost of mask.
+  // TODO: Using getMemoryOpCost() instead of getMaskedMemoryOpCost when we
+  // don't need to care the legacy cost model.
+  InstructionCost Cost = Ctx.TTI.getMaskedMemoryOpCost(Ingredient.getOpcode(), Ty, Alignment,
Updated, thanks.
@@ -0,0 +1,25 @@
; RUN: opt < %s --prefer-predicate-over-epilogue=predicate-dont-vectorize --passes=loop-vectorize -mcpu=sifive-p470 -mattr=+v,+f
; RUN: opt < %s --prefer-predicate-over-epilogue=predicate-dont-vectorize --passes=loop-vectorize -mcpu=sifive-p470 -mattr=+v,+f -force-tail-folding-style=data-with-evl

; Generated from issue #109468.
Contributor
Could you add an explanation to the test? IIUC the important bit is that the store doesn't need a mask with EVL?
Contributor
Author
Added, thanks.
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"

define void @lshift_significand(i32 %n, ptr nocapture writeonly %0) local_unnamed_addr #0 {
Suggested change:
- define void @lshift_significand(i32 %n, ptr nocapture writeonly %0) local_unnamed_addr #0 {
+ define void @evl_store_cost(i32 %n, ptr nocapture writeonly %dst) {
Renamed and removed, thanks.
Suggested change:
- for.body9: ; preds = %entry, %for.body9
+ loop:
Removed, thanks.
Suggested change:
-   %indvars.iv = phi i64 [ %spec.select, %entry ], [ %indvars.iv.next, %for.body9 ]
+   %iv = phi i64 [ %spec.select, %entry ], [ %indvars.iv.next, %for.body9 ]
Renamed, thanks.
Suggested change:
-   %arrayidx13 = getelementptr [3 x i64], ptr %0, i64 0, i64 %1
+   %arrayidx13 = getelementptr i64, ptr %0, i64 %1
Removed, thanks.
Suggested change:
- for.end16: ; preds = %for.body9
+ exit:
Removed, thanks.
Use cost from base class if possible?
If so, sink variable assignments closer to use
Sure. Reuse VPWidenMemoryRecipe::computeCost() when the load/store is not consecutive or is masked.
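For readers following the thread, the delegation agreed on above could look roughly like the sketch below. This is not the code that landed: MockTTI, WidenMemoryRecipe, WidenStoreEVLRecipe and every cost number are hypothetical stand-ins so the example compiles outside LLVM, and the base-class behaviour for consecutive accesses is an assumption. Only the shape of the check (fall back to the base class when the access is not consecutive or is masked, otherwise charge the masked memory-op cost) comes from the discussion.

```cpp
#include <cassert>

// Hypothetical stand-in for the target cost hooks. The numbers are made up
// purely so the three paths can be told apart.
struct MockTTI {
  int AddressComputationCost = 1; // stands in for getAddressComputationCost()
  int GatherScatterOpCost = 10;   // stands in for getGatherScatterOpCost()
  int MemoryOpCost = 1;           // stands in for getMemoryOpCost()
  int MaskedMemoryOpCost = 2;     // stands in for getMaskedMemoryOpCost()
};

// Simplified model of the base recipe (assumed behaviour, not LLVM's code).
struct WidenMemoryRecipe {
  bool Consecutive;
  bool IsMasked;

  WidenMemoryRecipe(bool Consecutive, bool IsMasked)
      : Consecutive(Consecutive), IsMasked(IsMasked) {}
  virtual ~WidenMemoryRecipe() = default;

  virtual int computeCost(const MockTTI &TTI) const {
    // Non-consecutive accesses are costed as gather/scatter.
    if (!Consecutive)
      return TTI.AddressComputationCost + TTI.GatherScatterOpCost;
    // Consecutive accesses are costed as a masked or a plain memory op.
    return IsMasked ? TTI.MaskedMemoryOpCost : TTI.MemoryOpCost;
  }
};

// Simplified model of the EVL store recipe.
struct WidenStoreEVLRecipe : WidenMemoryRecipe {
  WidenStoreEVLRecipe(bool Consecutive, bool IsMasked)
      : WidenMemoryRecipe(Consecutive, IsMasked) {}

  int computeCost(const MockTTI &TTI) const override {
    // Reuse the base-class cost when the access is not consecutive or is
    // masked, as proposed in the review.
    if (!Consecutive || IsMasked)
      return WidenMemoryRecipe::computeCost(TTI);
    // Consecutive and unmasked: EVL replaces the tail mask, but the masked
    // cost is still charged so the result agrees with the legacy cost model.
    return TTI.MaskedMemoryOpCost;
  }
};
```

With these made-up numbers, a consecutive unmasked EVL store is charged the masked cost even though it carries no mask, while the same access through the base recipe would be charged the plain memory-op cost. That gap is what the TODO in the patch proposes to close once the legacy cost model no longer needs to be matched.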