[LV] Vectorize selecting last IV of min/max element. #141431
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -214,6 +214,52 @@ static bool checkOrderedReduction(RecurKind Kind, Instruction *ExactFPMathInst, | |||||||
| return true; | ||||||||
| } | ||||||||
|
|
||||||||
| /// Returns true if \p Phi is a min/max reduction matching \p Kind where \p Phi | ||||||||
| /// is used in the loop outside the reduction chain. This is common for loops | ||||||||
| /// selecting the index of a minimum/maximum value (argmin/argmax). | ||||||||
| static bool isMinMaxReductionWithLoopUsersOutsideReductionChain( | ||||||||
| PHINode *Phi, RecurKind Kind, Loop *TheLoop, RecurrenceDescriptor &RedDes) { | ||||||||
| BasicBlock *Latch = TheLoop->getLoopLatch(); | ||||||||
| if (!Latch) | ||||||||
| return false; | ||||||||
|
Collaborator
Suggested change
Contributor
Author
added, thanks
||||||||
|
|
||||||||
| assert(Phi->getNumIncomingValues() == 2 && "phi must have 2 incoming values"); | ||||||||
| Value *Inc = Phi->getIncomingValueForBlock(Latch); | ||||||||
| if (Phi->hasOneUse() || !Inc->hasOneUse() || | ||||||||
| !RecurrenceDescriptor::isIntMinMaxRecurrenceKind(Kind)) | ||||||||
| return false; | ||||||||
|
|
||||||||
| Value *A, *B; | ||||||||
| bool IsMinMax = [&]() { | ||||||||
| switch (Kind) { | ||||||||
| case RecurKind::UMax: | ||||||||
| return match(Inc, m_UMax(m_Value(A), m_Value(B))); | ||||||||
| case RecurKind::UMin: | ||||||||
| return match(Inc, m_UMin(m_Value(A), m_Value(B))); | ||||||||
| case RecurKind::SMax: | ||||||||
| return match(Inc, m_SMax(m_Value(A), m_Value(B))); | ||||||||
| case RecurKind::SMin: | ||||||||
| return match(Inc, m_SMin(m_Value(A), m_Value(B))); | ||||||||
| default: | ||||||||
| llvm_unreachable("all min/max kinds must be handled"); | ||||||||
|
Collaborator
Suggested change
Contributor
Author
This checks for completeness: all integer min/max kinds should be handled, and the unreachable here, together with the check above, should make it slightly easier to catch missing cases.
||||||||
| } | ||||||||
| }(); | ||||||||
| if (!IsMinMax) | ||||||||
| return false; | ||||||||
|
|
||||||||
| if (A == B || (A != Phi && B != Phi)) | ||||||||
|
Collaborator
Nit: can be written as
Contributor
Author
I left it as-is for now, thanks.
||||||||
| return false; | ||||||||
|
Collaborator
This checks that one of the multiple users of Phi is Inc; what about checking that another user is in the loop (rather than live-out)?
Contributor
Author
The other users are currently checked in VPlan.
Collaborator
Ok, but the name of the function and its documentation specifically claim it has a user in the loop. Better rename Note that every reduction surely has users outside its chain, but they typically use the final post-updated value rather than the intermediate phi value. So
Contributor
Author
Ah yes, should be updated, thanks.
||||||||
|
|
||||||||
| SmallPtrSet<Instruction *, 4> CastInsts; | ||||||||
| Value *RdxStart = Phi->getIncomingValueForBlock(TheLoop->getLoopPreheader()); | ||||||||
| RedDes = | ||||||||
| RecurrenceDescriptor(RdxStart, /*Exit=*/nullptr, /*Store=*/nullptr, Kind, | ||||||||
| FastMathFlags(), /*ExactFP=*/nullptr, Phi->getType(), | ||||||||
| /*Signed=*/false, /*Ordered=*/false, CastInsts, | ||||||||
| /*MinWidthCastToRecurTy=*/-1U, /*PhiMultiUse=*/true); | ||||||||
| return true; | ||||||||
| } | ||||||||
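The source pattern this helper targets can be illustrated with a scalar loop. The sketch below is not taken from the PR: `lastIndexOfMax` and its variable names are hypothetical. Here `maxVal` plays the role of the reduction phi. It feeds the max update (the reduction chain) and also the compare that selects the index (a use in the loop outside the chain).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical scalar source loop: returns the last index holding the maximum.
std::size_t lastIndexOfMax(const std::vector<int32_t> &v) {
  assert(!v.empty() && "sketch assumes a non-empty input");
  int32_t maxVal = std::numeric_limits<int32_t>::min(); // reduction phi
  std::size_t idx = 0;                                  // selected index
  for (std::size_t i = 0; i < v.size(); ++i) {
    // Use of the phi outside the reduction chain: a non-strict compare
    // against the running maximum decides whether the index is updated.
    if (v[i] >= maxVal)
      idx = i;
    // The reduction chain itself: a single max update feeding the phi.
    maxVal = std::max(maxVal, v[i]);
  }
  return idx;
}
```

In this loop the max update has exactly one consumer (the next iteration's `maxVal`), while `maxVal` itself has two, which loosely mirrors the `Phi->hasOneUse()` and `Inc->hasOneUse()` checks in the function above.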
|
|
||||||||
| bool RecurrenceDescriptor::AddReductionVar( | ||||||||
| PHINode *Phi, RecurKind Kind, Loop *TheLoop, FastMathFlags FuncFMF, | ||||||||
| RecurrenceDescriptor &RedDes, DemandedBits *DB, AssumptionCache *AC, | ||||||||
|
|
@@ -225,6 +271,11 @@ bool RecurrenceDescriptor::AddReductionVar( | |||||||
| if (Phi->getParent() != TheLoop->getHeader()) | ||||||||
| return false; | ||||||||
|
|
||||||||
| // Check for min/max reduction variables that feed other users in the loop. | ||||||||
| if (isMinMaxReductionWithLoopUsersOutsideReductionChain(Phi, Kind, TheLoop, | ||||||||
|
||||||||
| if (isMinMaxReductionWithLoopUsersOutsideReductionChain(Phi, Kind, TheLoop, | |
| // Check for min/max reduction variables that feed other users in the loop. | |
| if (isMinMaxReductionWithLoopUsersOutsideReductionChain(Phi, Kind, TheLoop, |
added thanks
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6589,6 +6589,11 @@ void LoopVectorizationCostModel::collectInLoopReductions() { | |
| PHINode *Phi = Reduction.first; | ||
| const RecurrenceDescriptor &RdxDesc = Reduction.second; | ||
|
|
||
| // Multi-use reductions (e.g., used in FindLastIV patterns) are handled | ||
| // separately and should not be considered for in-loop reductions. | ||
| if (RdxDesc.hasLoopUsesOutsideReductionChain()) | ||
| continue; | ||
|
|
||
| // We don't collect reductions that are type promoted (yet). | ||
| if (RdxDesc.getRecurrenceType() != Phi->getType()) | ||
| continue; | ||
|
|
@@ -7993,9 +7998,10 @@ void VPRecipeBuilder::collectScaledReductions(VFRange &Range) { | |
| MapVector<Instruction *, | ||
| SmallVector<std::pair<PartialReductionChain, unsigned>>> | ||
| ChainsByPhi; | ||
| for (const auto &[Phi, RdxDesc] : Legal->getReductionVars()) | ||
| getScaledReductions(Phi, RdxDesc.getLoopExitInstr(), Range, | ||
| ChainsByPhi[Phi]); | ||
| for (const auto &[Phi, RdxDesc] : Legal->getReductionVars()) { | ||
| if (Instruction *RdxExitInstr = RdxDesc.getLoopExitInstr()) | ||
| getScaledReductions(Phi, RdxExitInstr, Range, ChainsByPhi[Phi]); | ||
|
Comment on lines +8007 to +8008
Collaborator
nit: may be good to wrap loop body in brackets.
Contributor
Author
done, thanks
||
| } | ||
|
|
||
| // A partial reduction is invalid if any of its extends are used by | ||
| // something that isn't another partial reduction. This is because the | ||
|
|
@@ -8212,7 +8218,8 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R, | |
| getScalingForReduction(RdxDesc.getLoopExitInstr()).value_or(1); | ||
| PhiRecipe = new VPReductionPHIRecipe( | ||
| Phi, RdxDesc.getRecurrenceKind(), *StartV, CM.isInLoopReduction(Phi), | ||
| CM.useOrderedReductions(RdxDesc), ScaleFactor); | ||
| CM.useOrderedReductions(RdxDesc), ScaleFactor, | ||
| RdxDesc.hasLoopUsesOutsideReductionChain()); | ||
| } else { | ||
| // TODO: Currently fixed-order recurrences are modeled as chains of | ||
| // first-order recurrences. If there are no users of the intermediate | ||
|
|
@@ -8542,6 +8549,11 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes( | |
| // Adjust the recipes for any inloop reductions. | ||
| adjustRecipesForReductions(Plan, RecipeBuilder, Range.Start); | ||
|
|
||
| // Apply mandatory transformation to handle reductions with multiple in-loop | ||
| // uses if possible, bail out otherwise. | ||
| if (!VPlanTransforms::runPass(VPlanTransforms::handleMultiUseReductions, | ||
| *Plan)) | ||
| return nullptr; | ||
| // Apply mandatory transformation to handle FP maxnum/minnum reduction with | ||
| // NaNs if possible, bail out otherwise. | ||
| if (!VPlanTransforms::runPass(VPlanTransforms::handleMaxMinNumReductions, | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -818,15 +818,18 @@ void VPlanTransforms::addMinimumVectorEpilogueIterationCheck( | |||||||
| Branch->setMetadata(LLVMContext::MD_prof, BranchWeights); | ||||||||
| } | ||||||||
|
|
||||||||
| /// If \p RedPhiR is used by a ComputeReductionResult recipe, return it. | ||||||||
| /// Otherwise return nullptr. | ||||||||
| static VPInstruction * | ||||||||
| findComputeReductionResult(VPReductionPHIRecipe *RedPhiR) { | ||||||||
| auto It = find_if(RedPhiR->users(), [](VPUser *U) { | ||||||||
| auto *VPI = dyn_cast<VPInstruction>(U); | ||||||||
| return VPI && VPI->getOpcode() == VPInstruction::ComputeReductionResult; | ||||||||
| }); | ||||||||
| return It == RedPhiR->user_end() ? nullptr : cast<VPInstruction>(*It); | ||||||||
| /// If \p V is used by a recipe matching pattern \p P, return it. Otherwise | ||||||||
| /// return nullptr; | ||||||||
| template <typename MatchT> | ||||||||
| static VPRecipeBase *findUserOf(VPValue *V, const MatchT &P) { | ||||||||
| auto It = find_if(V->users(), match_fn(P)); | ||||||||
| return It == V->user_end() ? nullptr : cast<VPRecipeBase>(*It); | ||||||||
| } | ||||||||
|
|
||||||||
| /// If \p V is used by a VPInstruction with \p Opcode, return it. Otherwise | ||||||||
| /// return nullptr. | ||||||||
| template <unsigned Opcode> static VPInstruction *findUserOf(VPValue *V) { | ||||||||
| return cast_or_null<VPInstruction>(findUserOf(V, m_VPInstruction<Opcode>())); | ||||||||
| } | ||||||||
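The two `findUserOf` overloads follow a common shape: a generic search over the users of a value, plus a thin opcode-specific wrapper around it. The standalone sketch below reproduces that shape with standard-library types only. `User`, its `opcode` field, and the container are hypothetical stand-ins and not VPlan classes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a user of a value; real VPlan users are recipes.
struct User {
  unsigned opcode;
};

// Generic form: return the first user satisfying pred, or nullptr if none.
template <typename PredT>
const User *findUserOf(const std::vector<User> &users, PredT pred) {
  auto it = std::find_if(users.begin(), users.end(), pred);
  return it == users.end() ? nullptr : &*it;
}

// Opcode-specific wrapper built on top of the generic form.
template <unsigned Opcode>
const User *findUserOf(const std::vector<User> &users) {
  return findUserOf(users, [](const User &u) { return u.opcode == Opcode; });
}
```

Calling `findUserOf<7>(users)` selects the wrapper because `7` cannot bind to the type parameter of the generic overload, so the two templates can share one name.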
|
|
||||||||
| bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) { | ||||||||
|
|
@@ -933,7 +936,8 @@ bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) { | |||||||
|
|
||||||||
| // If we exit early due to NaNs, compute the final reduction result based on | ||||||||
| // the reduction phi at the beginning of the last vector iteration. | ||||||||
| auto *RdxResult = findComputeReductionResult(RedPhiR); | ||||||||
| auto *RdxResult = | ||||||||
| findUserOf<VPInstruction::ComputeReductionResult>(RedPhiR); | ||||||||
|
|
||||||||
| auto *NewSel = MiddleBuilder.createSelect(AnyNaNLane, RedPhiR, | ||||||||
| RdxResult->getOperand(1)); | ||||||||
|
|
@@ -992,3 +996,98 @@ bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) { | |||||||
| MiddleTerm->setOperand(0, NewCond); | ||||||||
| return true; | ||||||||
| } | ||||||||
|
|
||||||||
| bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan) { | ||||||||
| for (auto &PhiR : make_early_inc_range( | ||||||||
| Plan.getVectorLoopRegion()->getEntryBasicBlock()->phis())) { | ||||||||
| auto *MinMaxPhiR = dyn_cast<VPReductionPHIRecipe>(&PhiR); | ||||||||
| // TODO: check for multi-uses in VPlan directly. | ||||||||
| if (!MinMaxPhiR || !MinMaxPhiR->hasLoopUsesOutsideReductionChain()) | ||||||||
| continue; | ||||||||
|
|
||||||||
| RecurKind RdxKind = MinMaxPhiR->getRecurrenceKind(); | ||||||||
| assert( | ||||||||
| RecurrenceDescriptor::isIntMinMaxRecurrenceKind(RdxKind) && | ||||||||
| "only min/max recurrences support users outside the reduction chain"); | ||||||||
|
|
||||||||
| // One user of MinMaxPhiR is MinMaxOp, the other user must be a compare | ||||||||
| // that's part of a FindLastIV chain. | ||||||||
| auto *MinMaxOp = | ||||||||
| dyn_cast<VPRecipeWithIRFlags>(MinMaxPhiR->getBackedgeValue()); | ||||||||
| if (!MinMaxOp || MinMaxOp->getNumUsers() != 2) | ||||||||
|
||||||||
| return false; | ||||||||
|
Collaborator
Suggested change
Contributor
Author
added, thanks
||||||||
|
|
||||||||
| assert((isa<VPWidenIntrinsicRecipe>(MinMaxOp) || | ||||||||
| (isa<VPReplicateRecipe>(MinMaxOp) && | ||||||||
| isa<IntrinsicInst>( | ||||||||
| cast<VPReplicateRecipe>(MinMaxOp)->getUnderlyingValue()))) && | ||||||||
| "MinMaxOp must be a wide or scalar intrinsic"); | ||||||||
| VPValue *MinMaxOpA = MinMaxOp->getOperand(0); | ||||||||
| VPValue *MinMaxOpB = MinMaxOp->getOperand(1); | ||||||||
| if (MinMaxOpA != MinMaxPhiR) | ||||||||
| std::swap(MinMaxOpA, MinMaxOpB); | ||||||||
| if (MinMaxOpA != MinMaxPhiR) | ||||||||
| return false; | ||||||||
|
||||||||
|
|
||||||||
| VPValue *CmpOpA; | ||||||||
| VPValue *CmpOpB; | ||||||||
|
||||||||
| CmpPredicate Pred; | ||||||||
| auto *Cmp = dyn_cast_or_null<VPRecipeWithIRFlags>(findUserOf( | ||||||||
| MinMaxPhiR, m_Cmp(Pred, m_VPValue(CmpOpA), m_VPValue(CmpOpB)))); | ||||||||
| if (!Cmp || Cmp->getNumUsers() != 1 || | ||||||||
| (CmpOpA != MinMaxOpB && CmpOpB != MinMaxOpB)) | ||||||||
| return false; | ||||||||
|
|
||||||||
| // TODO: Strict predicates need to find the first IV value for which the | ||||||||
| // predicate holds, not the last. | ||||||||
| if (Pred == CmpInst::ICMP_EQ || Pred == CmpInst::ICMP_NE || | ||||||||
| ICmpInst::isLT(Pred) || ICmpInst::isGT(Pred)) | ||||||||
| return false; | ||||||||
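The TODO about strict predicates can be seen in scalar code. In the sketch below (hypothetical function names, not from the PR), a strict compare against the running maximum keeps the first index of the maximum, while a non-strict compare keeps the last. Only the latter corresponds to a "last IV" selection.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Strict compare (>): the index is only updated on a new, larger value,
// so ties keep the FIRST index of the maximum.
std::size_t indexWithStrictCompare(const std::vector<int32_t> &v) {
  assert(!v.empty() && "sketch assumes a non-empty input");
  int32_t maxVal = std::numeric_limits<int32_t>::min();
  std::size_t idx = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (v[i] > maxVal)
      idx = i;
    maxVal = std::max(maxVal, v[i]);
  }
  return idx;
}

// Non-strict compare (>=): ties also update the index,
// so the result is the LAST index of the maximum.
std::size_t indexWithNonStrictCompare(const std::vector<int32_t> &v) {
  assert(!v.empty() && "sketch assumes a non-empty input");
  int32_t maxVal = std::numeric_limits<int32_t>::min();
  std::size_t idx = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (v[i] >= maxVal)
      idx = i;
    maxVal = std::max(maxVal, v[i]);
  }
  return idx;
}
```

For the input `{3, 9, 1, 9, 2}` the strict variant selects index 1 and the non-strict variant selects index 3.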
|
|
||||||||
|
Collaborator
Worth asserting that MinMaxOp aligns with Pred, i.e., both compute max or both compute min?
Contributor
Author
Yep, this is now checked below, thanks.
||||||||
| // Normalize the predicate so MinMaxPhiR is on the right side. | ||||||||
| if (CmpOpA == MinMaxPhiR) | ||||||||
| Pred = CmpInst::getSwappedPredicate(Pred); | ||||||||
|
||||||||
|
|
||||||||
| // Cmp must be used by the select of a FindLastIV chain. | ||||||||
| VPValue *Sel = dyn_cast<VPSingleDefRecipe>(Cmp->getSingleUser()); | ||||||||
| VPValue *IVOp, *FindIV; | ||||||||
| if (!Sel || | ||||||||
| !match(Sel, | ||||||||
| m_Select(m_Specific(Cmp), m_VPValue(IVOp), m_VPValue(FindIV))) || | ||||||||
| Sel->getNumUsers() != 2) | ||||||||
| return false; | ||||||||
|
Collaborator
Suggested change
Contributor
Author
added, thanks
||||||||
|
|
||||||||
| auto *FindIVPhiR = dyn_cast<VPReductionPHIRecipe>(FindIV); | ||||||||
| if (!FindIVPhiR || !RecurrenceDescriptor::isFindLastIVRecurrenceKind( | ||||||||
|
Collaborator
If FindIVPhiR has a FindLastIVRecurrenceKind, does that imply IVOp must be a VPWidenIntOrFpInductionRecipe?
Contributor
Author
Yes, moved the check from above to an assert below, thanks.
||||||||
| FindIVPhiR->getRecurrenceKind())) | ||||||||
| return false; | ||||||||
|
|
||||||||
| assert(isa<VPWidenIntOrFpInductionRecipe>(IVOp) && | ||||||||
| "IVOp must be a wide induction"); | ||||||||
| assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() && | ||||||||
| "cannot handle inloop/ordered reductions yet"); | ||||||||
|
|
||||||||
| // The reduction using MinMaxPhiR needs adjusting to compute the correct | ||||||||
| // result: | ||||||||
| // 1. We need to find the last IV for which the condition based on the | ||||||||
| // min/max recurrence is true, | ||||||||
| // 2. Compare the partial min/max reduction result to its final value and, | ||||||||
| // 3. Select the lanes of the partial FindLastIV reductions which | ||||||||
| // correspond to the lanes matching the min/max reduction result. | ||||||||
|
||||||||
| VPInstruction *FindIVResult = | ||||||||
| findUserOf<VPInstruction::ComputeFindIVResult>(FindIVPhiR); | ||||||||
|
Collaborator
Independent: the ComputeFindIVResult opcode is missing documentation.
||||||||
| VPInstruction *MinMaxResult = | ||||||||
| findUserOf<VPInstruction::ComputeReductionResult>(MinMaxPhiR); | ||||||||
|
||||||||
| MinMaxResult->moveBefore(*FindIVResult->getParent(), | ||||||||
|
Collaborator
Worth asserting both have the same parent, expected to be the middle block?
Contributor
Author
added, thanks
||||||||
| FindIVResult->getIterator()); | ||||||||
|
|
||||||||
| VPBuilder B(FindIVResult); | ||||||||
| auto *FinalMinMaxCmp = B.createICmp( | ||||||||
| CmpInst::ICMP_EQ, MinMaxResult->getOperand(1), MinMaxResult); | ||||||||
|
||||||||
| auto *FinalIVSelect = | ||||||||
| B.createSelect(FinalMinMaxCmp, FindIVResult->getOperand(3), | ||||||||
| FindIVResult->getOperand(2)); | ||||||||
|
||||||||
| FindIVResult->setOperand(3, FinalIVSelect); | ||||||||
| } | ||||||||
| return true; | ||||||||
| } | ||||||||
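The three-step adjustment described in the comment inside `handleMultiUseReductions` can be sketched in plain C++ by simulating vector lanes with arrays. This is a minimal illustration under stated assumptions, not the VPlan implementation: `VF`, `Sentinel`, `ArgMaxResult`, and `lastArgMaxLanes` are hypothetical names, the input length is assumed to be a multiple of `VF`, and the value substituted for non-matching lanes is assumed to act as a sentinel smaller than any valid index.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor
// Hypothetical sentinel: smaller than any valid index.
constexpr int64_t Sentinel = std::numeric_limits<int64_t>::min();

struct ArgMaxResult {
  int32_t maxVal;
  int64_t index; // Sentinel if no lane matched (e.g. empty input)
};

ArgMaxResult lastArgMaxLanes(const std::vector<int32_t> &v) {
  assert(v.size() % VF == 0 && "sketch handles whole vector iterations only");
  std::array<int32_t, VF> maxLanes;
  std::array<int64_t, VF> ivLanes;
  maxLanes.fill(std::numeric_limits<int32_t>::min());
  ivLanes.fill(Sentinel);

  // "Vector loop": each lane keeps its own partial max and partial last index.
  for (std::size_t base = 0; base < v.size(); base += VF) {
    for (std::size_t lane = 0; lane < VF; ++lane) {
      int32_t x = v[base + lane];
      if (x >= maxLanes[lane])
        ivLanes[lane] = static_cast<int64_t>(base + lane);
      maxLanes[lane] = std::max(maxLanes[lane], x);
    }
  }

  // "Middle block": reduce the partial maxima to the final max first.
  int32_t finalMax = *std::max_element(maxLanes.begin(), maxLanes.end());

  // Keep only index lanes whose partial max equals the final max, replace
  // the others with the sentinel, then reduce the indices with max.
  int64_t finalIV = Sentinel;
  for (std::size_t lane = 0; lane < VF; ++lane) {
    int64_t selected = maxLanes[lane] == finalMax ? ivLanes[lane] : Sentinel;
    finalIV = std::max(finalIV, selected);
  }
  return {finalMax, finalIV};
}
```

For `{9, 0, 0, 0, 1, 2, 3, 4}` the per-lane indices are 0, 5, 6, 7. Reducing them directly would give 7, whereas masking by the final maximum keeps only lane 0 and yields the correct index 0.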
Maybe need to add an assertion to check that the Phi has 2 incoming values: one from the latch (already checked) and another from the preheader.
Done, thanks.