Merged
Changes from 16 commits
41 commits
11870a2
[LV] Vectorize select min/max index.
fhahn Jun 8, 2025
2688d03
!fixup address review comments, thanks
fhahn Jul 22, 2025
26ebf5a
!fixup remove stray new line
fhahn Aug 1, 2025
af2ba25
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Sep 11, 2025
2871d6c
!fixup fix build after merge
fhahn Sep 11, 2025
ae18690
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index…
fhahn Sep 11, 2025
4305caf
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index…
fhahn Sep 11, 2025
98fb1f7
!fixup detect multi-use min/max recurrences.
fhahn Sep 11, 2025
dc59607
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Oct 2, 2025
6134ef1
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Oct 2, 2025
ad99496
!fixup add additional uses.
fhahn Oct 2, 2025
5b65693
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Oct 6, 2025
fabcf69
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Oct 12, 2025
844c2c2
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Oct 26, 2025
df7e6b8
!fixup replace MultiUse kinds with boolean flag
fhahn Oct 26, 2025
5e209db
Merge branch 'main' into lv-find-min-max-index
fhahn Nov 3, 2025
127da7d
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 11, 2025
603c47c
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 19, 2025
76e661a
!fixup address comments, thanks
fhahn Nov 19, 2025
d74939e
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 20, 2025
a057dd3
!fixup address comments, thanks
fhahn Nov 20, 2025
c351e55
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 21, 2025
fd90ad9
!fixup address comments, thanks
fhahn Nov 21, 2025
2fd21ec
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 21, 2025
da55075
!fixup address latest comments, thanks
fhahn Nov 21, 2025
5a442b2
!fixup add argmin/argmax tests with fmin/fmax.
fhahn Nov 21, 2025
a78311d
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 22, 2025
87325fd
!fixup address latest comments, thanks
fhahn Nov 22, 2025
918f079
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 24, 2025
6cc3953
!fixup address latest comments, thanks
fhahn Nov 24, 2025
3cedf8a
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 25, 2025
b1ff1a4
!fixup address comments, thanks
fhahn Nov 25, 2025
895baa8
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index
fhahn Nov 26, 2025
a073c9b
!fixup
fhahn Nov 27, 2025
6d8e164
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index…
fhahn Nov 27, 2025
0671371
Step
fhahn Nov 27, 2025
450d6a0
[VPlan] Use m_Intrinsic to match assumes/noalias_scope_decl (NFC).
fhahn Nov 27, 2025
cb25d0c
[VPlan] Add matcher
fhahn Nov 27, 2025
7d34974
Fix crash
fhahn Nov 27, 2025
7710b71
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index…
fhahn Nov 27, 2025
7e2d9c3
Merge remote-tracking branch 'origin/main' into lv-find-min-max-index…
fhahn Nov 28, 2025
14 changes: 12 additions & 2 deletions llvm/include/llvm/Analysis/IVDescriptors.h
@@ -95,12 +95,15 @@ class RecurrenceDescriptor {
RecurKind K, FastMathFlags FMF, Instruction *ExactFP,
Type *RT, bool Signed, bool Ordered,
SmallPtrSetImpl<Instruction *> &CI,
unsigned MinWidthCastToRecurTy)
unsigned MinWidthCastToRecurTy, bool PhiMultiUse = false)
: IntermediateStore(Store), StartValue(Start), LoopExitInstr(Exit),
Kind(K), FMF(FMF), ExactFPMathInst(ExactFP), RecurrenceType(RT),
IsSigned(Signed), IsOrdered(Ordered),
IsSigned(Signed), IsOrdered(Ordered), IsPhiMultiUse(PhiMultiUse),
MinWidthCastToRecurrenceType(MinWidthCastToRecurTy) {
CastInsts.insert_range(CI);
assert(
(!PhiMultiUse || isMinMaxRecurrenceKind(K)) &&
"Only min/max recurrences are allowed to have multiple uses currently");
}

/// This POD struct holds information about a potential recurrence operation.
@@ -339,6 +342,10 @@ class RecurrenceDescriptor {
/// Expose an ordered FP reduction to the instance users.
bool isOrdered() const { return IsOrdered; }

/// Returns true if the reduction PHI has multiple in-loop users. This is
/// relevant for min/max reductions that are part of a FindLastIV pattern.
bool isPhiMultiUse() const { return IsPhiMultiUse; }

/// Attempts to find a chain of operations from Phi to LoopExitInst that can
/// be treated as a set of reductions instructions for in-loop reductions.
LLVM_ABI SmallVector<Instruction *, 4> getReductionOpChain(PHINode *Phi,
@@ -376,6 +383,9 @@ class RecurrenceDescriptor {
// Currently only a non-reassociative FAdd can be considered in-order,
// if it is also the only FAdd in the PHI's use chain.
bool IsOrdered = false;
// True if the reduction PHI has multiple in-loop users. This is relevant
// for min/max reductions that are part of a FindLastIV pattern.
bool IsPhiMultiUse = false;
Contributor

I think we can just call it isReduction?

Contributor Author

The reason I went with specifically the multi-use naming is that even with multiple uses of the phi, the phi itself is still a reduction, although the multiple users prevent (or make more difficult) transformations. WDYT?

Collaborator

Perhaps IsUsedInLoop or IsPhiReusedInLoop would be more accurate than IsPhiMultiUse. The phi is clearly used once in the loop as part of a recurrence; the flag indicates that it has another use in the loop, as opposed to being live-out.

An argmax/argmin pattern where a min/max reduction feeds a FindLastIV reduction may be indicated with something like IsFeedingAnotherReduction - as opposed to feeding say a non-reducing in-loop store (cf. a cumulative-sum pattern where a sum reduction feeds a non-reducing in-loop store). But here the additional in-loop user(s) has yet to be analyzed if it's reducing or not?

An argmax/argmin pattern conceptually has multiple users of its compare condition rather than of its phi, as in:

  bool updateMax = currentValue > resultMax;
  resultMax = updateMax ? currentValue : resultMax;
  resultArg = updateMax ? currentIV : resultArg;

But if the argmax/argmin pattern uses a max/min intrinsic instead of a compare-select pair, the resultMax phi will have multiple users instead of the updateMax compare:

  bool updateMax = currentValue > resultMax;
  resultMax = max(currentValue, resultMax);
  resultArg = updateMax ? currentIV : resultArg;

Would be good to explain, and support both (follow-up)?
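
The two scalar forms described above can be written out as a small, self-contained C++ sketch. This is only an illustration of the source-level patterns, not code from the patch: the function names `argmaxViaSelects` and `argmaxViaMax` and the explicit start values are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Compare-select form: the compare result `updateMax` feeds two selects,
// one updating the running maximum and one updating the index.
std::pair<int, std::size_t> argmaxViaSelects(const std::vector<int> &values,
                                             int startMax,
                                             std::size_t startArg) {
  int resultMax = startMax;
  std::size_t resultArg = startArg;
  for (std::size_t i = 0; i < values.size(); ++i) {
    int currentValue = values[i];
    bool updateMax = currentValue > resultMax;
    resultMax = updateMax ? currentValue : resultMax;
    resultArg = updateMax ? i : resultArg;
  }
  return {resultMax, resultArg};
}

// Max-call form: the running maximum is read by both the compare and the
// max call, so the value carried around the loop has two in-loop users.
std::pair<int, std::size_t> argmaxViaMax(const std::vector<int> &values,
                                         int startMax, std::size_t startArg) {
  int resultMax = startMax;
  std::size_t resultArg = startArg;
  for (std::size_t i = 0; i < values.size(); ++i) {
    int currentValue = values[i];
    bool updateMax = currentValue > resultMax;
    resultMax = std::max(currentValue, resultMax);
    resultArg = updateMax ? i : resultArg;
  }
  return {resultMax, resultArg};
}
```

Both functions compute the same pair for any input, because the compare always reads the running maximum before it is updated. With the strict `>` compare, ties keep the earlier index.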

Contributor

That sounds good, but I think "IsMultiUse" would be enough. If there is more than one operation in the recurrence chain, i.e., phi -> max -> max, not only the phi will have additional users; the max in the middle will have one too.

Contributor

However, this doesn't seem precise enough. If it's a cmp-select form of min/max, it's normal for phi to have more than one user. It should more accurately refer to multiple users outside the recurrence chain.

Contributor Author

I updated it to the more verbose PhiHasLoopUsesOutsideReductionChain. WDYT?

> An argmax/argmin pattern where a min/max reduction feeds a FindLastIV reduction may be indicated with something like IsFeedingAnotherReduction - as opposed to feeding say a non-reducing in-loop store (cf. a cumulative-sum pattern where a sum reduction feeds a non-reducing in-loop store). But here the additional in-loop user(s) has yet to be analyzed if it's reducing or not?

Exactly, we delay checking the other users here, as it naturally will be handled by the existing analysis, and matching/legalization can easily be done in VPlan.

// Instructions used for type-promoting the recurrence.
SmallPtrSet<Instruction *, 8> CastInsts;
// The minimum width used by the recurrence.
38 changes: 38 additions & 0 deletions llvm/lib/Analysis/IVDescriptors.cpp
@@ -214,6 +214,40 @@ static bool checkOrderedReduction(RecurKind Kind, Instruction *ExactFPMathInst,
return true;
}

static std::optional<RecurrenceDescriptor>
Collaborator

Good to document what it's optionally returning.

Contributor Author

Added documentation, thanks

getMultiUseMinMax(PHINode *Phi, RecurKind Kind, Loop *TheLoop) {
Contributor

Could we not modify the existing function for detecting min/max reductions to achieve this?

Contributor Author

Probably yes, but currently this extension would be a bit messy I think, as the current implementation of AddReductionVar combines checking for all reduction kinds together, and adding another state to check that only applies to a subset seems to make it even more complicated generally.

I plan to unify it more, if/when #163460 lands.

BasicBlock *Latch = TheLoop->getLoopLatch();
Contributor

Maybe we need to add an assertion that the Phi has 2 incoming values: one from the latch (already checked) and one from the preheader.

Contributor Author

Done thanks

if (!Latch)
return std::nullopt;
Value *Inc = Phi->getIncomingValueForBlock(Latch);
RecurKind RK;
if (Phi->hasOneUse() ||
!RecurrenceDescriptor::isIntMinMaxRecurrenceKind(Kind))
Collaborator

nit: can slightly simplify by sinking to default and hoisting from m_OneUse below

Suggested change
- if (Phi->hasOneUse() ||
- !RecurrenceDescriptor::isIntMinMaxRecurrenceKind(Kind))
+ if (Phi->hasOneUse() || !Inc->hasOneUse())

Contributor Author

Done thanks, but I retained the isIntMinMaxRecurrenceKind for now, to catch missing cases below

return std::nullopt;

Value *A, *B;
if (match(Inc, m_OneUse(m_UMin(m_Value(A), m_Value(B)))))
RK = RecurKind::UMin;
else if (match(Inc, m_OneUse(m_UMax(m_Value(A), m_Value(B)))))
RK = RecurKind::UMax;
else if (match(Inc, m_OneUse(m_SMax(m_Value(A), m_Value(B)))))
RK = RecurKind::SMax;
else if (match(Inc, m_OneUse(m_SMin(m_Value(A), m_Value(B)))))
RK = RecurKind::SMin;
else
return std::nullopt;
Collaborator

replace by unreachable, having early-exited above if !isIntMinMaxRecurrenceKind(Kind), or remove that check above?

Contributor Author

Above checks the requested RecurrenceKind, here we check the incoming value, but the code was a bit confusing and didn't account properly for the requested Kind. Updated to remove setting RK, and only use the appropriate matcher for Kind, thanks


if (A == B || (A != Phi && B != Phi))
Collaborator

Nit: can be written as !((A == Phi) ^ (B == Phi)) or ((A == Phi) == (B == Phi)), but probably clearer as written.
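
The equivalence claimed in this nit can be checked exhaustively over a small universe of values. The sketch below is standalone C++ with hypothetical helper names (`rejectAsWritten`, `rejectViaEquality`, `predicatesAgree`); plain pointers stand in for the `Value *` operands.

```cpp
#include <array>

// Condition as written in the patch: reject unless exactly one of the two
// operands is the phi.
bool rejectAsWritten(const void *A, const void *B, const void *Phi) {
  return A == B || (A != Phi && B != Phi);
}

// Alternative spelling from the review: reject when both operands agree on
// whether they are the phi.
bool rejectViaEquality(const void *A, const void *B, const void *Phi) {
  return (A == Phi) == (B == Phi);
}

// Compare the two predicates for every (A, B) pair drawn from three distinct
// objects, one of which plays the role of the phi.
bool predicatesAgree() {
  int PhiObj = 0, X = 0, Y = 0;
  const std::array<const void *, 3> Values = {&PhiObj, &X, &Y};
  for (const void *A : Values)
    for (const void *B : Values)
      if (rejectAsWritten(A, B, &PhiObj) != rejectViaEquality(A, B, &PhiObj))
        return false;
  return true;
}
```

Three distinct objects are enough to cover every case: both operands are the phi, exactly one is, or neither is (equal or different).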

Contributor Author

I left it as-is for now, thanks

return std::nullopt;

SmallPtrSet<Instruction *, 4> CastInsts;
Value *RdxStart = Phi->getIncomingValueForBlock(TheLoop->getLoopPreheader());
RecurrenceDescriptor RD(RdxStart, nullptr, nullptr, RK, FastMathFlags(),
nullptr, Phi->getType(), false, false, CastInsts, -1U,
true);
Contributor

Could add /* arg name*/ for each nullptr and false?

Contributor Author

Done thanks

return {RD};
}

bool RecurrenceDescriptor::AddReductionVar(
PHINode *Phi, RecurKind Kind, Loop *TheLoop, FastMathFlags FuncFMF,
RecurrenceDescriptor &RedDes, DemandedBits *DB, AssumptionCache *AC,
@@ -225,6 +259,10 @@ bool RecurrenceDescriptor::AddReductionVar(
if (Phi->getParent() != TheLoop->getHeader())
return false;

if (auto RD = getMultiUseMinMax(Phi, Kind, TheLoop)) {
Collaborator

As a simple early-exit case of AddReductionVar(), should getMultiUseMinMax() also return bool and receive a reference of RecurrenceDescriptor as parameter, rather than return an optional? Name could also be more aligned.
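
The two API shapes being weighed here can be compared in a short sketch. `Descriptor`, `findDescriptor`, and `addDescriptor` are hypothetical stand-ins invented for illustration; they are not LLVM types or functions, and the `Kind < 0` failure condition is arbitrary.

```cpp
#include <optional>

// Hypothetical stand-in for a recurrence descriptor.
struct Descriptor {
  int Kind = 0;
  bool PhiMultiUse = false;
};

// Shape used in this revision: return the descriptor on success and
// std::nullopt on failure.
std::optional<Descriptor> findDescriptor(int Kind) {
  if (Kind < 0)
    return std::nullopt;
  return Descriptor{Kind, true};
}

// Shape suggested in the review: report success through the return value and
// fill an out-parameter, mirroring a bool-returning caller.
bool addDescriptor(int Kind, Descriptor &Out) {
  if (Kind < 0)
    return false;
  Out = Descriptor{Kind, true};
  return true;
}
```

With the second shape, a bool-returning caller can forward the result directly, for example `if (addDescriptor(Kind, Out)) return true;`, without unwrapping an optional first.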

Contributor Author

Yep updated the arguments, return value and naming, thanks

RedDes = *RD;
Collaborator

Should ++NumCmpSelectPatternInst; be added?

Contributor Author

I don't think so, this is just used locally to verify min/max patterns later on.

return true;
}
// Obtain the reduction start value from the value that comes from the loop
// preheader.
Value *RdxStart = Phi->getIncomingValueForBlock(TheLoop->getLoopPreheader());
4 changes: 3 additions & 1 deletion llvm/lib/Transforms/Utils/LoopUnroll.cpp
@@ -1252,7 +1252,9 @@ llvm::canParallelizeReductionWhenUnrolling(PHINode &Phi, Loop *L,
RecurrenceDescriptor RdxDesc;
if (!RecurrenceDescriptor::isReductionPHI(&Phi, L, RdxDesc,
/*DemandedBits=*/nullptr,
/*AC=*/nullptr, /*DT=*/nullptr, SE))
/*AC=*/nullptr, /*DT=*/nullptr,
SE) ||
RdxDesc.isPhiMultiUse())
Collaborator

Suggested change
- /*AC=*/nullptr, /*DT=*/nullptr,
- SE) ||
- RdxDesc.isPhiMultiUse())
+ /*AC=*/nullptr, /*DT=*/nullptr, SE))
+ return std::nullopt;
+ if (RdxDesc.isPhiMultiUse())
+ return std::nullopt;

clearer to check isPhiMultiUse() after isReductionPHI() returned true and initialized RdxDesc?
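
A minimal sketch of the control flow this suggestion asks for, using hypothetical toy types (`ToyDescriptor`, `isToyReduction`, `canParallelize`) rather than the LLVM ones. The combined `||` form is also safe because of short-circuit evaluation; the split form just makes it explicit that the descriptor is only read after the analysis has filled it in.

```cpp
#include <optional>

// Hypothetical descriptor: only meaningful after a successful analysis.
struct ToyDescriptor {
  bool PhiMultiUse = false;
  int Kind = 0;
};

// Hypothetical analysis: fills Desc and returns true for non-negative ids.
// Odd ids are treated as having extra in-loop uses of the phi.
bool isToyReduction(int PhiId, ToyDescriptor &Desc) {
  if (PhiId < 0)
    return false;
  Desc = ToyDescriptor{PhiId % 2 == 1, PhiId};
  return true;
}

std::optional<int> canParallelize(int PhiId) {
  ToyDescriptor Desc;
  // First bail out if the phi is not a reduction at all.
  if (!isToyReduction(PhiId, Desc))
    return std::nullopt;
  // Desc is known to be initialized here, so querying it is clearly valid.
  if (Desc.PhiMultiUse)
    return std::nullopt;
  return Desc.Kind;
}
```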

Contributor Author

Done thanks

return std::nullopt;
RecurKind RK = RdxDesc.getRecurrenceKind();
// Skip unsupported reductions.
5 changes: 5 additions & 0 deletions llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -878,6 +878,11 @@ bool LoopVectorizationLegality::canVectorizeInstr(Instruction &I) {
Requirements->addExactFPMathInst(RedDes.getExactFPMathInst());
AllowedExit.insert(RedDes.getLoopExitInstr());
Reductions[Phi] = RedDes;
assert((!RedDes.isPhiMultiUse() ||
RecurrenceDescriptor::isMinMaxRecurrenceKind(
RedDes.getRecurrenceKind())) &&
"Only min/max recurrences are allowed to have multiple uses "
"currently");
return true;
}

37 changes: 31 additions & 6 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -6533,6 +6533,11 @@ void LoopVectorizationCostModel::collectInLoopReductions() {
PHINode *Phi = Reduction.first;
const RecurrenceDescriptor &RdxDesc = Reduction.second;

// Multi-use reductions (e.g., used in FindLastIV patterns) are handled
// separately and should not be considered for in-loop reductions.
if (RdxDesc.isPhiMultiUse())
continue;

// We don't collect reductions that are type promoted (yet).
if (RdxDesc.getRecurrenceType() != Phi->getType())
continue;
@@ -7184,6 +7189,9 @@ static void fixReductionScalarResumeWhenVectorizingEpilog(
Value *StartV = getStartValueFromReductionResult(EpiRedResult);
Value *SentinelV = EpiRedResult->getOperand(2)->getLiveInIRValue();
using namespace llvm::PatternMatch;
MainResumeValue = cast<VPInstruction>(EpiRedHeaderPhi->getStartValue())
->getOperand(0)
->getUnderlyingValue();
Collaborator

Good to understand why this addition is needed now(?), and maybe provide an overall explanation of setting MainResumeValue.

Contributor Author

This was left-over and is no longer needed, removed, thanks

Value *Cmp, *OrigResumeV, *CmpOp;
[[maybe_unused]] bool IsExpectedPattern =
match(MainResumeValue,
@@ -7194,7 +7202,11 @@ static void fixReductionScalarResumeWhenVectorizingEpilog(
((CmpOp == StartV && isGuaranteedNotToBeUndefOrPoison(CmpOp))));
assert(IsExpectedPattern && "Unexpected reduction resume pattern");
MainResumeValue = OrigResumeV;
} else if (auto *VPI =
dyn_cast<VPInstruction>(EpiRedHeaderPhi->getStartValue())) {
MainResumeValue = VPI->getOperand(0)->getUnderlyingValue();
Collaborator

Why/Is this needed given that MainResumeValue is already first set to that underlying value under that condition at the earlier if above?

Contributor Author

This was left-over and is no longer needed, removed, thanks

}

PHINode *MainResumePhi = cast<PHINode>(MainResumeValue);

// When fixing reductions in the epilogue loop we should already have
@@ -7906,6 +7918,9 @@ void VPRecipeBuilder::collectScaledReductions(VFRange &Range) {
SmallVector<std::pair<PartialReductionChain, unsigned>>
PartialReductionChains;
for (const auto &[Phi, RdxDesc] : Legal->getReductionVars()) {
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(
RdxDesc.getRecurrenceKind()))
continue;
Collaborator

Can this change be pre-applied independently?

Contributor

Would the min/max recurrences originally be collected for partial reduction?

Contributor Author

The issue is that for some min/max recurrences we now don't have an exit instruction (the min/max for argmax currently must be used inside the loop only). I updated this to check getLoopExitInstr for non-null explicitly.

getScaledReductions(Phi, RdxDesc.getLoopExitInstr(), Range,
PartialReductionChains);
}
@@ -8070,9 +8085,6 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R,
return Recipe;

VPHeaderPHIRecipe *PhiRecipe = nullptr;
assert((Legal->isReductionVariable(Phi) ||
Legal->isFixedOrderRecurrence(Phi)) &&
"can only widen reductions and fixed-order recurrences here");
VPValue *StartV = Operands[0];
if (Legal->isReductionVariable(Phi)) {
const RecurrenceDescriptor &RdxDesc = Legal->getRecurrenceDescriptor(Phi);
@@ -8084,13 +8096,19 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R,
getScalingForReduction(RdxDesc.getLoopExitInstr()).value_or(1);
PhiRecipe = new VPReductionPHIRecipe(
Phi, RdxDesc.getRecurrenceKind(), *StartV, CM.isInLoopReduction(Phi),
CM.useOrderedReductions(RdxDesc), ScaleFactor);
} else {
CM.useOrderedReductions(RdxDesc), ScaleFactor,
RdxDesc.isPhiMultiUse());
} else if (Legal->isFixedOrderRecurrence(Phi)) {
Collaborator

This previously last else case assumed Phi was a FOR, i.e., could have asserted?

Contributor Author

This has now been removed, not needed in the latest version, also restored the assert above

Collaborator

If the `if` above is added, PhiRecipe may remain null and be dereferenced below?

Contributor Author

Ah yes, removed the `if` again, as this was still left over.

// TODO: Currently fixed-order recurrences are modeled as chains of
// first-order recurrences. If there are no users of the intermediate
// recurrences in the chain, the fixed order recurrence should be modeled
// directly, enabling more efficient codegen.
PhiRecipe = new VPFirstOrderRecurrencePHIRecipe(Phi, *StartV);
} else {
// Failed to identify phi as reduction or fixed-order recurrence. Keep the
// original VPWidenPHIRecipe for now, to be legalized later if possible.
Collaborator

Reasonable to assert this new last else case involves a max/min reduction w/ multiple uses?

Contributor Author

Not needed with the latest version, removed, thanks

setRecipe(Phi, R);
return nullptr;
Contributor

I think we don't need this, right?

Contributor Author

Not with the latest version, removed thanks

}
// Add backedge value.
PhiRecipe->addOperand(Operands[1]);
@@ -8365,8 +8383,11 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(

VPRecipeBase *Recipe =
RecipeBuilder.tryToCreateWidenRecipe(SingleDef, Range);
if (!Recipe)
if (!Recipe) {
if (isa<VPPhi>(SingleDef))
continue;
Collaborator

Worth commenting how VPPhi recipes that are not widened are handled (later) rather than replicated (now).

Contributor Author

also not needed in the latest version, removed thanks

Recipe = RecipeBuilder.handleReplication(Instr, R.operands(), Range);
}

RecipeBuilder.setRecipe(Instr, Recipe);
if (isa<VPWidenIntOrFpInductionRecipe>(Recipe) && isa<TruncInst>(Instr)) {
@@ -8428,6 +8449,10 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
// Adjust the recipes for any inloop reductions.
adjustRecipesForReductions(Plan, RecipeBuilder, Range.Start);

// Try to legalize reductions with multiple in-loop uses.
if (!VPlanTransforms::runPass(VPlanTransforms::legalizeMultiUseReductions,
Collaborator

Suggested change
- // Try to legalize reductions with multiple in-loop uses.
- if (!VPlanTransforms::runPass(VPlanTransforms::legalizeMultiUseReductions,
+ // Apply mandatory transformation to handle reductions with multiple in-loop
+ // uses if possible, bail out otherwise.
+ if (!VPlanTransforms::runPass(VPlanTransforms::handleMultiUseReductions,

Contributor Author

Done thanks

*Plan))
return nullptr;
// Apply mandatory transformation to handle FP maxnum/minnum reduction with
// NaNs if possible, bail out otherwise.
if (!VPlanTransforms::runPass(VPlanTransforms::handleMaxMinNumReductions,
23 changes: 18 additions & 5 deletions llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1994,7 +1994,8 @@ class LLVM_ABI_FOR_TEST VPHeaderPHIRecipe : public VPSingleDefRecipe,
~VPHeaderPHIRecipe() override = default;

/// Method to support type inquiry through isa, cast, and dyn_cast.
static inline bool classof(const VPRecipeBase *B) {
static inline bool classof(const VPUser *U) {
auto *B = cast<VPRecipeBase>(U);
return B->getVPDefID() >= VPDef::VPFirstHeaderPHISC &&
B->getVPDefID() <= VPDef::VPLastHeaderPHISC;
}
@@ -2003,6 +2004,10 @@ class LLVM_ABI_FOR_TEST VPHeaderPHIRecipe : public VPSingleDefRecipe,
return B && B->getVPDefID() >= VPRecipeBase::VPFirstHeaderPHISC &&
B->getVPDefID() <= VPRecipeBase::VPLastHeaderPHISC;
}
static inline bool classof(const VPSingleDefRecipe *B) {
return B->getVPDefID() >= VPDef::VPFirstHeaderPHISC &&
B->getVPDefID() <= VPDef::VPLastHeaderPHISC;
Collaborator

Can this interval condition checking a recipe's VPDefID be implemented once and reused thrice?
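
One way to write the interval check once and reuse it is to have the more specific `classof` overloads forward to the base-pointer overload. The sketch below uses a toy hierarchy (`RecipeBase`, `SingleDefRecipe`, `HeaderPhiRecipe`) and made-up ID values; it only mirrors the shape of the LLVM classes and is not the actual VPlan code.

```cpp
// Toy recipe hierarchy with an ID range reserved for header phis.
struct RecipeBase {
  enum ID {
    WidenID,
    CanonicalIVID,
    ReductionPhiID,
    StoreID,
    FirstHeaderPhiID = CanonicalIVID,
    LastHeaderPhiID = ReductionPhiID
  };
  explicit RecipeBase(ID Id) : Id(Id) {}
  ID getID() const { return Id; }

private:
  ID Id;
};

struct SingleDefRecipe : RecipeBase {
  explicit SingleDefRecipe(ID Id) : RecipeBase(Id) {}
};

struct HeaderPhiRecipe : SingleDefRecipe {
  explicit HeaderPhiRecipe(ID Id) : SingleDefRecipe(Id) {}

  // The interval check lives in exactly one place.
  static bool classof(const RecipeBase *R) {
    return R->getID() >= FirstHeaderPhiID && R->getID() <= LastHeaderPhiID;
  }

  // Other overloads forward to the base-pointer version instead of
  // repeating the range comparison.
  static bool classof(const SingleDefRecipe *R) {
    return classof(static_cast<const RecipeBase *>(R));
  }
};
```

The explicit `static_cast` selects the `RecipeBase` overload, so the forwarding call cannot recurse into itself.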

Contributor Author

Updated to use classof(VPRecipeBase), thanks

}

/// Generate the phi nodes.
void execute(VPTransformState &State) override = 0;
@@ -2067,7 +2072,7 @@ class VPWidenInductionRecipe : public VPHeaderPHIRecipe {
return R && classof(R);
}

static inline bool classof(const VPHeaderPHIRecipe *R) {
static inline bool classof(const VPSingleDefRecipe *R) {
return classof(static_cast<const VPRecipeBase *>(R));
}

@@ -2344,6 +2349,10 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
/// The phi is part of an ordered reduction. Requires IsInLoop to be true.
bool IsOrdered;

/// The phi is part of a multi-use reduction (e.g., used in FindLastIV
/// patterns).
Collaborator

Would be good to note somewhere (TODO? Negative tests?) that a min/max reduction could alternatively feed a FindLastIV reduction by having the compare be used twice (by two selects) rather than the phi be used twice (by a select and min/max intrinsic).

Contributor Author

I think that should be covered by negative test test_vectorize_select_umin_via_select_idx in llvm/test/Transforms/LoopVectorize/select-umin-last-index.ll and similar tests in other files.

Added a TODO, thanks

bool IsPhiMultiUse;
Collaborator

Could this property be calculated by checking how many in-loop users the header phi recipe has? Is caching the answer a matter of performance (only)?

Contributor Author

Yes, although it would need implementing matching Cmp/Select.


/// When expanding the reduction PHI, the plan's VF element count is divided
/// by this factor to form the reduction phi's VF.
unsigned VFScaleFactor = 1;
@@ -2352,9 +2361,10 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
/// Create a new VPReductionPHIRecipe for the reduction \p Phi.
VPReductionPHIRecipe(PHINode *Phi, RecurKind Kind, VPValue &Start,
bool IsInLoop = false, bool IsOrdered = false,
unsigned VFScaleFactor = 1)
unsigned VFScaleFactor = 1, bool IsPhiMultiUse = false)
: VPHeaderPHIRecipe(VPDef::VPReductionPHISC, Phi, &Start), Kind(Kind),
IsInLoop(IsInLoop), IsOrdered(IsOrdered), VFScaleFactor(VFScaleFactor) {
IsInLoop(IsInLoop), IsOrdered(IsOrdered), IsPhiMultiUse(IsPhiMultiUse),
VFScaleFactor(VFScaleFactor) {
assert((!IsOrdered || IsInLoop) && "IsOrdered requires IsInLoop");
}

@@ -2363,7 +2373,7 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
VPReductionPHIRecipe *clone() override {
auto *R = new VPReductionPHIRecipe(
dyn_cast_or_null<PHINode>(getUnderlyingValue()), getRecurrenceKind(),
*getOperand(0), IsInLoop, IsOrdered, VFScaleFactor);
*getOperand(0), IsInLoop, IsOrdered, VFScaleFactor, IsPhiMultiUse);
R->addOperand(getBackedgeValue());
return R;
}
@@ -2396,6 +2406,9 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
/// Returns true, if the phi is part of an in-loop reduction.
bool isInLoop() const { return IsInLoop; }

/// Returns true, if the phi is part of a multi-use reduction.
bool isPhiMultiUse() const { return IsPhiMultiUse; }

/// Returns true if the recipe only uses the first lane of operand \p Op.
bool onlyFirstLaneUsed(const VPValue *Op) const override {
assert(is_contained(operands(), Op) &&
98 changes: 98 additions & 0 deletions llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -948,3 +948,101 @@ bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) {
MiddleTerm->setOperand(0, NewCond);
return true;
}

bool VPlanTransforms::legalizeMultiUseReductions(VPlan &Plan) {
for (auto &PhiR : make_early_inc_range(
Plan.getVectorLoopRegion()->getEntryBasicBlock()->phis())) {
auto *MinMaxPhiR = dyn_cast<VPReductionPHIRecipe>(&PhiR);
if (!MinMaxPhiR)
continue;

RecurKind RdxKind = MinMaxPhiR->getRecurrenceKind();
// TODO: check for multi-uses in VPlan directly.
if (!RecurrenceDescriptor::isIntMinMaxRecurrenceKind(RdxKind) ||
!MinMaxPhiR->isPhiMultiUse())
Collaborator

Would checking the latter suffice, if it implies the former?

Contributor Author

Yes, that should be sufficient. Updated to check for the latter and assert the former.

continue;

// One user of MinMaxPhiR is MinMaxOp, the other users must be a compare
Collaborator

Suggested change
- // One user of MinMaxPhiR is MinMaxOp, the other users must be a compare
+ // One user of MinMaxPhiR is MinMaxOp, the other user must be a compare

Contributor Author

fixed, thanks

// that's part of a FindLastIV chain.
auto *MinMaxOp =
dyn_cast<VPRecipeWithIRFlags>(MinMaxPhiR->getBackedgeValue());
if (!MinMaxOp || MinMaxOp->getNumUsers() != 2)
Collaborator

MultiUse could more accurately be called DoubleUse.

Contributor Author

Yep, although I am not sure we should tighten this down early on. A single min/max could serve multiple selects, which could be supported here eventually.

Collaborator

Multiple selects could conceptually be fed by a single min/max, but are there conceivable use cases? The core argmin/argmax pattern seems prevalent enough, and involved enough to recognize and handle, that it arguably deserves specific attention.

Contributor Author

Yep. Now that the multi-use terminology has been replaced, would you prefer to make the initial check more specific?

Collaborator

Better clarify what patterns are supported as accurately as possible at the outset.

Somewhat confusing - isMinMaxReductionWithLoopUsersOutsideReductionChain() checks that Phi has more than one user (in the loop) and Inc has one - here MinMaxOp (aka Inc?) has two? Should MinMaxOp be checked to have 2 operands but (asserted to have) 1 user?

Contributor Author

Thanks, I added a comment after the initial early continue in the loop.

I added the assert (and more asserts later below checking the expected users)

return false;
Collaborator

Suggested change
return false;
return false;

Contributor Author

Added, thanks.

auto MinMaxUsers = to_vector(MinMaxPhiR->users());
auto *Cmp = dyn_cast<VPRecipeWithIRFlags>(
MinMaxUsers[0] == MinMaxOp ? MinMaxUsers[1] : MinMaxUsers[0]);
Contributor

Should check the number of phi's users, like #141467 did:

  // TODO: support min/max with 2-D indexes.
  if (!Phi->hasNUses(2))
    return false;

Collaborator

Could this use some form of findUser[W/ or W/O Opcode]() as in findComputeReductionResult()?

Contributor Author

Probably yes, but it gets a bit more complicated, as the compare could be different recipes (replicate/widen).

Collaborator

Could find_if(users, != MinMaxOp) be used, as in findReductionUser(), for consistency?

Contributor Author

Updated to use the new findUser with a pattern matcher.
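
For readers following the thread, here is a minimal standalone sketch of the "find the other user" idea discussed above, written against plain pointers rather than VPlan types. The `User` struct and the `findOtherUser` helper are hypothetical names for illustration only; this is not the `findUser` helper the patch ends up using.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a VPlan user; only pointer identity matters here.
struct User {
  int Id;
};

// Returns the one user in Users that is not Known. Returns nullptr unless
// Users holds exactly two entries, one of which is Known and one of which
// is not.
inline User *findOtherUser(const std::vector<User *> &Users,
                           const User *Known) {
  if (Users.size() != 2)
    return nullptr;
  if (std::find(Users.begin(), Users.end(), Known) == Users.end())
    return nullptr;
  auto It = std::find_if(Users.begin(), Users.end(),
                         [Known](const User *U) { return U != Known; });
  return It == Users.end() ? nullptr : *It;
}
```

Checking the user count inside the helper keeps the "exactly two users" assumption next to the lookup, which is the point raised earlier in this thread.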

VPValue *CmpOpA;
VPValue *CmpOpB;
Contributor

compress into one line, and choose the better name.

if (!Cmp || Cmp->getNumUsers() != 1 ||
!match(Cmp, m_Binary<Instruction::ICmp>(m_VPValue(CmpOpA),
m_VPValue(CmpOpB))))
return false;

// Normalize the predicate so MinMaxPhiR is on the right side.
Collaborator

So that argmax corresponds to GE/GT and argmin corresponds to LE/LT (FindLastIV/FindFirstIV for both, respectively).

Contributor Author

Yep
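
To illustrate the normalization being discussed: swapping the operands of a compare requires swapping the predicate so the result is unchanged, which is what `CmpInst::getSwappedPredicate` does in LLVM. The sketch below models that with a small standalone enum; `Pred`, `swappedPred`, and `evalPred` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>

// Hypothetical, simplified predicate set (signed integers only).
enum class Pred { LT, LE, GT, GE, EQ, NE };

// Predicate to use after exchanging the two compare operands.
inline Pred swappedPred(Pred P) {
  switch (P) {
  case Pred::LT: return Pred::GT;
  case Pred::LE: return Pred::GE;
  case Pred::GT: return Pred::LT;
  case Pred::GE: return Pred::LE;
  case Pred::EQ: return Pred::EQ;
  case Pred::NE: return Pred::NE;
  }
  return P;
}

// Evaluates "L <P> R".
inline bool evalPred(Pred P, int L, int R) {
  switch (P) {
  case Pred::LT: return L < R;
  case Pred::LE: return L <= R;
  case Pred::GT: return L > R;
  case Pred::GE: return L >= R;
  case Pred::EQ: return L == R;
  case Pred::NE: return L != R;
  }
  return false;
}
```

With the phi always on the right, `x >= phi` or `x > phi` reads as a max-style update and `x <= phi` or `x < phi` as a min-style update, matching the correspondence described in the comment above.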

CmpInst::Predicate Pred = Cmp->getPredicate();
if (CmpOpA == MinMaxPhiR)
Pred = CmpInst::getSwappedPredicate(Pred);
Collaborator

Bail out now (regardless of RdxKind) if Pred is strict as only FindLastIV is supported?

Contributor Author

Yep done, thanks!

Collaborator

Is it important to normalize so MinMaxPhiR is on the right - Pred appears to be unused?

Contributor Author

Not for the current version, but will be needed later to support more cases. I removed it for now, thanks
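
As background for the strict versus non-strict distinction in this thread, the following standalone sketch shows why the two cases need different index reductions: a strict compare keeps the first index of the maximum, while a non-strict compare keeps the last. The function names and the `-1` sentinel are assumptions for illustration and do not appear in the patch.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Strict compare (>): the index only moves when a strictly larger value
// appears, so ties keep the FIRST index of the maximum (assuming some
// element exceeds the INT_MIN start value).
inline long argMaxFirst(const std::vector<int> &A) {
  int Max = INT_MIN;
  long Idx = -1; // hypothetical "not found" sentinel
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] > Max)
      Idx = static_cast<long>(I);
    Max = A[I] > Max ? A[I] : Max;
  }
  return Idx;
}

// Non-strict compare (>=): ties overwrite the index, so the result is the
// LAST index of the maximum.
inline long argMaxLast(const std::vector<int> &A) {
  int Max = INT_MIN;
  long Idx = -1; // hypothetical "not found" sentinel
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] >= Max)
      Idx = static_cast<long>(I);
    Max = A[I] > Max ? A[I] : Max;
  }
  return Idx;
}
```

For the input `{3, 7, 7, 1, 7, 2}`, `argMaxFirst` returns 1 and `argMaxLast` returns 4, which is why a transform that only supports FindLastIV has to bail out on strict predicates.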


// Determine if the predicate is not strict.
bool IsNonStrictPred = ICmpInst::isLE(Pred) || ICmpInst::isGE(Pred);
// Account for a mis-match between RdxKind and the predicate.
Collaborator

Suggested change
- // Account for a mis-match between RdxKind and the predicate.
+ // Account for a mismatch between RdxKind and the predicate.

Contributor Author

This whole switch is now gone, thanks

switch (RdxKind) {
case RecurKind::UMin:
case RecurKind::SMin:
IsNonStrictPred |= ICmpInst::isGT(Pred);
Collaborator

Bail out now if Pred isGE - matching (Last) argmax, mismatching RdxKind UMin/SMin?

Contributor Author

Covered by the earlier bail-out; removed the switch, thanks.

break;
case RecurKind::UMax:
case RecurKind::SMax:
IsNonStrictPred |= ICmpInst::isLT(Pred);
Collaborator

Bail out now if Pred isLE - matching (Last) argmin, mismatching RdxKind UMax/SMax?

Contributor Author

Covered by the earlier bail-out; removed the switch, thanks.

break;
default:
llvm_unreachable("unsupported kind");
}

// TODO: Strict predicates need to find the first IV value for which the
// predicate holds, not the last.
if (Pred == CmpInst::ICMP_NE || !IsNonStrictPred)
Collaborator

Can bail out if Pred is NE/EQ earlier.

Contributor Author

Bailed out earlier, thanks

return false;

Collaborator

Worth asserting that MinMaxOp aligns with Pred, i.e., both compute max or both compute min?

Contributor Author

yep this is now checked below, thanks

// Cmp must be used by the select of a FindLastIV chain.
VPValue *Sel = dyn_cast<VPSingleDefRecipe>(*Cmp->user_begin());
Contributor

Suggested change
- VPValue *Sel = dyn_cast<VPSingleDefRecipe>(*Cmp->user_begin());
+ auto *Sel = dyn_cast<VPSingleDefRecipe>(*Cmp->user_begin());

VPValue *IVOp, *FindIV;
if (!Sel ||
!match(Sel,
m_Select(m_Specific(Cmp), m_VPValue(IVOp), m_VPValue(FindIV))) ||
Sel->getNumUsers() != 2 || !isa<VPWidenIntOrFpInductionRecipe>(IVOp))
Collaborator

Could IVOp be scalar/derived?

Contributor Author

Not yet at this point, as scalar-steps are introduced as an optimization later.

return false;
Collaborator

Suggested change
return false;
return false;

Contributor Author

Added, thanks.

auto *FindIVPhiR = dyn_cast<VPReductionPHIRecipe>(FindIV);
if (!FindIVPhiR || !RecurrenceDescriptor::isFindLastIVRecurrenceKind(
Collaborator

If FindIVPhiR has a FindLastIVRecurrenceKind, does that imply IVOp must be a VPWidenIntOrFpInductionRecipe?

Contributor Author

yes, moved check from above to assert below, thanks

FindIVPhiR->getRecurrenceKind()))
return false;

assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() &&
"cannot handle inloop/ordered reductions yet");

// The reduction using MinMaxPhiR needs adjusting to compute the correct
// result:
// 1. We need to find the last IV for which the condition based on the
// min/max recurrence is true,
// 2. Compare the partial min/max reduction result to its final value and,
// 3. Select the lanes of the partial FindLastIV reductions which
// correspond to the lanes matching the min/max reduction result.
Collaborator

May be helpful to show snippets of VPInstructions before and after these steps.

Contributor Author

Done, thanks!
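
A minimal standalone sketch of the three steps listed in the code comment above, using plain arrays in place of VPlan recipes. The lane count `VF`, the `-1` sentinel, and both function names are assumptions made for illustration; the actual transform emits VPInstructions and uses the reduction's own start values and sentinel.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor

// Scalar reference: index of the last maximum (non-strict >= compare).
inline long scalarArgMaxLast(const std::vector<int> &A) {
  int Max = INT_MIN;
  long Idx = -1;
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] >= Max)
      Idx = static_cast<long>(I);
    Max = std::max(Max, A[I]);
  }
  return Idx;
}

// Lane-wise version: lane L handles elements L, L + VF, L + 2 * VF, ...
inline long lanewiseArgMaxLast(const std::vector<int> &A) {
  std::array<int, VF> LaneMax;
  std::array<long, VF> LaneIdx;
  LaneMax.fill(INT_MIN);
  LaneIdx.fill(-1);

  // Step 1: per lane, track the partial max and the last index for which
  // the compare against that lane's running max holds.
  for (std::size_t I = 0; I < A.size(); ++I) {
    std::size_t L = I % VF;
    if (A[I] >= LaneMax[L])
      LaneIdx[L] = static_cast<long>(I);
    LaneMax[L] = std::max(LaneMax[L], A[I]);
  }

  // Step 2: reduce the partial maxima to the final max, then compare each
  // lane's partial max against it.
  int FinalMax = *std::max_element(LaneMax.begin(), LaneMax.end());

  // Step 3: keep only the indices from lanes whose partial max equals the
  // final max, and take the largest of those (the "last IV").
  long Result = -1;
  for (std::size_t L = 0; L < VF; ++L) {
    long Candidate = LaneMax[L] == FinalMax ? LaneIdx[L] : -1;
    Result = std::max(Result, Candidate);
  }
  return Result;
}
```

Without step 3, a lane holding a smaller partial max could contribute a larger index and win the final index reduction, which is the mismatch the added compare and select guard against.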

VPInstruction *FindIVResult = dyn_cast<VPInstruction>(
*(Sel->user_begin() + (*Sel->user_begin() == FindIVPhiR ? 1 : 0)));
Collaborator

Is there a clearer and consistent way to "get the other user" among two users of a VPValue, as done above when setting Cmp to the user of MinMaxPhiR other than MinMaxOp (using to_vector etc.), and here when setting FindIVResult to the user of Sel other than FindIVPhiR?

Contributor Author

I updated the code to use the common findComputeReductionResult

if (!FindIVResult)
return false;
VPBuilder B(FindIVResult);
VPInstruction *MinMaxResult = B.createNaryOp(
VPInstruction::ComputeReductionResult,
{MinMaxPhiR, MinMaxPhiR->getBackedgeValue()}, VPIRFlags(), {});
MinMaxPhiR->getBackedgeValue()->replaceUsesWithIf(
MinMaxResult, [](VPUser &U, unsigned) { return isa<VPPhi>(&U); });
auto *FinalMinMaxCmp = B.createICmp(
CmpInst::ICMP_EQ, MinMaxResult->getOperand(1), MinMaxResult);
Collaborator

Would be good to explicitly Broadcast MinMaxResult before feeding it to compare with its operand(1)?

Contributor Author

At this stage, the broadcasts are still implicit and will be introduced explicitly at a later stage.

auto *FinalIVSelect =
B.createSelect(FinalMinMaxCmp, FindIVResult->getOperand(3),
FindIVResult->getOperand(2));
Collaborator

Worth assigning variables with meaningful names to the above operand(1) and these operand(2),(3), especially in the absence of documenting these VPInstruction opcodes.

Contributor Author

Done, thanks

FindIVResult->setOperand(3, FinalIVSelect);
}
return true;
}