Merged
Changes from 2 commits
20 changes: 16 additions & 4 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -978,7 +978,9 @@ class LoopVectorizationCostModel {
InterleavedAccessInfo &IAI)
: ScalarEpilogueStatus(SEL), TheLoop(L), PSE(PSE), LI(LI), Legal(Legal),
TTI(TTI), TLI(TLI), DB(DB), AC(AC), ORE(ORE), TheFunction(F),
- Hints(Hints), InterleaveInfo(IAI), CostKind(TTI::TCK_RecipThroughput) {}
+ Hints(Hints), InterleaveInfo(IAI) {
+ CostKind = F->hasMinSize() ? TTI::TCK_CodeSize : TTI::TCK_RecipThroughput;
+ }

/// \return An upper bound for the vectorization factors (both fixed and
/// scalable). If the factors are 0, vectorization and interleaving should be
@@ -4277,6 +4279,13 @@ bool LoopVectorizationPlanner::isMoreProfitable(
EstimatedWidthB *= *VScale;
}

+ // When optimizing for size choose whichever is smallest, which will be the
+ // one with the smallest cost for the whole loop. On a tie pick the larger
+ // vector width, on the assumption that throughput will be greater.
+ if (CM.CostKind == TTI::TCK_CodeSize)
+ return CostA < CostB ||
+ (CostA == CostB && EstimatedWidthA > EstimatedWidthB);
+
// Assume vscale may be larger than 1 (or the value being tuned for),
// so that scalable vectorization is slightly favorable over fixed-width
// vectorization.
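
The size-oriented comparison added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the LLVM implementation: `Candidate` and `isSmallerForCodeSize` are hypothetical names introduced here, and the costs are plain integers rather than `InstructionCost` values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one vectorization candidate; not an LLVM type.
struct Candidate {
  uint64_t LoopCost;       // cost of the whole loop at this candidate's VF
  unsigned EstimatedWidth; // estimated number of lanes processed per iteration
};

// Returns true if A should be preferred over B when optimizing for size:
// the smaller whole-loop cost wins, and on a tie the wider candidate wins.
bool isSmallerForCodeSize(const Candidate &A, const Candidate &B) {
  return A.LoopCost < B.LoopCost ||
         (A.LoopCost == B.LoopCost && A.EstimatedWidth > B.EstimatedWidth);
}
```

Under these assumptions, a candidate with cost 8 beats one with cost 10 regardless of width, and between two candidates of equal cost the wider one is chosen.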
@@ -5506,7 +5515,8 @@ InstructionCost LoopVectorizationCostModel::computePredInstDiscount(
}

// Scale the total scalar cost by block probability.
- ScalarCost /= getReciprocalPredBlockProb();
+ if (CostKind != TTI::TCK_CodeSize)
Contributor
It looks like the checks here and below may not be covered by the existing tests. It would be good to add test coverage for some of them, if possible.

Collaborator Author
I haven't been able to come up with a test that specifically exercises this: in every test I've tried, the cost is large enough that incorrectly halving it doesn't matter, because it's still larger than the scalar cost. If getReciprocalPredBlockProb used the actual block probability, instead of assuming it's 0.5 as it currently does, then I could probably do it by setting the block probability close to zero.

+ ScalarCost /= getReciprocalPredBlockProb();

// Compute the discount. A non-negative discount means the vector version
// of the instruction costs more, and scalarizing would be beneficial.
@@ -5558,7 +5568,8 @@ InstructionCost LoopVectorizationCostModel::expectedCost(ElementCount VF) {
// the predicated block, if it is an if-else block. Thus, scale the block's
// cost by the probability of executing it. blockNeedsPredication from
// Legal is used so as to not include all blocks in tail folded loops.
- if (VF.isScalar() && Legal->blockNeedsPredication(BB))
+ if (VF.isScalar() && Legal->blockNeedsPredication(BB) &&
+ CostKind != TTI::TCK_CodeSize)
BlockCost /= getReciprocalPredBlockProb();

Cost += BlockCost;
@@ -5637,7 +5648,8 @@ LoopVectorizationCostModel::getMemInstScalarizationCost(Instruction *I,
// conditional branches, but may not be executed for each vector lane. Scale
// the cost by the probability of executing the predicated block.
if (isPredicatedInst(I)) {
- Cost /= getReciprocalPredBlockProb();
+ if (CostKind != TTI::TCK_CodeSize)
+ Cost /= getReciprocalPredBlockProb();

// Add the cost of an i1 extract and a branch
auto *VecI1Ty =
2 changes: 1 addition & 1 deletion llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -793,7 +793,7 @@ InstructionCost VPRegionBlock::cost(ElementCount VF, VPCostContext &Ctx) {

// For the scalar case, we may not always execute the original predicated
// block, Thus, scale the block's cost by the probability of executing it.
- if (VF.isScalar())
+ if (VF.isScalar() && Ctx.CostKind != TTI::TCK_CodeSize)
return ThenCost / getReciprocalPredBlockProb();

return ThenCost;
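
The same guard recurs at each of the scaling sites in this change: a predicated block's cost is scaled by its execution probability only when the cost kind is not code size, since how often a block runs has no bearing on how many bytes it occupies. Below is a minimal sketch of that pattern under stated assumptions: `CostKind`, `ReciprocalPredBlockProb` and `scalarBlockCost` are hypothetical stand-ins rather than LLVM's definitions, and the reciprocal of 2 reflects the 0.5 probability mentioned in the review thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the two cost kinds involved; not LLVM's enum.
enum class CostKind { RecipThroughput, CodeSize };

// Assumption from the review discussion: a predicated block is treated as
// executing with probability 0.5, so the reciprocal probability is 2.
constexpr unsigned ReciprocalPredBlockProb = 2;

// Cost attributed to a scalar block. The probability scaling applies only to
// throughput-style costs; for code size the block contributes its full cost.
unsigned scalarBlockCost(unsigned BlockCost, bool NeedsPredication,
                         CostKind Kind) {
  if (NeedsPredication && Kind != CostKind::CodeSize)
    return BlockCost / ReciprocalPredBlockProb;
  return BlockCost;
}
```

With these stand-ins, a predicated block of cost 10 counts as 5 for throughput but as 10 for code size. This also hints at why the reply above found the guard hard to test: if the vector cost exceeds the scalar cost either way, the comparison has the same outcome whether or not the scalar cost is halved.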