Commit 8f44611

david-arm authored and fhahn committed
[LV] Stop using the legacy cost model for udiv + friends (llvm#152707)
In VPWidenRecipe::computeCost for the instructions udiv, sdiv, urem and srem we fall back on the legacy cost unnecessarily. At this point we know that the vplan must be functionally correct, i.e. if the divide/remainder is not safe to speculatively execute then we must have either:

1. Scalarised the operation, in which case we wouldn't be using a VPWidenRecipe, or
2. We've inserted a select for the second operand to ensure we don't fault through divide-by-zero.

For 2) it's necessary to add the select operation to VPInstruction::computeCost so that we mirror the cost of the legacy cost model. The only problem with this is that we also generate selects in vplan for predicated loops with reductions, which *aren't* accounted for in the legacy cost model. In order to prevent asserts firing I've also added the selects to precomputeCosts to ensure the legacy costs match the vplan costs for reductions.

(cherry picked from commit d606eae)
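The safe-divisor idea in case 2 can be illustrated outside LLVM with a lane-by-lane emulation. This is only a sketch: `maskedUDiv` and its parameters are hypothetical names invented for illustration, and zero is arbitrarily chosen to stand in for the unused result of an inactive lane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LLVM code): emulates a predicated vector udiv
// one lane at a time. Inactive lanes divide by 1 instead of their original
// divisor, mirroring the select on the second operand described above.
std::vector<std::uint32_t> maskedUDiv(const std::vector<std::uint32_t> &Num,
                                      const std::vector<std::uint32_t> &Den,
                                      const std::vector<bool> &Mask) {
  assert(Num.size() == Den.size() && Num.size() == Mask.size());
  std::vector<std::uint32_t> Out(Num.size(), 0u);
  for (std::size_t I = 0; I < Num.size(); ++I) {
    // select(Mask, Den, 1): the "safe divisor".
    std::uint32_t SafeDen = Mask[I] ? Den[I] : 1u;
    // The division runs on every lane. It cannot fault on an inactive lane,
    // because that lane's divisor is now 1.
    std::uint32_t Quotient = Num[I] / SafeDen;
    // Inactive-lane results are never used; 0 is an arbitrary placeholder.
    Out[I] = Mask[I] ? Quotient : 0u;
  }
  return Out;
}
```

Calling `maskedUDiv({8, 9, 10, 12}, {2, 0, 5, 0}, {true, false, true, false})` divides every lane without faulting, even though two inactive lanes carry a zero divisor. The extra select is real work, which is why the commit costs it separately from the divide.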
1 parent add4b91 commit 8f44611

2 files changed, +36 -2 lines changed

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Lines changed: 19 additions & 0 deletions
@@ -4294,6 +4294,25 @@ VectorizationFactor LoopVectorizationPlanner::selectVectorizationFactor() {
       if (!VPI)
         continue;
       switch (VPI->getOpcode()) {
+      // Selects are only modelled in the legacy cost model for safe
+      // divisors.
+      case Instruction::Select: {
+        VPValue *VPV = VPI->getVPSingleValue();
+        if (VPV->getNumUsers() == 1) {
+          if (auto *WR = dyn_cast<VPWidenRecipe>(*VPV->user_begin())) {
+            switch (WR->getOpcode()) {
+            case Instruction::UDiv:
+            case Instruction::SDiv:
+            case Instruction::URem:
+            case Instruction::SRem:
+              continue;
+            default:
+              break;
+            }
+          }
+        }
+        [[fallthrough]];
+      }
       case VPInstruction::ActiveLaneMask:
       case VPInstruction::ExplicitVectorLength:
         C += VPI->cost(VF, CostCtx);
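The shape of the check in this hunk can be modelled with a small stand-alone sketch. Everything here is hypothetical and only loosely mirrors the diff: `ToyRecipe`, `Opcode`, `IsWiden` and `isSafeDivisorSelect` are invented stand-ins, not LLVM types or functions. As in the diff, the only test is that the select has exactly one user and that user is a widened div/rem.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for VPlan concepts; none of these are LLVM types.
enum class Opcode { Select, UDiv, SDiv, URem, SRem, Add };

struct ToyRecipe {
  Opcode Op;
  bool IsWiden = false;                 // models "is a VPWidenRecipe"
  std::vector<const ToyRecipe *> Users; // recipes that use this result
};

// Returns true when R is a select whose single user is a widened
// udiv/sdiv/urem/srem. In the hunk above, such selects take the `continue`
// path; every other select falls through to `C += VPI->cost(VF, CostCtx)`.
bool isSafeDivisorSelect(const ToyRecipe &R) {
  if (R.Op != Opcode::Select || R.Users.size() != 1)
    return false;
  const ToyRecipe *User = R.Users.front();
  assert(User && "users must be non-null");
  if (!User->IsWiden)
    return false;
  switch (User->Op) {
  case Opcode::UDiv:
  case Opcode::SDiv:
  case Opcode::URem:
  case Opcode::SRem:
    return true;
  default:
    return false;
  }
}
```

In this toy model, a select feeding a single widened `UDiv` is classified as a safe-divisor select, while a select feeding an `Add` (as a predicated reduction might produce) or one with several users is not.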

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

Lines changed: 17 additions & 2 deletions
@@ -990,6 +990,19 @@ InstructionCost VPInstruction::computeCost(ElementCount VF,
   }
 
   switch (getOpcode()) {
+  case Instruction::Select: {
+    // TODO: It may be possible to improve this by analyzing where the
+    // condition operand comes from.
+    CmpInst::Predicate Pred = CmpInst::BAD_ICMP_PREDICATE;
+    auto *CondTy = Ctx.Types.inferScalarType(getOperand(0));
+    auto *VecTy = Ctx.Types.inferScalarType(getOperand(1));
+    if (!vputils::onlyFirstLaneUsed(this)) {
+      CondTy = toVectorTy(CondTy, VF);
+      VecTy = toVectorTy(VecTy, VF);
+    }
+    return Ctx.TTI.getCmpSelInstrCost(Instruction::Select, VecTy, CondTy, Pred,
+                                      Ctx.CostKind);
+  }
   case Instruction::ExtractElement: {
     // Add on the cost of extracting the element.
     auto *VecTy = toVectorTy(Ctx.Types.inferScalarType(getOperand(0)), VF);
@@ -2044,8 +2057,10 @@ InstructionCost VPWidenRecipe::computeCost(ElementCount VF,
   case Instruction::SDiv:
   case Instruction::SRem:
   case Instruction::URem:
-    // More complex computation, let the legacy cost-model handle this for now.
-    return Ctx.getLegacyCost(cast<Instruction>(getUnderlyingValue()), VF);
+    // If the div/rem operation isn't safe to speculate and requires
+    // predication, then the only way we can even create a vplan is to insert
+    // a select on the second input operand to ensure we use the value of 1
+    // for the inactive lanes. The select will be costed separately.
   case Instruction::FNeg:
   case Instruction::Add:
   case Instruction::FAdd:
