Commit 98867bf

[AArch64] [CostModel] Fix cost modelling for saturating arithmetic intrinsics (#152333)
The cost model previously overestimated throughput costs for wide fixed-length saturating arithmetic intrinsics when using SVE with a fixed vscale of 2. These costs ended up much higher than for the same operations using NEON, despite the operations being fully legal and efficient with SVE. This patch adjusts the cost model to avoid penalising these intrinsics under SVE.
1 parent acda808 commit 98867bf

3 files changed: +489 -208 lines changed


llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

Lines changed: 7 additions & 0 deletions
@@ -643,6 +643,13 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
         LT.second.getScalarSizeInBits() == RetTy->getScalarSizeInBits() ? 1 : 4;
     if (any_of(ValidSatTys, [&LT](MVT M) { return M == LT.second; }))
       return LT.first * Instrs;
+
+    TypeSize TS = getDataLayout().getTypeSizeInBits(RetTy);
+    uint64_t VectorSize = TS.getKnownMinValue();
+
+    if (ST->isSVEAvailable() && VectorSize >= 128 && isPowerOf2_64(VectorSize))
+      return LT.first * Instrs;
+
     break;
   }
   case Intrinsic::abs: {
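
The condition guarding the new early return can be read in isolation. The following is a minimal standalone sketch of that predicate, not the LLVM code itself: `isPowerOfTwo`, `takesSveFastPath`, and `fastPathCost` are hypothetical names introduced here for illustration, and the SVE availability flag is passed in as a plain `bool` rather than queried from a subtarget.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::isPowerOf2_64: true when the value is
// nonzero and has exactly one bit set.
constexpr bool isPowerOfTwo(std::uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}

// Hypothetical model of the condition added by the patch: SVE must be
// available and the known-minimum type size in bits must be a power of two
// of at least 128.
constexpr bool takesSveFastPath(bool SveAvailable,
                                std::uint64_t VectorSizeBits) {
  return SveAvailable && VectorSizeBits >= 128 && isPowerOfTwo(VectorSizeBits);
}

// Hypothetical helper mirroring the returned expression `LT.first * Instrs`:
// the legalization factor multiplied by the per-operation instruction count.
constexpr std::uint64_t fastPathCost(std::uint64_t LegalizationFactor,
                                     std::uint64_t Instrs) {
  return LegalizationFactor * Instrs;
}

// Compile-time checks of the predicate on a few representative sizes.
static_assert(takesSveFastPath(true, 256), "256-bit vector with SVE");
static_assert(!takesSveFastPath(true, 64), "below the 128-bit threshold");
static_assert(!takesSveFastPath(true, 384), "not a power of two");
```

Under this reading, a 256-bit fixed-length vector with SVE available satisfies the condition, while a 64-bit vector, a 384-bit vector, or any size without SVE falls through to the `break` as before.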
