
Commit 611401c

[CostModel][X86] getShuffleCost - use processShuffleMasks to split SK_PermuteTwoSrc shuffles to legal types (#120599)
processShuffleMasks can now correctly handle 2 src shuffles, so we can use the existing SK_PermuteSingleSrc splitting cost logic to handle SK_PermuteTwoSrc as well and correctly recognise the number of active subvectors per legalised shuffle.
1 parent 1e18815 commit 611401c
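
To make "the number of active subvectors per legalised shuffle" concrete, here is a minimal sketch, assuming a two-source mask whose length N is a multiple of the legal register width. It is a simplified illustration, not LLVM's processShuffleMasks: the helper name countActiveSubvectors and its parameters are hypothetical. For each legal-width destination register it counts how many distinct legal-width source registers the mask reads from.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper, not an LLVM API. With N = Mask.size(), mask indices in
// [0, N) select from the first source, indices in [N, 2N) select from the
// second source, and a negative index marks an undefined lane.
// Returns, per legal-width destination register, the number of distinct
// legal-width source registers that destination reads from.
std::vector<std::size_t> countActiveSubvectors(const std::vector<int> &Mask,
                                               std::size_t EltsPerReg) {
  assert(EltsPerReg > 0 && Mask.size() % EltsPerReg == 0 &&
         "mask must split evenly into legal registers");
  std::vector<std::size_t> Active;
  for (std::size_t Base = 0; Base < Mask.size(); Base += EltsPerReg) {
    std::set<std::size_t> SrcRegs;
    for (std::size_t I = Base; I < Base + EltsPerReg; ++I) {
      if (Mask[I] < 0)
        continue; // An undefined lane reads from no source register.
      SrcRegs.insert(static_cast<std::size_t>(Mask[I]) / EltsPerReg);
    }
    Active.push_back(SrcRegs.size());
  }
  return Active;
}
```

For two 8-element sources split into 4-element registers, the mask {0, 1, 2, 3, 8, 9, 10, 11} reads one source register per destination, while {0, 8, 1, 9, 2, 10, 3, 11} reads two per destination. A mask-aware cost can tell these cases apart; a cost based only on LT.first cannot.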

23 files changed, +1166 -978 lines changed

llvm/lib/Target/X86/X86TargetTransformInfo.cpp

Lines changed: 2 additions & 9 deletions
@@ -1698,7 +1698,8 @@ InstructionCost X86TTIImpl::getShuffleCost(
   // We are going to permute multiple sources and the result will be in multiple
   // destinations. Providing an accurate cost only for splits where the element
   // type remains the same.
-  if (Kind == TTI::SK_PermuteSingleSrc && LT.first != 1) {
+  if ((Kind == TTI::SK_PermuteSingleSrc || Kind == TTI::SK_PermuteTwoSrc) &&
+      LT.first != 1) {
     MVT LegalVT = LT.second;
     if (LegalVT.isVector() &&
         LegalVT.getVectorElementType().getSizeInBits() ==
@@ -1784,14 +1785,6 @@ InstructionCost X86TTIImpl::getShuffleCost(
     return BaseT::getShuffleCost(Kind, BaseTp, Mask, CostKind, Index, SubTp);
   }
 
-  // For 2-input shuffles, we must account for splitting the 2 inputs into many.
-  if (Kind == TTI::SK_PermuteTwoSrc && !IsInLaneShuffle && LT.first != 1) {
-    // We assume that source and destination have the same vector type.
-    InstructionCost NumOfDests = LT.first;
-    InstructionCost NumOfShufflesPerDest = LT.first * 2 - 1;
-    LT.first = NumOfDests * NumOfShufflesPerDest;
-  }
-
   static const CostTblEntry AVX512VBMIShuffleTbl[] = {
       {TTI::SK_Reverse, MVT::v64i8, 1}, // vpermb
       {TTI::SK_Reverse, MVT::v32i8, 1}, // vpermb
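
The arithmetic of the removed block can be worked through with a small sketch. Only the formula NumOfDests * (2 * NumOfDests - 1) comes from the diff; the function name removedTwoSrcSplitFactor is hypothetical and exists purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Value the removed block assigned to LT.first for an out-of-lane
// SK_PermuteTwoSrc shuffle that legalises into NumLegalRegs registers:
//   NumOfDests * NumOfShufflesPerDest, with NumOfShufflesPerDest = 2 * N - 1.
// Hypothetical standalone function; only the formula is taken from the diff.
constexpr std::uint64_t removedTwoSrcSplitFactor(std::uint64_t NumLegalRegs) {
  return NumLegalRegs * (NumLegalRegs * 2 - 1);
}

// Worked values: splitting into 2 registers gives 2 * 3 = 6,
// splitting into 4 registers gives 4 * 7 = 28.
static_assert(removedTwoSrcSplitFactor(2) == 6, "2 * (2 * 2 - 1)");
static_assert(removedTwoSrcSplitFactor(4) == 28, "4 * (2 * 4 - 1)");
```

The factor grows quadratically with the number of legal registers and depends on the mask only through the in-lane check, which is why routing SK_PermuteTwoSrc through the mask-aware splitting path changes so many of the expected test costs.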

llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector-codesize.ll

Lines changed: 47 additions & 61 deletions

llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector-latency.ll

Lines changed: 47 additions & 61 deletions

llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector-sizelatency.ll

Lines changed: 47 additions & 61 deletions

llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector.ll

Lines changed: 47 additions & 61 deletions

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-codesize.ll

Lines changed: 76 additions & 76 deletions

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-latency.ll

Lines changed: 76 additions & 76 deletions

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-sizelatency.ll

Lines changed: 76 additions & 76 deletions

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector.ll

Lines changed: 76 additions & 76 deletions

llvm/test/Analysis/CostModel/X86/shuffle-two-src-codesize.ll

Lines changed: 128 additions & 82 deletions
