Commit ddfb8a8

[LoopVectorize] Make needsExtract notice scalarized instructions
LoopVectorizationCostModel::needsExtract should recognise instructions that have been widened by scalarizing as scalar instructions, which therefore do not need an extract when used by later scalarized instructions. This fixes an incorrect cost calculation in computePredInstDiscount, where a scalarization overhead cost was being added when it shouldn't be, though I haven't come up with a test case where it makes a difference. It will make a difference when the cost model switches to using the cost kind TCK_CodeSize for optsize, as without this change the test LoopVectorize/X86/small-size.ll gets worse.
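
The early-exit logic described above can be sketched as a small standalone model. This is only an illustration under stated assumptions: `ToyInst`, `ToyCostModel`, `Widening`, `decisionFor` and `operandMayNeedExtract` are hypothetical stand-ins and not LLVM's real types or API, and the sketch covers only the early-return condition, not the checks that follow it in the real function.

```cpp
#include <unordered_map>

// Toy stand-ins for illustration only; these are NOT LLVM's real types.
enum class Widening { Unknown, Widen, Scalarize };

struct ToyInst {
  bool InLoop;        // models TheLoop->contains(I)
  bool LoopInvariant; // models TheLoop->isLoopInvariant(I)
};

struct ToyCostModel {
  // Hypothetical per-instruction decision table for one fixed VF.
  std::unordered_map<const ToyInst *, Widening> Decisions;

  Widening decisionFor(const ToyInst *I) const {
    auto It = Decisions.find(I);
    return It == Decisions.end() ? Widening::Unknown : It->second;
  }

  // Mirrors only the shape of the patched early exit: a value that is
  // already produced as scalars needs no extract when a later
  // scalarized instruction uses it.
  bool operandMayNeedExtract(const ToyInst *I, unsigned VF) const {
    if (VF == 1 || !I || !I->InLoop || I->LoopInvariant ||
        decisionFor(I) == Widening::Scalarize)
      return false;
    return true; // the real function performs further checks here
  }
};
```

In this model, marking an in-loop instruction as `Widening::Scalarize` flips the result from true to false, which is the behaviour change the commit makes to the cost calculation.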
1 parent 86779da commit ddfb8a8

3 files changed (+164, -163 lines changed)


llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Lines changed: 2 additions & 1 deletion
@@ -1744,7 +1744,8 @@ class LoopVectorizationCostModel {
   bool needsExtract(Value *V, ElementCount VF) const {
     Instruction *I = dyn_cast<Instruction>(V);
     if (VF.isScalar() || !I || !TheLoop->contains(I) ||
-        TheLoop->isLoopInvariant(I))
+        TheLoop->isLoopInvariant(I) ||
+        getWideningDecision(I, VF) == CM_Scalarize)
       return false;
 
     // Assume we can vectorize V (and hence we need extraction) if the

llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll

Lines changed: 2 additions & 2 deletions
@@ -170,8 +170,8 @@ entry:
 ; VF_2-LABEL: Checking a loop in 'i64_factor_8'
 ; VF_2: Found an estimated cost of 8 for VF 2 For instruction: %tmp2 = load i64, ptr %tmp0, align 8
 ; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: %tmp3 = load i64, ptr %tmp1, align 8
-; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
-; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
+; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
+; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
 for.body:
   %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
   %tmp0 = getelementptr inbounds %i64.8, ptr %data, i64 %i, i32 2
