Commit 99fb0fd

[AArch64][LoopVectorize] Enable tail-folding on neoverse-v2
This patch enables tail-folding of simple loops by default when targeting the neoverse-v2 CPU. This was done for neoverse-v1 in c7dbe32.

For SPEC2017 with "-Ofast -mcpu=neoverse-v2 -flto" this gives some small wins:

549.fotonik3d_r: ~3.2%
525.x264_r: ~2.7%
554.roms_r: ~1.2%
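
For readers unfamiliar with the term, the following is a minimal scalar sketch of what tail-folding changes in a vectorized loop. It is not LLVM code and not what the vectorizer emits: `kLanes`, `addWithScalarEpilogue`, and `addTailFolded` are hypothetical names, and the inner `lane` loops merely stand in for a single vector instruction. The first function models a vector body followed by a scalar remainder loop; the second models a tail-folded loop, where one predicated loop covers the leftover elements as well.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical lane count, standing in for the number of elements per vector.
constexpr std::size_t kLanes = 4;

// Without tail-folding: full-width chunks first, then a scalar remainder loop.
void addWithScalarEpilogue(const std::vector<int> &a, const std::vector<int> &b,
                           std::vector<int> &out) {
  const std::size_t n = a.size();
  std::size_t i = 0;
  // "Vector body": only runs while a whole chunk of kLanes elements remains.
  for (; i + kLanes <= n; i += kLanes)
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      out[i + lane] = a[i + lane] + b[i + lane];
  // Scalar epilogue: handles the remaining n % kLanes elements one at a time.
  for (; i < n; ++i)
    out[i] = a[i] + b[i];
}

// With tail-folding: a single loop in which every lane is guarded by a
// predicate, so the final partial chunk needs no separate remainder loop.
void addTailFolded(const std::vector<int> &a, const std::vector<int> &b,
                   std::vector<int> &out) {
  const std::size_t n = a.size();
  for (std::size_t i = 0; i < n; i += kLanes)
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      // Models a per-lane predicate: inactive lanes do no work.
      const bool active = i + lane < n;
      if (active)
        out[i + lane] = a[i + lane] + b[i + lane];
    }
}
```

Both functions produce the same output for any input length; the difference is only in loop structure, which is the property the patch's default setting selects for simple loops on neoverse-v2.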
1 parent 4cde945 commit 99fb0fd

2 files changed: 4 additions, 0 deletions

llvm/lib/Target/AArch64/AArch64Subtarget.cpp

Lines changed: 2 additions & 0 deletions
@@ -268,6 +268,8 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
     MaxBytesForLoopAlignment = 16;
     break;
   case NeoverseV2:
+    DefaultSVETFOpts = TailFoldingOpts::Simple;
+    LLVM_FALLTHROUGH;
   case NeoverseV3:
     EpilogueVectorizationMinVF = 8;
     MaxInterleaveFactor = 4;
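
The hunk above relies on switch fallthrough: the NeoverseV2 case sets its tail-folding default and then continues into the NeoverseV3 case, so it still picks up the settings the two CPUs share. The standalone sketch below models only that control flow. `Cpu`, `TailFolding`, `Props`, and `initProps` are hypothetical stand-ins rather than LLVM types, the `Disabled` starting value is an assumption made for illustration, and only the two assignments visible in the hunk are modeled for NeoverseV3.

```cpp
// Hypothetical stand-ins for the subtarget's CPU kind and tail-folding option.
enum class Cpu { NeoverseV2, NeoverseV3 };
enum class TailFolding { Disabled, Simple };

// Hypothetical property bag; field names loosely mirror the hunk above.
struct Props {
  TailFolding tailFolding = TailFolding::Disabled; // assumed starting value
  unsigned epilogueVectorizationMinVF = 0;
  unsigned maxInterleaveFactor = 0;
};

Props initProps(Cpu cpu) {
  Props p;
  switch (cpu) {
  case Cpu::NeoverseV2:
    // V2-only setting, added before control reaches the shared settings.
    p.tailFolding = TailFolding::Simple;
    [[fallthrough]]; // C++17 spelling of an explicit fallthrough annotation
  case Cpu::NeoverseV3:
    // Settings shared by both cases in this model.
    p.epilogueVectorizationMinVF = 8;
    p.maxInterleaveFactor = 4;
    break;
  }
  return p;
}
```

In this model, `initProps(Cpu::NeoverseV2)` yields the `Simple` tail-folding value together with the shared settings, while `initProps(Cpu::NeoverseV3)` yields only the shared settings.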

llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -sve-tail-folding=default -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 -sve-tail-folding=default | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
+; Simple tail-folding is also enabled by default on neoverse-v2. Use same check prefix.
+; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v2 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 
 target triple = "aarch64-unknown-linux-gnu"
 
