
Commit 350cb98

[X86] Explicitly widen larger than v4f16 to the legal v8f16 (NFC) (#153839)
This patch makes the current behavior explicit in preparation for adding VTs for v[567]f16. Right now these types are EVTs, so they do not go through getPreferredVectorAction and are simply widened to the next legal power-of-two vector type. For SSE2 this is v8f16.

Without this preparatory patch, however, the behavior would change once these types are added: getPreferredVectorAction would try to split them, because that is the current behavior for any f16 vector type that is not legal. #152150 has much more detail, in particular on how splitting these new types leads to an inconsistency between NumRegistersForVT and getTypeAction.

The patch ensures that, after the new types are added, they continue to be widened rather than split. The patch that enables v[567]f16 will then be NFC for x86 once it lands.
1 parent 0561ff6 commit 350cb98
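
The widening described in the commit message can be illustrated with a small standalone sketch. This is not LLVM code: `widenedNumElements` is a hypothetical helper that only rounds an element count up to the next power of two, and it assumes the resulting vector type is legal (as the message states for v8f16 on SSE2).

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API: round a vector element count up to
// the next power of two. This models only the "widen to the next
// power-of-two vector type" step from the commit message and assumes the
// widened type is legal.
unsigned widenedNumElements(unsigned NumElts) {
  assert(NumElts > 0 && "a vector has at least one element");
  unsigned Widened = 1;
  while (Widened < NumElts)
    Widened *= 2;
  return Widened;
}
```

Under this model, element counts 5, 6, and 7 all map to 8, which matches the v[567]f16 to v8f16 widening the message describes.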


2 files changed: +17 -1 lines changed

llvm/lib/Target/X86/X86ISelLowering.cpp

Lines changed: 3 additions & 1 deletion
@@ -2756,8 +2756,10 @@ X86TargetLowering::getPreferredVectorAction(MVT VT) const {
       !Subtarget.hasBWI())
     return TypeSplitVector;
 
+  // Since v8f16 is legal, widen anything over v4f16.
   if (!VT.isScalableVector() && VT.getVectorNumElements() != 1 &&
-      !Subtarget.hasF16C() && VT.getVectorElementType() == MVT::f16)
+      VT.getVectorNumElements() <= 4 && !Subtarget.hasF16C() &&
+      VT.getVectorElementType() == MVT::f16)
     return TypeSplitVector;
 
   if (!VT.isScalableVector() && VT.getVectorNumElements() != 1 &&
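
The effect of the added `<= 4` bound in the hunk above can be sketched as a pair of standalone predicates. This is a simplified model, not the LLVM implementation: the function names and plain parameters are hypothetical stand-ins for the `VT` and `Subtarget` queries, and the scalable-vector check is omitted on the assumption that the vector is fixed-width.

```cpp
#include <cassert>

// Hypothetical stand-in for the f16 condition before this commit:
// any multi-element f16 vector is split when F16C is unavailable.
bool splitsF16Before(unsigned NumElts, bool IsF16, bool HasF16C) {
  assert(NumElts > 0 && "a vector has at least one element");
  return NumElts != 1 && !HasF16C && IsF16;
}

// Hypothetical stand-in for the same condition after this commit:
// only f16 vectors of at most 4 elements are split; larger ones fall
// through to the checks that follow in the real function.
bool splitsF16After(unsigned NumElts, bool IsF16, bool HasF16C) {
  assert(NumElts > 0 && "a vector has at least one element");
  return NumElts != 1 && NumElts <= 4 && !HasF16C && IsF16;
}
```

In this model a 7-element f16 vector without F16C satisfies the old condition but not the new one, while 2-, 3-, and 4-element f16 vectors satisfy both.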

llvm/test/CodeGen/X86/pr152150.ll

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown-eabi-elf | FileCheck %s
+
+; CHECK-LABEL: conv2d
+define dso_local void @conv2d() {
+.preheader:
+  br label %0
+
+0:                                                ; preds = %0, %.preheader
+  %1 = phi [4 x <7 x half>] [ zeroinitializer, %.preheader ], [ %4, %0 ]
+  %2 = extractvalue [4 x <7 x half>] %1, 0
+  %3 = extractvalue [4 x <7 x half>] %1, 1
+  %4 = insertvalue [4 x <7 x half>] poison, <7 x half> poison, 3
+  br label %0
+}
