Commit fd3df1e

[Clang] Freeze padded vectors before storing.
Clang currently leaves padding bits uninitialized in most cases, which means they are undef. When a store of a vector type is expanded to include the padding, the padding lanes are poison, and hence so are the padding bits written to memory. This interacts badly with coercion of arguments and return values, where a 3 x float vector is loaded back as an i128 integer: poison padding bits make the whole loaded value poison. Freezing the padded vector before the store avoids this.
1 parent 751a31d commit fd3df1e
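
The reasoning in the commit message can be illustrated with a toy model. This is only a sketch and not how LLVM represents values: `Lane`, `padWithPoison`, `freezeLanes` and `coercedLoadIsPoison` are hypothetical names invented for this example, and zero is an arbitrary stand-in for whatever value a real freeze would pick.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model only: a lane is either a concrete 32-bit pattern or poison,
// with poison represented as std::nullopt.
using Lane = std::optional<std::uint32_t>;

// Widen a vector to `paddedLanes` lanes. The extra lanes are poison, as the
// commit message describes for the padded store.
std::vector<Lane> padWithPoison(std::vector<Lane> v, std::size_t paddedLanes) {
  assert(v.size() <= paddedLanes);
  v.resize(paddedLanes, std::nullopt);
  return v;
}

// Model of freeze: every poison lane becomes some fixed value. Zero is used
// purely for illustration; non-poison lanes are left unchanged.
std::vector<Lane> freezeLanes(std::vector<Lane> v) {
  for (Lane &lane : v)
    if (!lane)
      lane = 0u;
  return v;
}

// Model of reloading the stored lanes as a single wide integer (the i128
// coercion): one poison lane is enough to make the whole result poison.
bool coercedLoadIsPoison(const std::vector<Lane> &stored) {
  for (const Lane &lane : stored)
    if (!lane)
      return true;
  return false;
}
```

In this model, padding a three-lane vector to four lanes and reloading it as one integer yields poison, while applying `freezeLanes` before the store keeps the three original lanes and makes the reload well defined.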

2 files changed: 7 additions, 3 deletions


clang/lib/CodeGen/CGExpr.cpp

Lines changed: 3 additions & 0 deletions
@@ -2300,6 +2300,9 @@ void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
       SmallVector<int, 16> Mask(NewVecTy->getNumElements(), -1);
       std::iota(Mask.begin(), Mask.begin() + VecTy->getNumElements(), 0);
       Value = Builder.CreateShuffleVector(Value, Mask, "extractVec");
+      // The extra lanes will be poison. Freeze the whole vector to make sure
+      // the padding memory is not poisoned, which may break coercion.
+      Value = Builder.CreateFreeze(Value);
       SrcTy = NewVecTy;
     }
     if (Addr.getElementType() != SrcTy)
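
For readers unfamiliar with the mask idiom in this hunk, the following standard-library-only sketch reproduces how the shuffle mask is built. `buildPaddingMask` is a hypothetical helper written for illustration and is not part of Clang; it uses `std::vector` in place of `SmallVector`. An index of -1 corresponds to the `i32 poison` mask element seen in the updated test.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Build a shuffle mask that keeps the first `srcLanes` lanes in order and
// marks every remaining lane with -1, mirroring the Mask and std::iota lines
// in the hunk above.
std::vector<int> buildPaddingMask(std::size_t srcLanes, std::size_t paddedLanes) {
  assert(srcLanes <= paddedLanes);
  std::vector<int> mask(paddedLanes, -1);
  std::iota(mask.begin(),
            mask.begin() + static_cast<std::ptrdiff_t>(srcLanes), 0);
  return mask;
}
```

For a 3-lane source padded to 4 lanes, the helper returns the indices 0, 1, 2 followed by -1. That last lane is the one the new `CreateFreeze` call turns from poison into a fixed value before the store.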

clang/test/CodeGen/AArch64/ext-vector-coercion.c

Lines changed: 4 additions & 3 deletions
@@ -29,10 +29,11 @@ struct Vec3 {
 // CHECK-NEXT: [[ADD:%.*]] = fadd <3 x float> [[EXTRACTVEC]], [[EXTRACTVEC2]]
 // CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[RETVAL]], i32 0, i32 0
 // CHECK-NEXT: [[EXTRACTVEC3:%.*]] = shufflevector <3 x float> [[ADD]], <3 x float> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 poison>
-// CHECK-NEXT: store <4 x float> [[EXTRACTVEC3]], ptr [[TMP2]], align 16
+// CHECK-NEXT: [[TMP3:%.*]] = freeze <4 x float> [[EXTRACTVEC3]]
+// CHECK-NEXT: store <4 x float> [[TMP3]], ptr [[TMP2]], align 16
 // CHECK-NEXT: [[COERCE_DIVE4:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[RETVAL]], i32 0, i32 0
-// CHECK-NEXT: [[TMP3:%.*]] = load i128, ptr [[COERCE_DIVE4]], align 16
-// CHECK-NEXT: ret i128 [[TMP3]]
+// CHECK-NEXT: [[TMP4:%.*]] = load i128, ptr [[COERCE_DIVE4]], align 16
+// CHECK-NEXT: ret i128 [[TMP4]]
 //
 struct Vec3 add(struct Vec3 a) {
   struct Vec3 res;
