[HLSL] Implement HLSL initialization list support #123141
@llvm/pr-subscribers-hlsl

Author: Chris B (llvm-beanz)

Changes

This PR implements HLSL's initialization list behavior as specified in the draft language specification.

This behavior is unusual relative to C/C++: intermediate braces in initializer lists are ignored, and a number of additional conversions occur that have no counterpart in how initialization works in C.

The implementation in this PR generates a valid C/C++ initialization list AST for the HLSL initializer, so no changes to Clang's CodeGen are required to support it. This design will also allow us to use Clang's rewrite to convert HLSL initializers into equivalent, valid C/C++ initializers.

The downside is that codegen will often emit redundant accesses. The IR optimizer is extremely good at eliminating those, so this will have no impact on final executable performance.

There is some opportunity for optimizing the initializer list generation that we could consider in subsequent commits. One notable opportunity would be to identify aggregate objects that occur in the same place in both initializers and do not require conversion; those aggregates could be initialized as aggregates rather than fully scalarized.

Closes #56067

Patch is 72.79 KiB (9 files affected), truncated to 20.00 KiB below. Full version: https://github.com/llvm/llvm-project/pull/123141.diff
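A minimal sketch of the flatten-then-rebuild idea described above, written as standalone C++ rather than against Clang's AST. The `Init` and `Type` structs and the `flatten`, `countScalars`, `rebuild`, `transform`, and `print` helpers are hypothetical names invented for this illustration. They model only brace elision and the element-count check, not the per-element conversions or the actual Sema code in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an initializer: a scalar leaf or a braced list.
struct Init {
  bool IsList = false;
  int Value = 0;              // meaningful only when !IsList
  std::vector<Init> Elements; // meaningful only when IsList
};

// Hypothetical model of a destination type: a scalar, or an aggregate
// (struct, array, or vector) described by its element types.
struct Type {
  bool IsScalar = true;
  std::vector<Type> Elements;
};

Init scalar(int V) {
  Init I;
  I.Value = V;
  return I;
}

Init list(std::vector<Init> Es) {
  Init I;
  I.IsList = true;
  I.Elements = std::move(Es);
  return I;
}

// Step 1: walk the source initializer and collect scalars in order,
// ignoring how the braces were nested.
void flatten(const Init &I, std::vector<int> &Out) {
  if (!I.IsList) {
    Out.push_back(I.Value);
    return;
  }
  for (const Init &E : I.Elements)
    flatten(E, Out);
}

// Number of scalar slots the destination type expects.
std::size_t countScalars(const Type &T) {
  if (T.IsScalar)
    return 1;
  std::size_t N = 0;
  for (const Type &E : T.Elements)
    N += countScalars(E);
  return N;
}

// Step 2: consume the flat scalars while following the destination type,
// producing a braced list whose nesting matches that type.
Init rebuild(const Type &T, std::vector<int>::const_iterator &It) {
  if (T.IsScalar)
    return scalar(*It++);
  std::vector<Init> Es;
  for (const Type &E : T.Elements)
    Es.push_back(rebuild(E, It));
  return list(std::move(Es));
}

// Returns false when the number of scalars does not match the destination.
bool transform(const Type &Dest, const Init &Src, Init &Result) {
  std::vector<int> Flat;
  flatten(Src, Flat);
  if (Flat.size() != countScalars(Dest))
    return false;
  auto It = Flat.cbegin();
  Result = rebuild(Dest, It);
  return true;
}

// Renders an initializer as text, e.g. "{{1, 2}, 3}".
std::string print(const Init &I) {
  if (!I.IsList)
    return std::to_string(I.Value);
  std::string S = "{";
  for (std::size_t K = 0; K < I.Elements.size(); ++K) {
    if (K)
      S += ", ";
    S += print(I.Elements[K]);
  }
  return S + "}";
}
```

With a destination type made of two scalars, an over-braced source such as `{{{1, 2}}}` is rebuilt as `{1, 2}`, while a three-element source is rejected, which loosely mirrors the element-count diagnostic added by the patch.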
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 67c15e7c475943..9db77aba230455 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -12574,6 +12574,9 @@ def err_hlsl_pointers_unsupported : Error<
   "%select{pointers|references}0 are unsupported in HLSL">;
 def err_hlsl_missing_resource_class : Error<"HLSL resource needs to have [[hlsl::resource_class()]] attribute">;
 def err_hlsl_attribute_needs_intangible_type: Error<"attribute %0 can be used only on HLSL intangible type %1">;
+def err_hlsl_incorrect_num_initializers: Error<
+  "too %select{few|many}0 initializers in list for type %1 "
+  "(expected %2 but found %3)">;
 def err_hlsl_operator_unsupported : Error<
   "the '%select{&|*|->}0' operator is unsupported in HLSL">;
diff --git a/clang/include/clang/Sema/SemaHLSL.h b/clang/include/clang/Sema/SemaHLSL.h
index f4cd11f423a84a..092691d08761c4 100644
--- a/clang/include/clang/Sema/SemaHLSL.h
+++ b/clang/include/clang/Sema/SemaHLSL.h
@@ -26,6 +26,8 @@
namespace clang {
class AttributeCommonInfo;
class IdentifierInfo;
+class InitializedEntity;
+class InitializationKind;
class ParsedAttr;
class Scope;
class VarDecl;
@@ -145,6 +147,9 @@ class SemaHLSL : public SemaBase {
   QualType getInoutParameterType(QualType Ty);
+  bool TransformInitList(const InitializedEntity &Entity,
+                         const InitializationKind &Kind, InitListExpr *Init);
+
 private:
   // HLSL resource type attributes need to be processed all at once.
   // This is a list to collect them.
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 881907ac311a30..cc748e432001e0 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -11625,9 +11625,12 @@ static void AnalyzeImplicitConversions(
   // Propagate whether we are in a C++ list initialization expression.
   // If so, we do not issue warnings for implicit int-float conversion
-  // precision loss, because C++11 narrowing already handles it.
-  bool IsListInit = Item.IsListInit ||
-                    (isa<InitListExpr>(OrigE) && S.getLangOpts().CPlusPlus);
+  // precision loss, because C++11 narrowing already handles it. HLSL's
+  // initialization lists are special, so they shouldn't observe the C++
+  // behavior here.
+  bool IsListInit =
+      Item.IsListInit || (isa<InitListExpr>(OrigE) &&
+                          S.getLangOpts().CPlusPlus && !S.getLangOpts().HLSL);
   if (E->isTypeDependent() || E->isValueDependent())
     return;
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 65ddee05a21512..f9f1473d4e0bab 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -2576,3 +2576,162 @@ void SemaHLSL::processExplicitBindingsOnDecl(VarDecl *VD) {
}
}
}
+
+static bool CastInitializer(Sema &S, ASTContext &Ctx, Expr *E,
+                            llvm::SmallVectorImpl<Expr *> &List,
+                            llvm::SmallVectorImpl<QualType> &DestTypes) {
+  if (List.size() >= DestTypes.size())
+    return false;
+  InitializedEntity Entity =
+      InitializedEntity::InitializeParameter(Ctx, DestTypes[List.size()], false);
+  ExprResult Res =
+      S.PerformCopyInitialization(Entity, E->getBeginLoc(), E);
+  if (Res.isInvalid())
+    return false;
+  Expr *Init = Res.get();
+  List.push_back(Init);
+  return true;
+}
+
+static void BuildIntializerList(Sema &S, ASTContext &Ctx, Expr *E,
+                                llvm::SmallVectorImpl<Expr *> &List,
+                                llvm::SmallVectorImpl<QualType> &DestTypes,
+                                bool &ExcessInits) {
+  if (List.size() >= DestTypes.size()) {
+    ExcessInits = true;
+    return;
+  }
+
+  // If this is an initialization list, traverse the sub initializers.
+  if (auto *Init = dyn_cast<InitListExpr>(E)) {
+    for (auto *SubInit : Init->inits())
+      BuildIntializerList(S, Ctx, SubInit, List, DestTypes, ExcessInits);
+    return;
+  }
+
+  // If this is a scalar type, just enqueue the expression.
+  QualType Ty = E->getType();
+  if (Ty->isScalarType()) {
+    (void)CastInitializer(S, Ctx, E, List, DestTypes);
+    return;
+  }
+
+  if (auto *ATy = Ty->getAs<VectorType>()) {
+    uint64_t Size = ATy->getNumElements();
+
+    if (List.size() + Size > DestTypes.size()) {
+      ExcessInits = true;
+      return;
+    }
+    QualType SizeTy = Ctx.getSizeType();
+    uint64_t SizeTySize = Ctx.getTypeSize(SizeTy);
+    for (uint64_t I = 0; I < Size; ++I) {
+      auto *Idx = IntegerLiteral::Create(Ctx, llvm::APInt(SizeTySize, I),
+                                         SizeTy, SourceLocation());
+
+      ExprResult ElExpr = S.CreateBuiltinArraySubscriptExpr(
+          E, E->getBeginLoc(), Idx, E->getEndLoc());
+      if (ElExpr.isInvalid())
+        return;
+      if (!CastInitializer(S, Ctx, ElExpr.get(), List, DestTypes))
+        return;
+    }
+    return;
+  }
+
+  if (auto *VTy = dyn_cast<ConstantArrayType>(Ty.getTypePtr())) {
+    uint64_t Size = VTy->getZExtSize();
+    QualType SizeTy = Ctx.getSizeType();
+    uint64_t SizeTySize = Ctx.getTypeSize(SizeTy);
+    for (uint64_t I = 0; I < Size; ++I) {
+      auto *Idx = IntegerLiteral::Create(Ctx, llvm::APInt(SizeTySize, I),
+                                         SizeTy, SourceLocation());
+      ExprResult ElExpr = S.CreateBuiltinArraySubscriptExpr(
+          E, E->getBeginLoc(), Idx, E->getEndLoc());
+      if (ElExpr.isInvalid())
+        return;
+      BuildIntializerList(S, Ctx, ElExpr.get(), List, DestTypes, ExcessInits);
+    }
+    return;
+  }
+
+  if (auto *RTy = Ty->getAs<RecordType>()) {
+    for (auto *FD : RTy->getDecl()->fields()) {
+      DeclAccessPair Found = DeclAccessPair::make(FD, FD->getAccess());
+      DeclarationNameInfo NameInfo(FD->getDeclName(), E->getBeginLoc());
+      ExprResult Res = S.BuildFieldReferenceExpr(
+          E, false, E->getBeginLoc(), CXXScopeSpec(), FD, Found, NameInfo);
+      if (Res.isInvalid())
+        return;
+      BuildIntializerList(S, Ctx, Res.get(), List, DestTypes, ExcessInits);
+    }
+  }
+}
+
+static Expr *GenerateInitLists(ASTContext &Ctx, QualType Ty,
+                               llvm::SmallVectorImpl<Expr *>::iterator &It) {
+  if (Ty->isScalarType()) {
+    return *(It++);
+  }
+  llvm::SmallVector<Expr *> Inits;
+  assert(!isa<MatrixType>(Ty) && "Matrix types not yet supported in HLSL");
+  if (Ty->isVectorType() || Ty->isConstantArrayType()) {
+    QualType ElTy;
+    uint64_t Size = 0;
+    if (auto *ATy = Ty->getAs<VectorType>()) {
+      ElTy = ATy->getElementType();
+      Size = ATy->getNumElements();
+    } else {
+      auto *VTy = cast<ConstantArrayType>(Ty.getTypePtr());
+      ElTy = VTy->getElementType();
+      Size = VTy->getZExtSize();
+    }
+    for (uint64_t I = 0; I < Size; ++I)
+      Inits.push_back(GenerateInitLists(Ctx, ElTy, It));
+  }
+  if (const RecordDecl *RD = Ty->getAsRecordDecl()) {
+    for (auto *FD : RD->fields()) {
+      Inits.push_back(GenerateInitLists(Ctx, FD->getType(), It));
+    }
+  }
+  auto *NewInit = new (Ctx) InitListExpr(Ctx, Inits.front()->getBeginLoc(),
+                                         Inits, Inits.back()->getEndLoc());
+  NewInit->setType(Ty);
+  return NewInit;
+}
+
+bool SemaHLSL::TransformInitList(const InitializedEntity &Entity,
+                                 const InitializationKind &Kind,
+                                 InitListExpr *Init) {
+  // If the initializer is a scalar, just return it.
+  if (Init->getType()->isScalarType())
+    return true;
+  ASTContext &Ctx = SemaRef.getASTContext();
+  llvm::SmallVector<QualType, 16> DestTypes;
+  // An initializer list might be attempting to initialize a reference or
+  // rvalue-reference. When checking the initializer we should look through the
+  // reference.
+  QualType InitTy = Entity.getType().getNonReferenceType();
+  BuildFlattenedTypeList(InitTy, DestTypes);
+
+  llvm::SmallVector<Expr *, 16> ArgExprs;
+  bool ExcessInits = false;
+  for (Expr *Arg : Init->inits())
+    BuildIntializerList(SemaRef, Ctx, Arg, ArgExprs, DestTypes, ExcessInits);
+
+  if (DestTypes.size() != ArgExprs.size() || ExcessInits) {
+    int TooManyOrFew = ExcessInits ? 1 : 0;
+    SemaRef.Diag(Init->getBeginLoc(), diag::err_hlsl_incorrect_num_initializers)
+        << TooManyOrFew << InitTy << DestTypes.size() << ArgExprs.size();
+    return false;
+  }
+
+  auto It = ArgExprs.begin();
+  // GenerateInitLists will always return an InitListExpr here, because the
+  // scalar case is handled above.
+  auto *NewInit = cast<InitListExpr>(GenerateInitLists(Ctx, InitTy, It));
+  Init->resizeInits(Ctx, NewInit->getNumInits());
+  for (unsigned I = 0; I < NewInit->getNumInits(); ++I)
+    Init->updateInit(Ctx, I, NewInit->getInit(I));
+  return true;
+}
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index b95cbbf4222056..a3c56d37f8b0cd 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -26,6 +26,7 @@
#include "clang/Sema/Initialization.h"
#include "clang/Sema/Lookup.h"
#include "clang/Sema/Ownership.h"
+#include "clang/Sema/SemaHLSL.h"
#include "clang/Sema/SemaObjC.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/FoldingSet.h"
@@ -4783,6 +4784,10 @@ static void TryListInitialization(Sema &S,
                                   bool TreatUnavailableAsInvalid) {
   QualType DestType = Entity.getType();
+  if (S.getLangOpts().HLSL &&
+      !S.HLSL().TransformInitList(Entity, Kind, InitList))
+    return;
+
   // C++ doesn't allow scalar initialization with more than one argument.
   // But C99 complex numbers are scalars and it makes sense there.
   if (S.getLangOpts().CPlusPlus && DestType->isScalarType() &&
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index e5db7eac37a428..91a283554459d9 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -1,3 +1,4 @@
+
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes -o - %s | FileCheck %s
void fn(float x[2]) { }
@@ -27,7 +28,7 @@ void fn2(Obj O[4]) { }
// CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[Arr]], i32 32, i1 false)
// CHECK: call void {{.*}}fn2{{.*}}(ptr noundef byval([4 x %struct.Obj]) align 4 [[Tmp]])
 void call2() {
-  Obj Arr[4] = {};
+  Obj Arr[4] = {{0, 0}, {0, 0}, {0, 0}, {0, 0}};
   fn2(Arr);
 }
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
new file mode 100644
index 00000000000000..e57724b0ec31fb
--- /dev/null
+++ b/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
@@ -0,0 +1,714 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-library -disable-llvm-passes -emit-llvm -finclude-default-header -o - %s | FileCheck %s
+
+struct TwoFloats {
+  float X, Y;
+};
+
+struct TwoInts {
+  int Z, W;
+};
+
+struct Doggo {
+  int4 LegState;
+  int TailState;
+  float HairCount;
+  float4 EarDirection[2];
+};
+
+struct AnimalBits {
+  int Legs[4];
+  uint State;
+  int64_t Counter;
+  float4 LeftDir;
+  float4 RightDir;
+};
+
+struct Kitteh {
+  int4 Legs;
+  int TailState;
+  float HairCount;
+  float4 Claws[2];
+};
+
+struct Zoo {
+  Doggo Dogs[2];
+  Kitteh Cats[4];
+};
+
+// Case 1: Extraneous braces get ignored in literal instantiation.
+// CHECK-LABEL: define void @_Z5case1v(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AGG_RESULT]], ptr align 4 @__const._Z5case1v.TF1, i32 8, i1 false)
+// CHECK-NEXT: ret void
+//
+TwoFloats case1() {
+  TwoFloats TF1 = {{{1.0, 2}}};
+  return TF1;
+}
+
+// Case 2: Valid C/C++ initializer is handled appropriately.
+// CHECK-LABEL: define void @_Z5case2v(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AGG_RESULT]], ptr align 4 @__const._Z5case2v.TF2, i32 8, i1 false)
+// CHECK-NEXT: ret void
+//
+TwoFloats case2() {
+  TwoFloats TF2 = {1, 2};
+  return TF2;
+}
+
+// Case 3: Simple initialization with conversion of an argument.
+// CHECK-LABEL: define void @_Z5case3i(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], i32 noundef [[VAL:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[VAL_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP0]] to float
+// CHECK-NEXT: store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT: [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: store float 2.000000e+00, ptr [[Y]], align 4
+// CHECK-NEXT: ret void
+//
+TwoFloats case3(int Val) {
+  TwoFloats TF3 = {Val, 2};
+  return TF3;
+}
+
+// Case 4: Initialization from a scalarized vector into a structure with element
+// conversions.
+// CHECK-LABEL: define void @_Z5case4Dv2_i(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
+// CHECK-NEXT: store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
+// CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[VECEXT]] to float
+// CHECK-NEXT: store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT: [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
+// CHECK-NEXT: [[CONV2:%.*]] = sitofp i32 [[VECEXT1]] to float
+// CHECK-NEXT: store float [[CONV2]], ptr [[Y]], align 4
+// CHECK-NEXT: ret void
+//
+TwoFloats case4(int2 TwoVals) {
+  TwoFloats TF4 = {TwoVals};
+  return TF4;
+}
+
+// Case 5: Initialization from a scalarized vector of matching type.
+// CHECK-LABEL: define void @_Z5case5Dv2_i(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
+// CHECK-NEXT: store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
+// CHECK-NEXT: store i32 [[VECEXT]], ptr [[Z]], align 4
+// CHECK-NEXT: [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
+// CHECK-NEXT: store i32 [[VECEXT1]], ptr [[W]], align 4
+// CHECK-NEXT: ret void
+//
+TwoInts case5(int2 TwoVals) {
+  TwoInts TI1 = {TwoVals};
+  return TI1;
+}
+
+// Case 6: Initialization from a scalarized structure of different type with
+// different element types.
+// CHECK-LABEL: define void @_Z5case69TwoFloats(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT: [[CONV:%.*]] = fptosi float [[TMP0]] to i32
+// CHECK-NEXT: store i32 [[CONV]], ptr [[Z]], align 4
+// CHECK-NEXT: [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load float, ptr [[Y]], align 4
+// CHECK-NEXT: [[CONV1:%.*]] = fptosi float [[TMP1]] to i32
+// CHECK-NEXT: store i32 [[CONV1]], ptr [[W]], align 4
+// CHECK-NEXT: ret void
+//
+TwoInts case6(TwoFloats TF4) {
+  TwoInts TI2 = {TF4};
+  return TI2;
+}
+
+// Case 7: Initialization of a complex structure, with bogus braces and element
+// conversions from a collection of scalar values, and structures.
+// CHECK-LABEL: define void @_Z5case77TwoIntsS_i9TwoFloatsS0_S0_S0_(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_DOGGO:%.*]]) align 16 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOINTS:%.*]]) align 4 [[TI1:%.*]], ptr noundef byval([[STRUCT_TWOINTS]]) align 4 [[TI2:%.*]], i32 noundef [[VAL:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF1:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF2:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF3:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[VAL_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[LEGSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Z]], align 4
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i32 0
+// CHECK-NEXT: [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[W]], align 4
+// CHECK-NEXT: [[VECINIT1:%.*]] = insertelement <4 x i32> [[VECINIT]], i32 [[TMP1]], i32 1
+// CHECK-NEXT: [[Z2:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 0
+// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[Z2]], align 4
+// CHECK-NEXT: [[VECINIT3:%.*]] = insertelement <4 x i32> [[VECINIT1]], i32 [[TMP2]], i32 2
+// CHECK-NEXT: [[W4:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 1
+// CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[W4]], align 4
+// CHECK-NEXT: [[VECINIT5:%.*]] = insertelement <4 x i32> [[VECINIT3]], i32 [[TMP3]], i32 3
+// CHECK-NEXT: store <4 x i32> [[VECINIT5]], ptr [[LEGSTATE]], align 16
+// CHECK-NEXT: [[TAILSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: store i32 [[TMP4]], ptr [[TAILSTATE]], align 16
+// CHECK-NEXT: [[HAIRCOUNT:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 2
+// CHECK-NEXT: [[TMP5:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP5]] to float
+// CHECK-NEXT: store float [[CONV]], ptr [[HAIRCOUNT]], align 4
+// CHECK-NEXT: [[EARDIRECTION:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 3
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF1]], i32 0, i32 0
+// CHECK-NEXT: [[TMP6:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT: ...
[truncated]
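For orientation, test cases 3, 4, and 6 above can be written as ordinary C++ with each element conversion spelled out. This is only a sketch of what the equivalent initializers might look like: `int2` is approximated by `std::array<int, 2>` (an assumption, since plain C++ has no HLSL vector types), and the explicit `static_cast`s stand in for the implicit conversions that the patch inserts into the AST.

```cpp
#include <array>
#include <cassert>

struct TwoFloats {
  float X, Y;
};

struct TwoInts {
  int Z, W;
};

// Stand-in for HLSL's int2; an assumption made for this sketch only.
using int2 = std::array<int, 2>;

// HLSL source: TwoFloats TF3 = {Val, 2};
// The int argument is converted to float (sitofp in the checked IR).
TwoFloats case3(int Val) {
  TwoFloats TF3 = {static_cast<float>(Val), 2.0f};
  return TF3;
}

// HLSL source: TwoFloats TF4 = {TwoVals};
// The vector is scalarized: one subscript and one conversion per element.
TwoFloats case4(int2 TwoVals) {
  TwoFloats TF4 = {static_cast<float>(TwoVals[0]),
                   static_cast<float>(TwoVals[1])};
  return TF4;
}

// HLSL source: TwoInts TI2 = {TF4};
// The struct is scalarized: one field access and one conversion per field
// (fptosi in the checked IR, which truncates toward zero).
TwoInts case6(TwoFloats TF4) {
  TwoInts TI2 = {static_cast<int>(TF4.X), static_cast<int>(TF4.Y)};
  return TI2;
}
```

The repeated loads of `TwoVals` and the per-field `getelementptr` sequences in the CHECK lines correspond to these element-wise accesses, which is the redundancy the PR description expects the IR optimizer to clean up.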
+// CHECK-NEXT: ret void
+//
+TwoFloats case2() {
+ TwoFloats TF2 = {1, 2};
+ return TF2;
+}
+
+// Case 3: Simple initialization with conversion of an argument.
+// CHECK-LABEL: define void @_Z5case3i(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], i32 noundef [[VAL:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[VAL_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP0]] to float
+// CHECK-NEXT: store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT: [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: store float 2.000000e+00, ptr [[Y]], align 4
+// CHECK-NEXT: ret void
+//
+TwoFloats case3(int Val) {
+ TwoFloats TF3 = {Val, 2};
+ return TF3;
+}
+
+// Case 4: Initialization from a scalarized vector into a structure with element
+// conversions.
+// CHECK-LABEL: define void @_Z5case4Dv2_i(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
+// CHECK-NEXT: store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
+// CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[VECEXT]] to float
+// CHECK-NEXT: store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT: [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
+// CHECK-NEXT: [[CONV2:%.*]] = sitofp i32 [[VECEXT1]] to float
+// CHECK-NEXT: store float [[CONV2]], ptr [[Y]], align 4
+// CHECK-NEXT: ret void
+//
+TwoFloats case4(int2 TwoVals) {
+ TwoFloats TF4 = {TwoVals};
+ return TF4;
+}
+
+// Case 5: Initialization from a scalarized vector of matching type.
+// CHECK-LABEL: define void @_Z5case5Dv2_i(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
+// CHECK-NEXT: store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
+// CHECK-NEXT: store i32 [[VECEXT]], ptr [[Z]], align 4
+// CHECK-NEXT: [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
+// CHECK-NEXT: [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
+// CHECK-NEXT: store i32 [[VECEXT1]], ptr [[W]], align 4
+// CHECK-NEXT: ret void
+//
+TwoInts case5(int2 TwoVals) {
+ TwoInts TI1 = {TwoVals};
+ return TI1;
+}
+
+// Case 6: Initialization from a scalarized structure of different type with
+// different element types.
+// CHECK-LABEL: define void @_Z5case69TwoFloats(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT: [[CONV:%.*]] = fptosi float [[TMP0]] to i32
+// CHECK-NEXT: store i32 [[CONV]], ptr [[Z]], align 4
+// CHECK-NEXT: [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load float, ptr [[Y]], align 4
+// CHECK-NEXT: [[CONV1:%.*]] = fptosi float [[TMP1]] to i32
+// CHECK-NEXT: store i32 [[CONV1]], ptr [[W]], align 4
+// CHECK-NEXT: ret void
+//
+TwoInts case6(TwoFloats TF4) {
+ TwoInts TI2 = {TF4};
+ return TI2;
+}
+
+// Case 7: Initialization of a complex structue, with bogus braces and element
+// conversions from a collection of scalar values, and structures.
+// CHECK-LABEL: define void @_Z5case77TwoIntsS_i9TwoFloatsS0_S0_S0_(
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_DOGGO:%.*]]) align 16 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOINTS:%.*]]) align 4 [[TI1:%.*]], ptr noundef byval([[STRUCT_TWOINTS]]) align 4 [[TI2:%.*]], i32 noundef [[VAL:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF1:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF2:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF3:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[VAL_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[LEGSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 0
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 0
+// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Z]], align 4
+// CHECK-NEXT: [[VECINIT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i32 0
+// CHECK-NEXT: [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[W]], align 4
+// CHECK-NEXT: [[VECINIT1:%.*]] = insertelement <4 x i32> [[VECINIT]], i32 [[TMP1]], i32 1
+// CHECK-NEXT: [[Z2:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 0
+// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[Z2]], align 4
+// CHECK-NEXT: [[VECINIT3:%.*]] = insertelement <4 x i32> [[VECINIT1]], i32 [[TMP2]], i32 2
+// CHECK-NEXT: [[W4:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 1
+// CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[W4]], align 4
+// CHECK-NEXT: [[VECINIT5:%.*]] = insertelement <4 x i32> [[VECINIT3]], i32 [[TMP3]], i32 3
+// CHECK-NEXT: store <4 x i32> [[VECINIT5]], ptr [[LEGSTATE]], align 16
+// CHECK-NEXT: [[TAILSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 1
+// CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: store i32 [[TMP4]], ptr [[TAILSTATE]], align 16
+// CHECK-NEXT: [[HAIRCOUNT:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 2
+// CHECK-NEXT: [[TMP5:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
+// CHECK-NEXT: [[CONV:%.*]] = sitofp i32 [[TMP5]] to float
+// CHECK-NEXT: store float [[CONV]], ptr [[HAIRCOUNT]], align 4
+// CHECK-NEXT: [[EARDIRECTION:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 3
+// CHECK-NEXT: [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF1]], i32 0, i32 0
+// CHECK-NEXT: [[TMP6:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT: ...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
A couple notes for myself:
clang/lib/Sema/SemaHLSL.cpp
Outdated
llvm::SmallVector<Expr *, 16> ArgExprs;
bool ExcessInits = false;
for (Expr *Arg : Init->inits())
  BuildIntializerList(SemaRef, Ctx, Arg, ArgExprs, DestTypes, ExcessInits);
It seems like this can fail if `S.CreateBuiltinArraySubscriptExpr` or `S.BuildFieldReferenceExpr` returns an invalid expr? Do we care if that is the case? Or if `CastInitializer` fails.
If any of those cases fail, they will generate errors, which will (eventually) stop processing. We can continue through all the arguments though and let the error be surfaced after the full initializer expression is processed.
note to self: https://hlsl.godbolt.org/z/1Tq59r89W
Few typos and questions, otherwise looks good!
Typos in title and description: intialization, implementaiton
34124c6
to
bbcf82b
Compare
clang/lib/Sema/SemaChecking.cpp
Outdated
// precision loss, because C++11 narrowing already handles it.
bool IsListInit = Item.IsListInit ||
                  (isa<InitListExpr>(OrigE) && S.getLangOpts().CPlusPlus);
// precision loss, because C++11 narrowing already handles it. HLSL's
The HLSL comment is completely different from the C++ narrowing comment. I feel like it should be on its own line so as not to communicate that they are related.
I did find a bug in this for handling data structures with resources... I'm working on it and will update the PR when I have a fix.

if (DestTypes.size() != ArgExprs.size() || ExcessInits) {
  int TooManyOrFew = ExcessInits ? 1 : 0;
  SemaRef.Diag(Init->getBeginLoc(), diag::err_hlsl_incorrect_num_initializers)
The `note_ovl_candidate_bad_list_argument` note does almost everything you want, except that it is not an error. You are returning false here anyway, so do we need a special diagnostic for this? If so, does it need to be HLSL-specific? It seems like it could be useful in other cases; maybe replace `err_excess_initializers` with one that can say "too few" or "too many"?
In basically all C-based languages "too few" initializers isn't an error; they're just assumed to default to 0 or default initialization. `note_ovl_candidate_bad_list_argument` is close, but it is a note, not an error. We could potentially share diagnostic strings, but I'm not sure how useful that is.
The main reason I didn't extend `err_excess_initializers` is because this is definitely an HLSL feature that is on the chopping block. It's hugely problematic and irregular, and it'll be easier to remove support for it someday in the future if it has a unique diagnostic.
That is not strictly a requirement, but boy do I want to burn this feature with maximum prejudice... if only it were not relied on so much.
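For readers following the diagnostic discussion, the count check can be modeled outside of Clang. This is only a sketch under stated assumptions, not the code in the PR: `CountResult`, `checkInitializerCount`, and `describe` are hypothetical names, plain counts stand in for the flattened destination types and argument expressions, and the message wording is made up. It illustrates why a two-way diagnostic is wanted here: after flattening, a count mismatch in either direction is rejected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; Clang emits a diagnostic instead.
enum class CountResult { Ok, TooFew, TooMany };

// Models the condition `DestTypes.size() != ArgExprs.size() || ExcessInits`:
// after flattening, every destination scalar needs exactly one initializer.
CountResult checkInitializerCount(std::size_t NumDestScalars,
                                  std::size_t NumArgScalars) {
  if (NumArgScalars > NumDestScalars)
    return CountResult::TooMany;
  if (NumArgScalars < NumDestScalars)
    return CountResult::TooFew;
  return CountResult::Ok;
}

// Builds a message in the spirit of a "too few / too many" diagnostic.
// The exact wording is an assumption made for illustration.
std::string describe(CountResult R, std::size_t Expected, std::size_t Got) {
  if (R == CountResult::Ok)
    return "ok";
  return std::string(R == CountResult::TooMany ? "too many" : "too few") +
         " initializers: expected " + std::to_string(Expected) + ", found " +
         std::to_string(Got);
}
```

In this model a `TwoFloats` destination has two scalars, so one argument scalar yields `TooFew` and three yield `TooMany`; a C-style rule would accept the first case and default the remaining member.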
LGTM!
This is interesting.
I wonder though what it looks like in the AST when initializing something with the result of a function call that returns an aggregate that must be split up across different boundaries? Or what about expressions with side-effects? How does it not duplicate sub-expressions while also not creating temporary variables to capture results?
Shouldn't there be an AST test so we can verify the way it transforms the code?
struct Foo {
  float f1;
  float f2;
};
Foo makeFoo() {
  Foo foo = {1, 2};
  return foo;
}
struct TwoFoo {
  Foo foo1, foo2;
};
TwoFoo testSplitFoo() {
  TwoFoo twoFoo = {0, makeFoo(), 3};
  return twoFoo;
}
The way I thought of to comprehensively solve this in a clear general way was to:
Or perhaps that can be wrapped in a helper function instead.
I think the simpler approach is to wrap any expression that can possibly side-effect in an
This all looks reasonable to me. A couple of nitpicks and a question inline.
clang/lib/Sema/SemaHLSL.cpp
Outdated
if (Ty->isScalarType() || (Ty->isRecordType() && !Ty->isAggregateType()))
  return CastInitializer(S, Ctx, E, List, DestTypes);

if (auto *ATy = Ty->getAs<VectorType>()) {
I find these abbreviations a bit confusing. `ATy` is a vector type? `VTy` is an array type? Might be best to use longer names here.
if (getLangOpts().HLSL)
  data().Aggregate = data().UserDeclaredSpecialMembers == 0;
Does this comment match what this is doing? If I'm reading this correctly we're using the fact that there are no user declared constructors/destructors/assignment operators to decide that this is an aggregate - I guess that's because users aren't allowed to write those in HLSL so we can use it as a proxy?
Yea... I'll update the comment too. We can't strictly use `implicit` because some of the implicit types do behave like aggregates.
This PR implements HLSL's initialization list behavior as specified in the draft language specification under [*Decl.Init.Agg*](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Decl.Init.Agg). This behavior is a bit unusual for C/C++ because intermediate braces in initializer lists are ignored and a whole array of additional conversions occur, contrary to how initialization works in C. The implementation in this PR generates a valid C/C++ initialization list AST for the HLSL initializer so that there are no changes required to Clang's CodeGen to support this. This design will also allow us to use Clang's rewrite to convert HLSL initializers to valid C/C++ initializers that are equivalent. It does have the downside that it will often generate redundant accesses during codegen. The IR optimizer is extremely good at eliminating those, so this will have no impact on the final executable performance. There is some opportunity for optimizing the initializer list generation that we could consider in subsequent commits. One notable opportunity would be to identify aggregate objects that occur in the same place in both initializers and do not require conversion; those aggregates could be initialized as aggregates rather than fully scalarized. Closes llvm#56067
Doh! Co-authored-by: Finn Plummer <[email protected]>
I swear I can spell... Co-authored-by: Helena Kotas <[email protected]>
Co-authored-by: Finn Plummer <[email protected]>
More updates coming to handle additional PR review.
Also added tests for bitfield members to verify correct code generation for initializing bitfield members or initializing new objects from bitfields. ../clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
Co-authored-by: Helena Kotas <[email protected]>
This was a bit trickier than I expected, but I think I came up with a reasonably clever solution. In HLSL, user-defined data types have aggregate initialization, not constructors, _except_ that we do model constructors for some of the builtin types. This is actually useful! In my earlier change I updated DeclCXX so that HLSL non-implicit classes are always marked as Aggregates. This ends up being not quite right, because there are some implicit types that should be aggregates, and others that shouldn't. For example, the implicit cbuffer-layout types should be aggregates so that we can flatten them, but resources shouldn't be because we really don't want to flatten them. In the update, whether or not an HLSL type is an aggregate is keyed off having non-implicit "special" members (constructors & operators). This is more correct. Aggregate types get flattened out for casting and initialization, while non-Aggregate types (basically just resources) get left unflattened and we attempt copy-initialization on them. Next problem: I was getting some odd conflicting diagnostics when argument conversion or copy-initialization fails, so I refactored BuildInitializerList and CastInitializer to return success/failure so that we can propagate that up and fail if the argument-to-destination type conversion fails without then also complaining about the number of initializers.
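The aggregate rule described in this commit message can be illustrated with a small standalone model. This is a sketch under assumptions, not Clang code: `TypeNode`, `isAggregate`, and `flatten` are hypothetical names, and the single boolean stands in for the `UserDeclaredSpecialMembers` bookkeeping on the record. It shows aggregates being expanded into their scalars while a non-aggregate record (such as a resource) is kept as one unflattened slot.

```cpp
#include <string>
#include <vector>

// Hypothetical type model for this sketch; Clang uses QualType/CXXRecordDecl.
struct TypeNode {
  std::string Name;
  bool UserDeclaredSpecialMembers; // true for types with modeled constructors
  std::vector<TypeNode> Fields;    // empty for scalars
};

// Mirrors the rule from the commit message: a record is an aggregate when it
// has no non-implicit special members.
bool isAggregate(const TypeNode &T) {
  return !T.Fields.empty() && !T.UserDeclaredSpecialMembers;
}

// Expands aggregates recursively; scalars and non-aggregate records each
// contribute exactly one destination slot.
void flatten(const TypeNode &T, std::vector<std::string> &Out) {
  if (!isAggregate(T)) {
    Out.push_back(T.Name);
    return;
  }
  for (const TypeNode &F : T.Fields)
    flatten(F, Out);
}
```

With this model, a struct holding a two-float struct and a resource flattens to three slots: two `float` entries followed by the resource as a single entry, which is the slot on which copy-initialization would then be attempted.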
This slightly tweaks the AST formulation to wrap any potentially side-effecting initialization list expression in an OpaqueValueExpr so that during codegen we can ensure it only gets emitted once. I've made a slightly hacky but minimal change in CodeGen to then emit OpaqueValueExprs in InitLists during aggregate initialization emission. I weighed the tradeoff between an AST-level representation or this CodeGen change. If HLSL were going to retain this initialization list behavior long term, I'd probably add a new AST node to represent HLSL initialization lists and restructure how we generate the AST, but I think that would be a lot more code to maintain, and since the goal is to remove this quirk from the language I don't think it is the best solution. The change as written isolates most of the weirdness of HLSL in CGHLSLRuntime and is a massively smaller change than a new AST node for initialization lists. This will also ensure that Clang's AST-based analysis for initialization lists will continue to be accurate for HLSL without any additional updates required.
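The motivation for evaluating a side-effecting initializer only once can be shown with a standalone C++ model. This is a sketch of the problem, not of Clang's CodeGen: `flattenNaive` and `flattenOnce` are hypothetical names, and `std::function` stands in for an arbitrary source expression such as the `makeFoo()` call raised in review. The cached local plays a role loosely analogous to the OpaqueValueExpr wrapper.

```cpp
#include <functional>
#include <vector>

struct Foo {
  float f1, f2;
};

// Naive flattening: each scalar access re-evaluates the source expression,
// so any side effect in it happens once per extracted member.
std::vector<float> flattenNaive(const std::function<Foo()> &Source) {
  return {Source().f1, Source().f2};
}

// Evaluate-once flattening: the source is evaluated a single time and every
// member access reads from the cached result.
std::vector<float> flattenOnce(const std::function<Foo()> &Source) {
  const Foo Cached = Source();
  return {Cached.f1, Cached.f2};
}
```

If the source increments a counter on each call, the naive version increments it twice for a two-member struct, while the evaluate-once version increments it once and still produces the same scalars.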
Co-authored-by: Justin Bogner <[email protected]>
../clang/test/SemaHLSL/Language/ElementwiseCast-errors.hlsl
00a6adc
to
70bb063
Compare
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/6180 Here is the relevant piece of the build log for the reference
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/72/builds/8263 Here is the relevant piece of the build log for the reference
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/8747 Here is the relevant piece of the build log for the reference
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/145/builds/5160 Here is the relevant piece of the build log for the reference
This PR implements HLSL's initialization list behavior as specified in the draft language specification under [*Decl.Init.Agg*](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Decl.Init.Agg). This behavior is a bit unusual for C/C++ because intermediate braces in initializer lists are ignored and a whole array of additional conversions occur, contrary to how initialization works in C. The implementation in this PR generates a valid C/C++ initialization list AST for the HLSL initializer so that there are no changes required to Clang's CodeGen to support this. This design will also allow us to use Clang's rewrite to convert HLSL initializers to valid C/C++ initializers that are equivalent. It does have the downside that it will often generate redundant accesses during codegen. The IR optimizer is extremely good at eliminating those, so this will have no impact on the final executable performance. There is some opportunity for optimizing the initializer list generation that we could consider in subsequent commits. One notable opportunity would be to identify aggregate objects that occur in the same place in both initializers and do not require conversion; those aggregates could be initialized as aggregates rather than fully scalarized. Closes llvm#56067 --------- Co-authored-by: Finn Plummer <[email protected]> Co-authored-by: Helena Kotas <[email protected]> Co-authored-by: Justin Bogner <[email protected]>
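To make the "intermediate braces are ignored" rule from the description concrete, here is a minimal standalone C++ sketch. It is not the Clang implementation: `Init`, `scalar`, `list`, and `flattenInit` are hypothetical names invented for illustration, and doubles stand in for arbitrary scalar expressions. The idea is only that a nested initializer is walked depth-first and its scalars are collected in order, regardless of how the braces were nested.

```cpp
#include <utility>
#include <vector>

// Hypothetical model of an initializer: either a scalar or a braced list.
struct Init {
  double Value;               // meaningful only when IsList is false
  std::vector<Init> Elements; // meaningful only when IsList is true
  bool IsList;
};

Init scalar(double V) { return Init{V, {}, false}; }
Init list(std::vector<Init> Es) { return Init{0.0, std::move(Es), true}; }

// Depth-first walk that drops the brace structure and keeps scalar order.
void flattenInit(const Init &I, std::vector<double> &Out) {
  if (!I.IsList) {
    Out.push_back(I.Value);
    return;
  }
  for (const Init &E : I.Elements)
    flattenInit(E, Out);
}
```

Modeling the initializer from Case 1 of the test file, `{{{1.0, 2}}}`, as three nested `list` calls around two `scalar` values produces the same two-element sequence as the plain `{1, 2}` form, which is why both can map onto the two members of `TwoFloats`.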