[HLSL][DirectX] Use a padding type for HLSL buffers. #167404
Conversation
@llvm/pr-subscribers-backend-spir-v @llvm/pr-subscribers-llvm-ir

Author: Justin Bogner (bogner)

Changes

This change drops the use of the "Layout" type and instead uses explicit padding throughout the compiler to represent types in HLSL buffers. There are a few parts to this, though it's difficult to split them up as they're very interdependent:

1. Refactor HLSLBufferLayoutBuilder to allow us to calculate the padding of arbitrary types.
2. Teach Clang CodeGen to use HLSL specific paths for cbuffers when generating aggregate copies, array accesses, and structure accesses.
3. Simplify DXILCBufferAccesses such that it directly replaces accesses with dx.resource.getpointer rather than recalculating the layout.
4. Basic infrastructure for SPIR-V handling, but the implementation itself will need work in follow ups.

Fixes several issues, including #138996, #144573, and #156084.

Patch is 167.65 KiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/167404.diff

37 Files Affected:
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index a837f00732748..f2451b16e78be 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -4641,11 +4641,17 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E,
LHS.getBaseInfo(), TBAAAccessInfo());
}
- // The HLSL runtime handle the subscript expression on global resource arrays.
- if (getLangOpts().HLSL && (E->getType()->isHLSLResourceRecord() ||
- E->getType()->isHLSLResourceRecordArray())) {
- std::optional<LValue> LV =
- CGM.getHLSLRuntime().emitResourceArraySubscriptExpr(E, *this);
+ // The HLSL runtime handles subscript expressions on global resource arrays
+ // and objects with HLSL buffer layouts.
+ if (getLangOpts().HLSL) {
+ std::optional<LValue> LV;
+ if (E->getType()->isHLSLResourceRecord() ||
+ E->getType()->isHLSLResourceRecordArray()) {
+ LV = CGM.getHLSLRuntime().emitResourceArraySubscriptExpr(E, *this);
+ } else if (E->getType().getAddressSpace() == LangAS::hlsl_constant) {
+ LV = CGM.getHLSLRuntime().emitBufferArraySubscriptExpr(E, *this,
+ EmitIdxAfterBase);
+ }
if (LV.has_value())
return *LV;
}
@@ -5110,6 +5116,11 @@ LValue CodeGenFunction::EmitMemberExpr(const MemberExpr *E) {
EmitIgnoredExpr(E->getBase());
return EmitDeclRefLValue(DRE);
}
+ if (getLangOpts().HLSL &&
+ E->getType().getAddressSpace() == LangAS::hlsl_constant) {
+ // We have an HLSL buffer - emit using HLSL's layout rules.
+ return CGM.getHLSLRuntime().emitBufferMemberExpr(*this, E);
+ }
Expr *BaseExpr = E->getBase();
// Check whether the underlying base pointer is a constant null.
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index eee397f1f3d19..bd7d30b10d4d0 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -2279,6 +2279,10 @@ void CodeGenFunction::EmitAggregateCopy(LValue Dest, LValue Src, QualType Ty,
}
}
+ if (getLangOpts().HLSL && Ty.getAddressSpace() == LangAS::hlsl_constant)
+ if (CGM.getHLSLRuntime().emitBufferCopy(*this, DestPtr, SrcPtr, Ty))
+ return;
+
// Aggregate assignment turns into llvm.memcpy. This is almost valid per
// C99 6.5.16.1p3, which states "If the value being stored in an object is
// read from another object that overlaps in anyway the storage of the first
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 4bdba9b3da502..c42619541a87b 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -13,10 +13,11 @@
//===----------------------------------------------------------------------===//
#include "CGHLSLRuntime.h"
-#include "Address.h"
#include "CGDebugInfo.h"
+#include "CGRecordLayout.h"
#include "CodeGenFunction.h"
#include "CodeGenModule.h"
+#include "HLSLBufferLayoutBuilder.h"
#include "TargetInfo.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/Attrs.inc"
@@ -26,6 +27,7 @@
#include "clang/AST/Type.h"
#include "clang/Basic/TargetOptions.h"
#include "clang/Frontend/FrontendDiagnostic.h"
+#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Frontend/HLSL/RootSignatureMetadata.h"
@@ -265,9 +267,9 @@ CGHLSLRuntime::convertHLSLSpecificType(const Type *T,
assert(T->isHLSLSpecificType() && "Not an HLSL specific type!");
// Check if the target has a specific translation for this type first.
- if (llvm::Type *TargetTy =
+ if (llvm::Type *LayoutTy =
CGM.getTargetCodeGenInfo().getHLSLType(CGM, T, OffsetInfo))
- return TargetTy;
+ return LayoutTy;
llvm_unreachable("Generic handling of HLSL types is not supported.");
}
@@ -278,23 +280,18 @@ llvm::Triple::ArchType CGHLSLRuntime::getArch() {
// Emits constant global variables for buffer constants declarations
// and creates metadata linking the constant globals with the buffer global.
-void CGHLSLRuntime::emitBufferGlobalsAndMetadata(const HLSLBufferDecl *BufDecl,
- llvm::GlobalVariable *BufGV) {
+void CGHLSLRuntime::emitBufferGlobalsAndMetadata(
+ const HLSLBufferDecl *BufDecl, llvm::GlobalVariable *BufGV,
+ const CGHLSLOffsetInfo &OffsetInfo) {
LLVMContext &Ctx = CGM.getLLVMContext();
// get the layout struct from constant buffer target type
llvm::Type *BufType = BufGV->getValueType();
- llvm::Type *BufLayoutType =
- cast<llvm::TargetExtType>(BufType)->getTypeParameter(0);
llvm::StructType *LayoutStruct = cast<llvm::StructType>(
- cast<llvm::TargetExtType>(BufLayoutType)->getTypeParameter(0));
+ cast<llvm::TargetExtType>(BufType)->getTypeParameter(0));
- // Start metadata list associating the buffer global variable with its
- // constatns
- SmallVector<llvm::Metadata *> BufGlobals;
- BufGlobals.push_back(ValueAsMetadata::get(BufGV));
-
- const auto *ElemIt = LayoutStruct->element_begin();
+ SmallVector<std::pair<VarDecl *, uint32_t>> DeclsWithOffset;
+ size_t OffsetIdx = 0;
for (Decl *D : BufDecl->buffer_decls()) {
if (isa<CXXRecordDecl, EmptyDecl>(D))
// Nothing to do for this declaration.
@@ -326,14 +323,27 @@ void CGHLSLRuntime::emitBufferGlobalsAndMetadata(const HLSLBufferDecl *BufDecl,
continue;
}
+ DeclsWithOffset.emplace_back(VD, OffsetInfo[OffsetIdx++]);
+ }
+
+ llvm::stable_sort(DeclsWithOffset, [](const auto &LHS, const auto &RHS) {
+ return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
+ });
+
+ // Associate the buffer global variable with its constants
+ SmallVector<llvm::Metadata *> BufGlobals;
+ BufGlobals.reserve(DeclsWithOffset.size() + 1);
+ BufGlobals.push_back(ValueAsMetadata::get(BufGV));
+
+ auto ElemIt = LayoutStruct->element_begin();
+ for (auto &[VD, _] : DeclsWithOffset) {
+ if (CGM.getTargetCodeGenInfo().isHLSLPadding(*ElemIt))
+ ++ElemIt;
+
assert(ElemIt != LayoutStruct->element_end() &&
"number of elements in layout struct does not match");
llvm::Type *LayoutType = *ElemIt++;
- // FIXME: handle resources inside user defined structs
- // (llvm/wg-hlsl#175)
-
- // create global variable for the constant and to metadata list
GlobalVariable *ElemGV =
cast<GlobalVariable>(CGM.GetAddrOfGlobalVar(VD, LayoutType));
BufGlobals.push_back(ValueAsMetadata::get(ElemGV));
@@ -410,18 +420,17 @@ void CGHLSLRuntime::addBuffer(const HLSLBufferDecl *BufDecl) {
// create global variable for the constant buffer
CGHLSLOffsetInfo OffsetInfo = CGHLSLOffsetInfo::fromDecl(*BufDecl);
- llvm::TargetExtType *TargetTy = cast<llvm::TargetExtType>(
- convertHLSLSpecificType(ResHandleTy, OffsetInfo));
+ llvm::Type *LayoutTy = convertHLSLSpecificType(ResHandleTy, OffsetInfo);
llvm::GlobalVariable *BufGV = new GlobalVariable(
- TargetTy, /*isConstant*/ false,
- GlobalValue::LinkageTypes::ExternalLinkage, PoisonValue::get(TargetTy),
+ LayoutTy, /*isConstant*/ false,
+ GlobalValue::LinkageTypes::ExternalLinkage, PoisonValue::get(LayoutTy),
llvm::formatv("{0}{1}", BufDecl->getName(),
BufDecl->isCBuffer() ? ".cb" : ".tb"),
GlobalValue::NotThreadLocal);
CGM.getModule().insertGlobalVariable(BufGV);
// Add globals for constant buffer elements and create metadata nodes
- emitBufferGlobalsAndMetadata(BufDecl, BufGV);
+ emitBufferGlobalsAndMetadata(BufDecl, BufGV, OffsetInfo);
// Initialize cbuffer from binding (implicit or explicit)
initializeBufferFromBinding(BufDecl, BufGV);
@@ -440,7 +449,7 @@ void CGHLSLRuntime::addRootSignature(
SignatureDecl->getRootElements(), nullptr, M);
}
-llvm::TargetExtType *
+llvm::StructType *
CGHLSLRuntime::getHLSLBufferLayoutType(const RecordType *StructType) {
const auto Entry = LayoutTypes.find(StructType);
if (Entry != LayoutTypes.end())
@@ -449,7 +458,7 @@ CGHLSLRuntime::getHLSLBufferLayoutType(const RecordType *StructType) {
}
void CGHLSLRuntime::addHLSLBufferLayoutType(const RecordType *StructType,
- llvm::TargetExtType *LayoutTy) {
+ llvm::StructType *LayoutTy) {
assert(getHLSLBufferLayoutType(StructType) == nullptr &&
"layout type for this struct already exist");
LayoutTypes[StructType] = LayoutTy;
@@ -1101,3 +1110,235 @@ std::optional<LValue> CGHLSLRuntime::emitResourceArraySubscriptExpr(
}
return CGF.MakeAddrLValue(TmpVar, ResultTy, AlignmentSource::Decl);
}
+
+std::optional<LValue> CGHLSLRuntime::emitBufferArraySubscriptExpr(
+ const ArraySubscriptExpr *E, CodeGenFunction &CGF,
+ llvm::function_ref<llvm::Value *(bool Promote)> EmitIdxAfterBase) {
+ // Find the element type to index by first padding the element type per HLSL
+ // buffer rules, and then padding out to a 16-byte register boundary if
+ // necessary.
+ llvm::Type *LayoutTy =
+ HLSLBufferLayoutBuilder(CGF.CGM).layOutType(E->getType());
+ uint64_t LayoutSizeInBits =
+ CGM.getDataLayout().getTypeSizeInBits(LayoutTy).getFixedValue();
+ CharUnits ElementSize = CharUnits::fromQuantity(LayoutSizeInBits / 8);
+ CharUnits RowAlignedSize = ElementSize.alignTo(CharUnits::fromQuantity(16));
+ if (RowAlignedSize > ElementSize) {
+ llvm::Type *Padding = CGM.getTargetCodeGenInfo().getHLSLPadding(
+ CGM, RowAlignedSize - ElementSize);
+ assert(Padding && "No padding type for target?");
+ LayoutTy = llvm::StructType::get(CGF.getLLVMContext(), {LayoutTy, Padding},
+ /*isPacked=*/true);
+ }
+
+ // If the layout type doesn't introduce any padding, we don't need to do
+ // anything special.
+ llvm::Type *OrigTy = CGF.CGM.getTypes().ConvertTypeForMem(E->getType());
+ if (LayoutTy == OrigTy)
+ return std::nullopt;
+
+ LValueBaseInfo EltBaseInfo;
+ TBAAAccessInfo EltTBAAInfo;
+ Address Addr =
+ CGF.EmitPointerWithAlignment(E->getBase(), &EltBaseInfo, &EltTBAAInfo);
+ llvm::Value *Idx = EmitIdxAfterBase(/*Promote*/ true);
+
+ // Index into the object as-if we have an array of the padded element type,
+ // and then dereference the element itself to avoid reading padding that may
+ // be past the end of the in-memory object.
+ SmallVector<llvm::Value *, 2> Indices;
+ Indices.push_back(Idx);
+ Indices.push_back(llvm::ConstantInt::get(Indices[0]->getType(), 0));
+
+ llvm::Value *GEP = CGF.Builder.CreateGEP(LayoutTy, Addr.emitRawPointer(CGF),
+ Indices, "cbufferidx");
+ Addr = Address(GEP, Addr.getElementType(), RowAlignedSize, KnownNonNull);
+
+ return CGF.MakeAddrLValue(Addr, E->getType(), EltBaseInfo, EltTBAAInfo);
+}
+
+namespace {
+/// Utility for emitting copies following the HLSL buffer layout rules (ie,
+/// copying out of a cbuffer).
+class HLSLBufferCopyEmitter {
+ CodeGenFunction &CGF;
+ Address DestPtr;
+ Address SrcPtr;
+ llvm::Type *LayoutTy = nullptr;
+
+ SmallVector<llvm::Value *> CurStoreIndices;
+ SmallVector<llvm::Value *> CurLoadIndices;
+
+ void emitCopyAtIndices(llvm::Type *FieldTy, llvm::ConstantInt *StoreIndex,
+ llvm::ConstantInt *LoadIndex) {
+ CurStoreIndices.push_back(StoreIndex);
+ CurLoadIndices.push_back(LoadIndex);
+ auto RestoreIndices = llvm::make_scope_exit([&]() {
+ CurStoreIndices.pop_back();
+ CurLoadIndices.pop_back();
+ });
+
+ // First, see if this is some kind of aggregate and recurse.
+ if (processArray(FieldTy))
+ return;
+ if (processBufferLayoutArray(FieldTy))
+ return;
+ if (processStruct(FieldTy))
+ return;
+
+ // When we have a scalar or vector element we can emit the copy.
+ CharUnits Align = CharUnits::fromQuantity(
+ CGF.CGM.getDataLayout().getABITypeAlign(FieldTy));
+ Address SrcGEP = RawAddress(
+ CGF.Builder.CreateInBoundsGEP(LayoutTy, SrcPtr.getBasePointer(),
+ CurLoadIndices, "cbuf.src"),
+ FieldTy, Align, SrcPtr.isKnownNonNull());
+ Address DestGEP = CGF.Builder.CreateInBoundsGEP(
+ DestPtr, CurStoreIndices, FieldTy, Align, "cbuf.dest");
+ llvm::Value *Load = CGF.Builder.CreateLoad(SrcGEP, "cbuf.load");
+ CGF.Builder.CreateStore(Load, DestGEP);
+ }
+
+ bool processArray(llvm::Type *FieldTy) {
+ auto *AT = dyn_cast<llvm::ArrayType>(FieldTy);
+ if (!AT)
+ return false;
+
+ // If we have an llvm::ArrayType this is just a regular array with no top
+ // level padding, so all we need to do is copy each member.
+ for (unsigned I = 0, E = AT->getNumElements(); I < E; ++I)
+ emitCopyAtIndices(AT->getElementType(),
+ llvm::ConstantInt::get(CGF.SizeTy, I),
+ llvm::ConstantInt::get(CGF.SizeTy, I));
+ return true;
+ }
+
+ bool processBufferLayoutArray(llvm::Type *FieldTy) {
+ // A buffer layout array is a struct with two elements: the padded array,
+ // and the last element. That is, is should look something like this:
+ //
+ // { [%n x { %type, %padding }], %type }
+ //
+ auto *ST = dyn_cast<llvm::StructType>(FieldTy);
+ if (!ST || ST->getNumElements() != 2)
+ return false;
+
+ auto *PaddedEltsTy = dyn_cast<llvm::ArrayType>(ST->getElementType(0));
+ if (!PaddedEltsTy)
+ return false;
+
+ auto *PaddedTy = dyn_cast<llvm::StructType>(PaddedEltsTy->getElementType());
+ if (!PaddedTy || PaddedTy->getNumElements() != 2)
+ return false;
+
+ if (!CGF.CGM.getTargetCodeGenInfo().isHLSLPadding(
+ PaddedTy->getElementType(1)))
+ return false;
+
+ llvm::Type *ElementTy = ST->getElementType(1);
+ if (PaddedTy->getElementType(0) != ElementTy)
+ return false;
+
+ // All but the last of the logical array elements are in the padded array.
+ unsigned NumElts = PaddedEltsTy->getNumElements() + 1;
+
+ // Add an extra indirection to the load for the struct and walk the
+ // array prefix.
+ CurLoadIndices.push_back(llvm::ConstantInt::get(CGF.Int32Ty, 0));
+ for (unsigned I = 0; I < NumElts - 1; ++I) {
+ // We need to copy the element itself, without the padding.
+ CurLoadIndices.push_back(llvm::ConstantInt::get(CGF.SizeTy, I));
+ emitCopyAtIndices(ElementTy, llvm::ConstantInt::get(CGF.SizeTy, I),
+ llvm::ConstantInt::get(CGF.Int32Ty, 0));
+ CurLoadIndices.pop_back();
+ }
+ CurLoadIndices.pop_back();
+
+ // Now copy the last element.
+ emitCopyAtIndices(ElementTy,
+ llvm::ConstantInt::get(CGF.SizeTy, NumElts - 1),
+ llvm::ConstantInt::get(CGF.Int32Ty, 1));
+
+ return true;
+ }
+
+ bool processStruct(llvm::Type *FieldTy) {
+ auto *ST = dyn_cast<llvm::StructType>(FieldTy);
+ if (!ST)
+ return false;
+
+ // Copy the struct field by field, but skip any explicit padding.
+ unsigned Skipped = 0;
+ for (unsigned I = 0, E = ST->getNumElements(); I < E; ++I) {
+ llvm::Type *ElementTy = ST->getElementType(I);
+ if (CGF.CGM.getTargetCodeGenInfo().isHLSLPadding(ElementTy))
+ ++Skipped;
+ else
+ emitCopyAtIndices(ElementTy, llvm::ConstantInt::get(CGF.Int32Ty, I),
+ llvm::ConstantInt::get(CGF.Int32Ty, I + Skipped));
+ }
+ return true;
+ }
+
+public:
+ HLSLBufferCopyEmitter(CodeGenFunction &CGF, Address DestPtr, Address SrcPtr)
+ : CGF(CGF), DestPtr(DestPtr), SrcPtr(SrcPtr) {}
+
+ bool emitCopy(QualType CType) {
+ LayoutTy = HLSLBufferLayoutBuilder(CGF.CGM).layOutType(CType);
+
+ // TODO: We should be able to fall back to a regular memcpy if the layout
+ // type doesn't have any padding, but that runs into issues in the backend
+ // currently.
+ //
+ // See https://github.com/llvm/wg-hlsl/issues/351
+ emitCopyAtIndices(LayoutTy, llvm::ConstantInt::get(CGF.SizeTy, 0),
+ llvm::ConstantInt::get(CGF.SizeTy, 0));
+ return true;
+ }
+};
+} // namespace
+
+bool CGHLSLRuntime::emitBufferCopy(CodeGenFunction &CGF, Address DestPtr,
+ Address SrcPtr, QualType CType) {
+ return HLSLBufferCopyEmitter(CGF, DestPtr, SrcPtr).emitCopy(CType);
+}
+
+LValue CGHLSLRuntime::emitBufferMemberExpr(CodeGenFunction &CGF,
+ const MemberExpr *E) {
+ LValue Base =
+ CGF.EmitCheckedLValue(E->getBase(), CodeGenFunction::TCK_MemberAccess);
+ auto *Field = dyn_cast<FieldDecl>(E->getMemberDecl());
+ assert(Field && "Unexpected access into HLSL buffer");
+
+ // Get the field index for the struct.
+ const RecordDecl *Rec = Field->getParent();
+ unsigned FieldIdx =
+ CGM.getTypes().getCGRecordLayout(Rec).getLLVMFieldNo(Field);
+
+ // Work out the buffer layout type to index into.
+ QualType RecType = CGM.getContext().getCanonicalTagType(Rec);
+ assert(RecType->isStructureOrClassType() && "Invalid type in HLSL buffer");
+ // Since this is a member of the buffer and not the buffer itself, we
+ // shouldn't have any offsets we need to contend with.
+ CGHLSLOffsetInfo EmptyOffsets;
+ llvm::StructType *LayoutTy = HLSLBufferLayoutBuilder(CGM).layOutStruct(
+ RecType->getAsCanonical<RecordType>(), EmptyOffsets);
+
+ // Now index into the struct, making sure that the type we return is the
+ // buffer layout type rather than the original type in the AST.
+ QualType FieldType = Field->getType();
+ llvm::Type *FieldLLVMTy = CGM.getTypes().ConvertTypeForMem(FieldType);
+ CharUnits Align = CharUnits::fromQuantity(
+ CGF.CGM.getDataLayout().getABITypeAlign(FieldLLVMTy));
+ Address Addr(CGF.Builder.CreateStructGEP(LayoutTy, Base.getPointer(CGF),
+ FieldIdx, Field->getName()),
+ FieldLLVMTy, Align, KnownNonNull);
+
+ LValue LV = LValue::MakeAddr(Addr, FieldType, CGM.getContext(),
+ LValueBaseInfo(AlignmentSource::Type),
+ CGM.getTBAAAccessInfo(FieldType));
+ LV.getQuals().addCVRQualifiers(Base.getVRQualifiers());
+
+ return LV;
+}
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index 488a322ca7569..07d6ab87305bb 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -15,20 +15,19 @@
#ifndef LLVM_CLANG_LIB_CODEGEN_CGHLSLRUNTIME_H
#define LLVM_CLANG_LIB_CODEGEN_CGHLSLRUNTIME_H
-#include "llvm/ADT/DenseMap.h"
-#include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/Intrinsics.h"
-#include "llvm/IR/IntrinsicsDirectX.h"
-#include "llvm/IR/IntrinsicsSPIRV.h"
-
+#include "Address.h"
#include "clang/AST/Attr.h"
#include "clang/AST/Decl.h"
#include "clang/Basic/Builtins.h"
#include "clang/Basic/HLSLRuntime.h"
-
+#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Frontend/HLSL/HLSLResource.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/IntrinsicsDirectX.h"
+#include "llvm/IR/IntrinsicsSPIRV.h"
#include <optional>
#include <vector>
@@ -100,6 +99,11 @@ class CGHLSLOffsetInfo {
/// of the HLSL buffer after all of the elements with specified offset.
static CGHLSLOffsetInfo fromDecl(const HLSLBufferDecl &BufDecl);
+ /// Comparison function for offsets received from `operator[]` suitable for
+ /// use in a `stable_sort`. This will order implicit bindings after explicit
+ /// offsets.
+ static bool compareOffsets(uint32_t LHS, uint32_t RHS) { return LHS < RHS; }
+
/// Get the given offset, or `~0U` if there is no offset for the member.
uint32_t operator[](size_t I) const {
if (Offsets.empty())
@@ -214,19 +218,28 @@ class CGHLSLRuntime {
llvm::Instruction *getConvergenceToken(llvm::BasicBlock &BB);
- llvm::TargetExtType *
- getHLSLBufferLayoutType(const RecordType *LayoutStructTy);
+ llvm::StructType *getHLSLBufferLayoutType(const RecordType *LayoutStructTy);
void addHLSLBufferLayoutType(const RecordType *LayoutStructTy,
- llvm::TargetExtType *LayoutTy);
+ llvm::StructType *LayoutTy);
void emitInitListOpaqueValues(CodeGenFunction &CGF, InitListExpr *E);
...
[truncated]
This change drops the use of the "Layout" type and instead uses explicit padding throughout the compiler to represent types in HLSL buffers. There are a few parts to this, though it's difficult to split them up as they're very interdependent: 1. Refactor HLSLBufferLayoutBuilder to allow us to calculate the padding of arbitrary types. 2. Teach Clang CodeGen to use HLSL specific paths for cbuffers when generating aggregate copies, array accesses, and structure accesses. 3. Simplify DXILCBufferAccesses such that it directly replaces accesses with dx.resource.getpointer rather than recalculating the layout. 4. Basic infrastructure for SPIR-V handling, but the implementation itself will need work in follow ups. Fixes several issues, including llvm#138996, llvm#144573, and llvm#156084. Resolves llvm#147352.
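To make the representational change concrete, here is a rough sketch of how a simple cbuffer might be modeled before and after this change. The cbuffer, global names, sizes, and offsets below are hypothetical and only illustrate the dx.Layout and dx.Padding target types that appear in the diff and tests; they are not taken verbatim from the patch.

; Hypothetical HLSL input, for illustration only:
;   cbuffer CB { float3 a; float2 b; }

; Before: offsets were carried by a separate "Layout" target type
; (size/offsets shown here are illustrative):
;   %__cblayout_CB = type <{ <3 x float>, <2 x float> }>
;   @CB.cb = global target("dx.CBuffer",
;                          target("dx.Layout", %__cblayout_CB, 24, 0, 16)) poison

; After: the layout struct itself carries explicit padding elements, so the
; extra "Layout" wrapper is no longer needed. float2 b cannot straddle a
; 16-byte row, so 4 bytes of padding follow float3 a:
%__cblayout_CB = type <{ <3 x float>, target("dx.Padding", 4), <2 x float> }>
@CB.cb = global target("dx.CBuffer", %__cblayout_CB) poison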
Force-pushed from e974162 to c4a9c64.
use(s3.c2.a1);
// CHECK: load <2 x float>, ptr addrspace(2) getelementptr (<{ %A, target("dx.Padding", 8) }>, ptr addrspace(2) @s4, i32 2, i32 0), align 8
use(s4[2].a1);
// CHECK: load <3 x i16>, ptr addrspace(2) getelementptr inbounds nuw (%B, ptr addrspace(2) getelementptr (<{ %B, target("dx.Padding", 2) }>, ptr addrspace(2) getelementptr (<{ <{ [5 x <{ %B, target("dx.Padding", 2) }>], %B }>, target("dx.Padding", 2) }>, ptr addrspace(2) @s5, i32 3, i32 0), i32 5, i32 0), i32 0, i32 1), align 2
I'll have to see what the spir-v backend does with this. I was expecting the array sizes to be part of the GEPs, but they are not. A direct translation in SPIR-V would use the OpPtrAccessChain instructions, but they are not available in shaders.
I'll have to see how we can translate that. Also, I'm not sure how this will work for the structured GEP that @Keenuts is working on.
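For readers following the GEP shape discussed here, a self-contained sketch of the "index as-if an array of the padded element type" pattern, using a hypothetical cbuffer containing float arr[4] and invented names; only the shape matches what the patch emits, not the exact types from the test above.

; Each logical array element occupies a 16-byte row except the last one,
; which is stored separately without trailing padding.
%padded.f = type <{ float, target("dx.Padding", 12) }>
%arr.ty   = type <{ [3 x %padded.f], float }>
@CB.arr = external addrspace(2) global %arr.ty

define float @load_elt(i32 %i) {
  ; The array size never appears in the GEP: we advance %i rows of the padded
  ; element type and then select the float, so even the last element (which
  ; has no trailing padding) is reached without touching memory past the data.
  %p = getelementptr %padded.f, ptr addrspace(2) @CB.arr, i32 %i, i32 0
  %v = load float, ptr addrspace(2) %p, align 4
  ret float %v
}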
Great to see this taking shape! I just reviewed the Clang part so far and it is looking good - just a few comments.
/// Lays out a struct type following HLSL buffer rules and considering
/// PackOffsets, if provided. Previously created layout structs are cached by
/// CGHLSLRuntime.
///
/// The function iterates over all fields of the record type (including base
/// classes) and calls layoutField to converts each field to its corresponding
/// LLVM type and to calculate its HLSL constant buffer layout. Any embedded
/// structs (or arrays of structs) are converted to layout types as well.
///
/// When PackOffsets are specified the elements will be placed based on the
/// user-specified offsets. Not all elements must have a
/// packoffset/register(c#) annotation though. For those that don't, the
/// PackOffsets array will contain -1 value instead. These elements must be
/// placed at the end of the layout after all of the elements with specific
/// offset.
This comment needs to be updated. It talks about PackOffsets, but the code uses OffsetInfo, and the layoutField function does not exist.
-  if (llvm::Type *TargetTy =
+  if (llvm::Type *LayoutTy =
           CGM.getTargetCodeGenInfo().getHLSLType(CGM, T, OffsetInfo))
-    return TargetTy;
+    return LayoutTy;
I don't think these should be renamed. All resource types go through this.
// TODO: We should have target("dx.Padding", 2) padding after %B, but we don't correctly handle
// 2- and 3-element vectors inside structs yet because of DataLayout rules.
Do we have an issue for this? It might be worth adding a link to it here.
// INLINE-NEXT-SAME: @"llvm.dx.resource.handlefromimplicitbinding.tdx.CBuffer_tdx.Layout_s___cblayout_$Globalss_4_0tt"(i32 0, i32 0, i32 1, i32 0, i1 false)
// INLINE-NEXT: store target("dx.CBuffer", target("dx.Layout", %"__cblayout_$Globals", 4, 0)) %[[HANDLE]], ptr @"$Globals.cb", align 4
// INLINE-NEXT: %[[HANDLE:.*]] = call target("dx.CBuffer", %"__cblayout_$Globals")
// INLINE-NEXT-SAME: @"llvm.dx.resource.handlefromimplicitbinding.tdx.CBuffer_s___cblayout_$Globalsst"(i32 0, i32 0, i32 1, i32 0, i1 false, ptr @"$Globals.str")
Suggested change:
- // INLINE-NEXT-SAME: @"llvm.dx.resource.handlefromimplicitbinding.tdx.CBuffer_s___cblayout_$Globalsst"(i32 0, i32 0, i32 1, i32 0, i1 false, ptr @"$Globals.str")
+ // INLINE-NEXT-SAME: @"llvm.dx.resource.handlefromimplicitbinding.tdx.CBuffer_s___cblayout_$Globalsst"(i32 0, i32 0, i32 1, i32 0, ptr @"$Globals.str")
llvm::stable_sort(DeclsWithOffset, [](const auto &LHS, const auto &RHS) {
  return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
});
Since the vast majority of cbuffers do not use packoffset, let alone in a way that changes the order of the members, I am wondering if we could skip some of the work:
Suggested change:
- llvm::stable_sort(DeclsWithOffset, [](const auto &LHS, const auto &RHS) {
-   return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
- });
+ if (!OffsetInfo.empty())
+   llvm::stable_sort(DeclsWithOffset, [](const auto &LHS, const auto &RHS) {
+     return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
+   });
// be past the end of the in-memory object.
SmallVector<llvm::Value *, 2> Indices;
Indices.push_back(Idx);
Indices.push_back(llvm::ConstantInt::get(Indices[0]->getType(), 0));
Suggested change:
- Indices.push_back(llvm::ConstantInt::get(Indices[0]->getType(), 0));
+ Indices.push_back(llvm::ConstantInt::get(Idx->getType(), 0));
// Since this is a member of the buffer and not the buffer itself, we
// shouldn't have any offsets we need to contend with.
It took me a while to figure out what this means :)
Suggested change:
- // Since this is a member of the buffer and not the buffer itself, we
- // shouldn't have any offsets we need to contend with.
+ // Since this is a member of struct/class in the buffer and not the struct/class itself, we
+ // shouldn't have any offsets on the members we need to contend with.
llvm::stable_sort(FieldsWithOffset, [](const auto &LHS, const auto &RHS) {
  return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
});
Just as in one of my comments above, I wonder if we could save some of the work that is not necessary in the common case:
Suggested change:
- llvm::stable_sort(FieldsWithOffset, [](const auto &LHS, const auto &RHS) {
-   return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
- });
+ if (!OffsetInfo.empty())
+   llvm::stable_sort(FieldsWithOffset, [](const auto &LHS, const auto &RHS) {
+     return CGHLSLOffsetInfo::compareOffsets(LHS.second, RHS.second);
+   });
// TODO: Do we need to diagnose invalid packoffsets here, or will they have
// already been taken care of?
Overlapping packoffsets are already diagnosed here:
llvm-project/clang/lib/Sema/SemaHLSL.cpp, line 330 (at 182c415):
S.Diag(NextVar->getLocation(), diag::err_hlsl_packoffset_overlap)