204 changes: 204 additions & 0 deletions proposals/0034-explicit-padding.md
@@ -0,0 +1,204 @@
---
title: "[0034] - Explicit Padding in Structs and CBuffer Arrays"
params:
status: Accepted
authors:
- bogner: Justin Bogner
---


## Introduction

We introduce an explicit padding type for HLSL, and construct cbuffers using
this type to unambiguously represent their layout. This will be used for layout
rules implicit to cbuffers (such as struct and array alignment and element
size) as well as for `packoffset` annotations in cbuffers and `vk::offset`
annotations outside of cbuffers.

## Motivation

HLSL has a few contexts where we have types with a layout that doesn't match
the usual rules that follow from C++ definitions and targets' data layouts. We
can generally describe the appropriate type representations for these using
explicit padding, but some care needs to be taken:

- Arrays in CBuffers may have padding in between members, but crucially they do
not have padding after the last member. This needs special handling to
represent.

- There are two HLSL constructs that may introduce arbitrary padding to a
struct. In a cbuffer, the `packoffset` attribute specifies the offset of a
member, and outside of cbuffers, the `vk::offset` Vulkan attribute may do the
same.

- Simply padding structures with `i8`, as is typical with ABI-related padding,
makes it difficult to recover which struct elements are padding and which are
subobjects. This matters in some backends, and is specifically important for
SPIR-V, where we need to map logical indices into the struct to physical
offsets.

## Proposed solution

Introduce an explicit padding type and use this type for the padding in the
various constructs that need it.

The padding type will be a target type with a parameter for the size in bytes.
For DirectX, this would look like `target("dx.Padding", 4)` for 4 bytes of
padding. For SPIR-V, `target("spirv.Padding", 8)` for 8 bytes of padding. We
may attempt to come up with a first-class type in LLVM for these purposes in
the future.

### Implicit layout rules in a cbuffer

CBuffers in HLSL have very specific layout rules. Scalars and vectors follow
the HLSL/DirectX alignment rules and are aligned as per their scalar size.
Structs and arrays are always aligned on a 16-byte boundary, regardless of
their contents. Furthermore, array elements are each aligned on a 16-byte
boundary.

Since we can't represent these rules by simply forcing alignments, we
instead use explicit padding between elements to ensure that all arrays and
structs start on a 16-byte boundary.

We then emulate the array padding by splitting an array of N elements in two:
an array of N - 1 structs, each containing the element type and padding to 16
bytes, followed by a single unpadded instance of the element type. This keeps
every element on a 16-byte boundary without adding padding after the last one.
See [CBuffer Padded arrays at the HLSL-level] for details.

[CBuffer Padded arrays at the HLSL-level]: #cbuffer-padded-arrays-at-the-hlsl-level
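
As a rough illustration of the array emulation described above, the following Python sketch builds the LLVM-style type string for a padded cbuffer array. The function name `padded_array_type` and its parameters are hypothetical and exist only for this example. Emitting a plain array when no padding is needed, or when there is only one element, is an assumption made here rather than something the proposal states.

```python
def padded_array_type(elem_ty: str, elem_size: int, count: int,
                      padding_name: str = "dx.Padding") -> str:
    """Return an LLVM-style type string for a cbuffer array of `count` elements.

    Hypothetical helper for illustration only; not part of clang or LLVM.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Bytes needed to bring one element up to the next 16-byte boundary.
    pad = -elem_size % 16
    if pad == 0 or count == 1:
        # Assumption: with nothing to pad between elements, use a plain array.
        return f"[{count} x {elem_ty}]"
    padded_elem = f'<{{ {elem_ty}, target("{padding_name}", {pad}) }}>'
    # N - 1 padded elements followed by one unpadded element.
    return f"<{{ [{count - 1} x {padded_elem}], {elem_ty} }}>"


print(padded_array_type("float", 4, 3))
print(padded_array_type("<4 x i32>", 16, 2))
```

Keeping the last element outside the inner array is what lets the overall type end without trailing padding.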

### Structs with annotations
Review comment (Member):

I would probably move this section after "### Arrays in a cbuffer" because this is way less common than arrays in cbuffer.


Generally speaking HLSL structs are equivalent to packed structs in C++. We can
simply add padding between members as appropriate in order to satisfy the rules
specified by `packoffset`, `vk::offset`, or otherwise by HLSL semantics.

There is one complicating factor here. Both `dxc` and `fxc` allow `packoffset`
to be used to lay out a struct in an order that does not match the declaration
order:

```hlsl
cbuffer cb0 {
  int x : packoffset(c0.y);
  float y : packoffset(c0.x);
}
```

To support this, we need to create the underlying LLVM type in the order that
matches the packoffsets rather than the order as written, so we would end up
with `{ float, i32 }` here, losing the declaration order. This is probably
okay since we need to create artificial types for cbuffers anyway (such as when
we filter out resource types that are declared within the cbuffer), but it may
not make for a particularly good debugging experience.

## Detailed design

### CBuffer representation at the LLVM level

CBuffers will continue to use a [__cblayout] type, but will no longer use a
`target("dx.Layout", ...)` type.

When using `packoffset`, we'll add explicit padding as necessary. Consider
`cb0`:

```hlsl
cbuffer cb0 : register(b0) {
  int x : packoffset(c0.y);
  float y : packoffset(c1.z);
}
```

```llvm
%__cblayout_cb0 = type <{
  target("dx.Padding", 4),
  i32,
  target("dx.Padding", 16),
  float
}>
```
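
The padding for `packoffset` can be derived mechanically. Here is a minimal Python sketch, assuming 32-bit components (16 bytes per `c` register, 4 bytes per component) and using the `target("dx.Padding", N)` spelling that the later examples use. Both function names are hypothetical and not part of any compiler API.

```python
def packoffset_to_bytes(register: int, component: str) -> int:
    """Convert cN.{x,y,z,w} to a byte offset, assuming 4-byte components."""
    return register * 16 + "xyzw".index(component) * 4


def layout_with_packoffset(members, padding_name="dx.Padding"):
    """Return the fields of the packed struct, in offset order.

    `members` is a list of (llvm_type, size_in_bytes, byte_offset) tuples in
    source order. Hypothetical helper for illustration only.
    """
    fields = []
    cursor = 0  # first byte not yet covered by a field or padding
    for ty, size, offset in sorted(members, key=lambda m: m[2]):
        if offset < cursor:
            raise ValueError(f"{ty} at offset {offset} overlaps the previous member")
        if offset > cursor:
            fields.append(f'target("{padding_name}", {offset - cursor})')
        fields.append(ty)
        cursor = offset + size
    return fields


cb0 = [
    ("i32", 4, packoffset_to_bytes(0, "y")),    # int x : packoffset(c0.y)
    ("float", 4, packoffset_to_bytes(1, "z")),  # float y : packoffset(c1.z)
]
print(layout_with_packoffset(cb0))
```

Sorting by offset before emitting fields is also what reorders members when `packoffset` places them out of source order.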

For structs, we add padding to align them as appropriate:

```hlsl
struct S {
  int v;
};
cbuffer cb1 : register(b0) {
  int i; // offset 0, size 4 (+12)
  S s;   // offset 16, size 4
  int j; // offset 20, size 4
}
```

```llvm
%__cblayout_cb1 = type <{
  i32, target("dx.Padding", 12),
  i32,
  i32
}>
```

For arrays, we'll have padding within elements to fill to a 16-byte boundary,
and padding before arrays in order for them to start at 16-byte boundaries.
Consider `cb2`:

```hlsl
cbuffer cb2 : register(b0) {
  float a1[3];        // offset 0, size 4 (+12) * 3
  double3 a2[2];      // offset 48, size 24 (+8) * 2
  uint4 a3[2];        // offset 112, size 16 * 2
  float16_t a4[2][2]; // offset 144, size 2 (+14) * 4
}
```

```llvm
%__cblayout_cb2 = type <{
  <{ [2 x <{ float, target("dx.Padding", 12) }>], float }>,
  target("dx.Padding", 12),
  <{ [1 x <{ <3 x double>, target("dx.Padding", 8) }>], <3 x double> }>,
  target("dx.Padding", 8),
  [ 2 x <4 x i32> ],
  <{ [3 x <{ half, target("dx.Padding", 14) }>], half }>
}>
```
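
The offsets in the `cb2` comments can be reproduced with a short calculation. The Python sketch below handles only the array rules stated in this proposal (arrays start on a 16-byte boundary, elements sit on 16-byte strides, no padding after the last element). The function names are hypothetical, and multi-dimensional arrays are assumed to be flattened to their total element count.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round `offset` up to the next multiple of `alignment`."""
    return (offset + alignment - 1) // alignment * alignment


def array_size(elem_size: int, count: int) -> int:
    """Size of a cbuffer array: padded stride for all but the last element."""
    stride = align_up(elem_size, 16)
    return stride * (count - 1) + elem_size


def array_offsets(arrays):
    """Map each array name to (offset, size).

    `arrays` is a list of (name, elem_size, total_element_count) tuples in
    declaration order. Hypothetical helper for illustration only.
    """
    result = {}
    cursor = 0
    for name, elem_size, count in arrays:
        offset = align_up(cursor, 16)  # every array starts on a 16-byte boundary
        size = array_size(elem_size, count)
        result[name] = (offset, size)
        cursor = offset + size
    return result


cb2 = [
    ("a1", 4, 3),   # float a1[3]
    ("a2", 24, 2),  # double3 a2[2]
    ("a3", 16, 2),  # uint4 a3[2]
    ("a4", 2, 4),   # float16_t a4[2][2], flattened to 4 elements
]
for name, (offset, size) in array_offsets(cb2).items():
    print(name, offset, size)
```

The gap between one array's end and the next array's start corresponds to the standalone padding fields in `%__cblayout_cb2`.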

[__cblayout]: https://github.com/llvm/wg-hlsl/blob/4570a9cfc5c4b1e5bc0b773a6fb7b22014ac6d3b/proposals/0016-constant-buffers.md#lowering-constant-buffer-resources-to-llvm-ir "Lowering Constant Buffer Resources to LLVM IR"

Review comment (Member, @hekota, Sep 27, 2025):

Add an example of a cbuffer with a struct with padding before it.

### CBuffer Padded arrays at the HLSL-level

Arrays in cbuffers need padding between elements if the element size is not a
multiple of 16 bytes. However, we can ignore this at the AST level as the
padding is invisible to all operations representable in HLSL. We already mark
objects in cbuffers via an address space, so nothing needs to change here.

In clang codegen we'll need to recognize that a type is in a cbuffer and
generate struct and array accesses appropriately. By keying off of the address
space, we can ensure that when we lower accesses to LLVM IR we are able to do
so using the padded type logic.

When an object is copied from an object with a cbuffer layout to one with a
standard layout, this goes through codegen logic to emit aggregate copies in
clang. Here we can recognize that the source of the copy is in the cbuffer
address space and break the copy up into elementwise pieces.
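
To make the elementwise copy concrete, here is a byte-level Python model of what such a copy does for an array of floats. It is only a sketch of the data movement, not clang code. The function name is hypothetical, and the 16-byte stride follows the cbuffer array rule described earlier in this proposal.

```python
import struct


def copy_float_array_from_cbuffer(cbuffer_bytes: bytes, offset: int, count: int) -> bytes:
    """Copy `count` floats stored on 16-byte strides into a tightly packed buffer.

    Hypothetical helper for illustration only.
    """
    out = bytearray()
    for i in range(count):
        start = offset + i * 16  # each cbuffer array element starts a new 16-byte row
        out += cbuffer_bytes[start:start + 4]
    return bytes(out)


# A cbuffer image holding float a1[3]: padding after all but the last element.
cbuffer_image = struct.pack("<f12xf12xf", 1.0, 2.0, 3.0)
packed = copy_float_array_from_cbuffer(cbuffer_image, 0, 3)
print(struct.unpack("<3f", packed))
```

A single bulk copy would drag the padding bytes along and misplace every element after the first, which is why the copy has to be split per element.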

## Alternatives considered

See [llvm-project/wg-hlsl#171] for the previous attempt at representing these
types.

Regarding the padding type, we considered the following options:

- A first-class LLVM type called `pad8`, which is equivalent to but distinct from
`i8`. This would need an RFC to the wider LLVM community and would need to be
useful in other contexts (such as ABI-mandated padding).
- A well-known named type `%pad8`, defined as a named struct containing a
single `i8`. This is the simplest option but requires backends that are
interested in this type to participate in a secret handshake.
- Target types such as `target("dx.pad8")` and `target("spirv.pad8")`. This is
somewhat awkward because the type isn't really tied to a target, but target
types need to be. Targets that don't need to differentiate between padding
and actual members could simply use `i8`.

We settled on a target type with a size parameter for now.

[llvm-project/wg-hlsl#171]: https://github.com/llvm/wg-hlsl/pull/171

Review comment (Collaborator):

I think the one other alternative to consider is a hybrid, where we create the layout types in the AST, but don't actually have the cbuffer members be of the layout types. That would avoid needing to have special casting behavior for cbuffer types. We could insert the "conversion" code late in CodeGen based on the address space of the pointer being loaded.

I'm not sure if this actually simplifies things or not.

DXC does a bunch of things in CodeGen that shouldn't be done there because it adds data type conversions that actually change values, but in this case these conversions aren't really "type" conversions as much as layout conversions, so I feel less icky about doing them in CodeGen and not fully representing them in the AST.

Curious for thoughts.

Review comment (Collaborator):

My uninformed thought is that it could work. It is worth checking out. Somewhere in clang, we have to handle conversions. I just don't know the best place.

Also note that the conversions will have to be done in such a way that they do not generate too much code and can be optimized away. See a recent issue we fixed for SPIR-V: microsoft/DirectXShaderCompiler#7493. Their code copies the entirety of a large cbuffer to return it by value. The expectation is that the optimizer is able to copy-propagate everything and only load the values that are actually used.

Review comment (Collaborator, Author):

We can handle the copies directly in clang codegen in EmitAggregateCopy - this already has some special handling for things like ObjC types, so it doesn't seem wrong to do. This is how things are currently working in llvm/llvm-project#156919 and I've updated the proposal.