diff --git a/proposals/NNN-dynamic-buffer-objects.md b/proposals/NNN-dynamic-buffer-objects.md new file mode 100644 index 000000000..a5081c95b --- /dev/null +++ b/proposals/NNN-dynamic-buffer-objects.md @@ -0,0 +1,559 @@ +--- +title: "NNNN - Dynamic Buffer Objects" +params: + authors: + - mapodaca-nv: Mike Apodaca + sponsors: + - amarpMSFT: Amar Patel + - llvm-beanz: Chris Bieneman + status: Under Consideration +--- + +## Introduction + +Dynamic Buffer Objects introduces the ability to create buffer objects directly from GPUVAs. This feature extends the +flexibility of resource binding by allowing buffer objects to be created and managed in HLSL shader code, similar to the +approach used in Shader Model 6.6 for descriptors indexed from descriptor heaps. By enabling dynamic buffer object +creation, this feature provides enhanced flexibility and efficiency in resource management for advanced rendering and +compute workloads. + +## Motivation + +Currently, applications face significant limitations in resource management due to the complexity and constraints of +root signatures. Root signatures must be carefully designed to accommodate all potential resource binding scenarios, +leading to complex layouts that are difficult to manage and optimize. The existing API restricts the number of available +root view slots, which can be a significant limitation for complex shaders requiring numerous resources. When root view +slots are exhausted, applications must fall back to using root tables, which add additional overhead and complexity that +is unnecessary for many buffer addressing scenarios. + +This feature addresses these limitations by reducing dependency on root signatures for buffer object creation, similar +to how descriptor heap indexing (introduced in Shader Model 6.6) reduced the need for complex root signature layouts +when accessing descriptor heaps. 
By enabling dynamic buffer object creation from GPU virtual addresses, applications can +bypass the root signature bottleneck entirely for many use cases, creating buffer objects on demand within shaders +without requiring pre-planned root signature slots. + +Furthermore, machine learning developers have specifically reported that they cannot access buffers at +arbitrary offsets or cast data to arbitrary structures, capabilities that are readily available in other APIs such as +CUDA. The core limitation is the inability to dynamically create unordered access views for buffers starting at +arbitrary byte offsets with custom element strides, which is a fundamental requirement for many machine learning +algorithms that need to reinterpret buffer data in different layouts. This limitation forces developers to use +workarounds that are less efficient and more complex than necessary, hindering the adoption of DirectX for machine +learning workloads. + +## Proposed solution + +The proposed solution extends the HLSL language to support the creation of new buffer objects directly from +GPUVAs. This is accomplished by loading a `uint64_t` from a buffer and using a new `FromAddress` static function to +create the buffer object in HLSL shader code. + +### Example HLSL Compute Shader + +Below is an example of an HLSL compute shader that demonstrates the use of Dynamic Buffer Objects: + +```cpp +// Define a structured buffer containing GPUVAs +StructuredBuffer<uint64_t> MyAddressBuffer : register(t0); + +[numthreads(16, 16, 1)] +void main(uint3 DTid : SV_DispatchThreadID) +{ + // Determine which buffers to dynamically load from + // ... 
+ + // Load the GPUVAs from a structured buffer (note: different VAs per thread) + uint64_t startAddress = MyAddressBuffer[NonUniformResourceIndex(resourceIdx)]; + + if (startAddress != 0) + { + // Increment start address past header + startAddress += sizeof(MyHeader); + + // Create (per-thread) raw buffer objects using the calculated GPUVAs + ByteAddressBuffer MyLocalBuffer = ByteAddressBuffer::FromAddress(startAddress, 4); + + // Load two 32-bit values from the raw buffer object + uint2 data = MyLocalBuffer.Load2(NonUniformResourceIndex(dataIdx)); + + // Perform operations with the loaded data + // ... + } +} +``` + +## Detailed design + +### HLSL Additions + +#### New Static Methods: `FromAddress` + +The `FromAddress` static methods create buffer objects directly from GPU virtual addresses within shaders. These addresses +are initially obtained from the `ID3D12Resource::GetGPUVirtualAddress` method and may be modified by +shader code before being used. + +**Syntax:** + +```cpp +// ByteAddressBuffer creation +ByteAddressBuffer ByteAddressBuffer::FromAddress(uint64_t address, uint32_t alignment); +RWByteAddressBuffer RWByteAddressBuffer::FromAddress(uint64_t address, uint32_t alignment); + +// StructuredBuffer creation +StructuredBuffer<T> StructuredBuffer<T>::FromAddress(uint64_t address, uint32_t alignment); +RWStructuredBuffer<T> RWStructuredBuffer<T>::FromAddress(uint64_t address, uint32_t alignment); + +// ConstantBuffer creation +ConstantBuffer<T> ConstantBuffer<T>::FromAddress(uint64_t address); // 16-byte aligned +``` + +**Description:** + +* Inputs: + * `uint64_t address`: a 64-bit value representing the buffer's starting GPU virtual address. + * `uint32_t alignment`: the minimum alignment of the buffer's starting GPU virtual address. +* Returns: a buffer object created using the specified address and alignment. + +**Programming Rules:** + +* `address` must be a _valid_ GPUVA. 
If an invalid GPUVA is provided, then the behavior is undefined. +* `address` is allowed to be either uniform or non-uniform. No explicit declarations are required. +* For `ByteAddressBuffer` and `StructuredBuffer` types: + * `alignment` must be a compile-time literal. + * `alignment` must be a power of 2, ">= 4", and "<= 4096". Invalid alignment values will cause compilation errors. + * If the actual alignment of `address` does not match the specified `alignment` value, then the behavior is undefined. +* For `ConstantBuffer` types: + * No `alignment` parameter is required; constant buffers use a fixed 16-byte alignment. + * The `address` must be 16-byte aligned. If not 16-byte aligned, then the behavior is undefined. + +> **Author's note**: The "power of two" requirement comes from the existing DXIL bitfield definition. The minimum +> alignment of `4` maintains the existing requirements for root view GPUVAs. The maximum alignment of `4096` seems +> sufficient but can be as high as `32K` if desired. + +**Additional Notes:** + +* Existing "no bounds checking" rules for root views apply to dynamically created buffer objects. +* Just as "Typed" Buffers cannot be bound as root views, they likewise cannot be created using `FromAddress` methods. +* ConstantBuffer objects created via `FromAddress` follow the same access rules as traditionally bound constant buffers. + +--- + +### Interchange Format Additions + +> **Author's note**: While this proposal is strictly for adding byte address buffer object creation, this feature might +> be extended to other resource objects in the future. Therefore, the DXIL/SPIR-V details will be defined with this +> extensibility in mind. 
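
The alignment rules above, and the "existing DXIL bitfield definition" mentioned in the author's note, can be illustrated with a small host-side C++ sketch. It is not part of the proposal: the helper names (`isValidFromAddressAlignment`, `log2OfPowerOfTwo`, `packBasicResourceProperties`) are hypothetical, and the bit layout used here (ResourceKind in bits 0-7 and BaseAlignLog2 in bits 8-11 of the first `dx.types.ResourceProperties` dword) is an assumption about the existing DXIL encoding.

```cpp
#include <cstdint>

// Hypothetical helper: mirrors the compile-time rule for the `alignment`
// argument of FromAddress (power of 2, >= 4, and <= 4096).
inline bool isValidFromAddressAlignment(uint32_t alignment) {
    const bool isPowerOfTwo = alignment != 0 && (alignment & (alignment - 1)) == 0;
    return isPowerOfTwo && alignment >= 4 && alignment <= 4096;
}

// Hypothetical helper: integer log2 for a value already known to be a
// power of two (e.g. 32 -> 5).
inline uint32_t log2OfPowerOfTwo(uint32_t value) {
    uint32_t result = 0;
    while (value > 1) {
        value >>= 1;
        ++result;
    }
    return result;
}

// Hypothetical helper: packs the first ResourceProperties dword.
// ASSUMED layout: bits 0-7 = ResourceKind, bits 8-11 = BaseAlignLog2.
inline uint32_t packBasicResourceProperties(uint32_t resourceKind, uint32_t alignment) {
    const uint32_t baseAlignLog2 = log2OfPowerOfTwo(alignment) & 0xFu;
    return (resourceKind & 0xFFu) | (baseAlignLog2 << 8);
}
```

Under this assumed layout, a `ByteAddressBuffer` (kind 11) with `alignment = 32` packs to `11 | (5 << 8) = 1291`. A 4-bit log2 field tops out at `2^15 = 32K`, which matches the upper bound mentioned in the author's note.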
+ +#### DXIL Changes + +The `FromAddress` static methods require the following DXIL additions: + +##### New Opcode Definition + +```llvm +; Opcode for FromAddress methods +; Opcode value: 123 (TBD - needs to be assigned from available opcode range) +@dx.op.createBufferFromAddress(i32 123, i64 %address, i32 %alignment, i32 %bufferType) +``` + +##### Function Declaration + +```llvm +; Function Attrs: nounwind +declare %dx.types.Handle @dx.op.createBufferFromAddress( + i32, ; opcode (123) + i64, ; address (GPUVA) + i32, ; alignment + i32 ; buffer type (11=ByteAddressBuffer, 12=StructuredBuffer, 2=ConstantBuffer) +) +``` + +##### Usage Examples + +```llvm +; Example of FromAddress usage in DXIL for ByteAddressBuffer +; %address assumed to be defined earlier (e.g., from structured buffer load) +%bufferHandle = call %dx.types.Handle @dx.op.createBufferFromAddress( + i32 123, ; opcode + i64 %address, ; GPUVA + i32 32, ; alignment (compile-time constant) + i32 11 ; buffer type (ByteAddressBuffer) +) + +; Annotate the handle for subsequent operations +%annotatedHandle = call %dx.types.Handle @dx.op.annotateHandle( + i32 216, ; opcode for annotateHandle + %dx.types.Handle %bufferHandle, + %dx.types.ResourceProperties { + i32 1291, ; ResourceKind = ByteAddressBuffer(11) | BaseAlignLog2 = 5 << 8 (alignment = 32) + i32 0 ; n/a + } +) + +; Use the buffer for raw buffer operations +%data = call %dx.types.ResRet.i32 @dx.op.rawBufferLoad.i32( + i32 139, ; opcode for rawBufferLoad + %dx.types.Handle %annotatedHandle, + i32 %offset, ; byte offset + i32 undef, ; element offset (unused) + i8 7, ; mask + i32 4 ; alignment +) +``` + +```llvm +; Example of FromAddress usage in DXIL for ConstantBuffer +%cbufferHandle = call %dx.types.Handle @dx.op.createBufferFromAddress( + i32 123, ; opcode + i64 %cbAddress, ; GPUVA (must be 16-byte aligned) + i32 16, ; alignment (fixed for ConstantBuffer, could be ignored) + i32 2 ; buffer type (ConstantBuffer) +) +``` + +##### Validation Requirements + +* The alignment 
parameter must be a compile-time literal +* For ByteAddressBuffer and StructuredBuffer types, the alignment must be a power of 2, ">= 4", and "<= 4096" +* For ConstantBuffer types, the address must be 16-byte aligned +* The address parameter must be a 64-bit integer value +* The bufferType parameter must be a valid buffer type identifier (11 for ByteAddressBuffer, 12 for StructuredBuffer, + 2 for ConstantBuffer) + +#### Metadata Changes + +The `FromAddress` feature requires minimal metadata changes: + +##### Shader Flag Addition + +A new shader flag bit will be added to indicate Dynamic Buffer Objects usage: + +```cpp +// New shader flag for resource object creation from address usage +#define D3D_SHADER_FLAG_USES_RESOURCE_FROM_ADDRESS 0x00080000 +``` + +This flag is set by the compiler when the shader uses the `FromAddress` static methods and can be checked by the runtime +for validation and optimization purposes. + +#### SPIRV Support + +SPIRV support for `FromAddress` methods requires the following additions: + +##### New SPIRV Extension + +```spirv +; Extension declaration +OpExtension "SPV_KHR_resource_from_address" +``` + +##### New SPIRV Capability + +```spirv +; Capability declaration +OpCapability ResourceFromAddress +``` + +##### New SPIRV Instruction + +```spirv +; CreateBufferFromAddress instruction +%result = OpCreateBufferFromAddress %resultType %address %alignment %bufferType +``` + +##### SPIRV Instruction Definition + +* **Opcode**: New opcode for CreateBufferFromAddress (TBD) +* **Operands**: + * `%resultType`: The type of the resulting buffer handle + * `%address`: 64-bit integer containing the GPUVA + * `%alignment`: 32-bit integer containing the alignment requirement + * `%bufferType`: 32-bit integer indicating buffer type (11=ByteAddressBuffer, 12=StructuredBuffer, 2=ConstantBuffer) +* **Result**: Buffer handle that can be used with existing SPIRV buffer operations + +##### SPIRV Translation from DXIL + +```spirv +; Translation of DXIL 
CreateBufferFromAddress to SPIRV +; DXIL: %bufferHandle = call %dx.types.Handle @dx.op.createBufferFromAddress(i32 123, i64 %address, i32 %alignment, i32 %bufferType) +; SPIRV equivalent: +%bufferHandle = OpCreateBufferFromAddress %HandleType %address %alignment %bufferType +``` + +##### SPIRV Validation Rules + +* The alignment operand must be a constant instruction +* For ByteAddressBuffer and StructuredBuffer types, the alignment value must be a power of 2, ">= 4", and "<= 4096" +* For ConstantBuffer types, the address must be 16-byte aligned +* The address operand must be a 64-bit integer type +* The bufferType operand must be a valid buffer type constant +* The result type must be a buffer handle type + +##### SPIRV Backend Requirements + +* Implement translation from DXIL `@dx.op.createBufferFromAddress` to SPIRV `OpCreateBufferFromAddress` +* Preserve metadata information in SPIRV debug information +* Ensure proper type checking and validation during translation +* Support both uniform and non-uniform address operands + +##### SPIRV Runtime Support + +* Runtime must support the new SPIRV extension and capability +* Hardware drivers must implement the Resource From Address functionality +* Validation layers must check alignment and GPUVA validity +* Performance optimization for common FromAddress usage patterns + +--- + +### Diagnostic Changes + +#### Additional Errors and Warnings + +The `FromAddress` feature introduces the following new diagnostic messages: + +##### Compile-Time Errors + +* **E1234: Invalid alignment value for FromAddress** + * **Trigger**: When `alignment` parameter is not a power of 2 + * **Example**: `ByteAddressBuffer::FromAddress(address, 3)` // 3 is not a power of 2 + * **Message**: "Alignment parameter must be a power of 2, >= 4, and <= 4096" + +* **E1235: Alignment out of range for FromAddress** + * **Trigger**: When `alignment` parameter is less than 4 or greater than 4096 + * **Example**: `ByteAddressBuffer::FromAddress(address, 
2)` // 2 < 4 + * **Message**: "Alignment parameter must be >= 4 and <= 4096" + +* **E1236: Non-literal alignment for FromAddress** + * **Trigger**: When `alignment` parameter is not a compile-time literal + * **Example**: `ByteAddressBuffer::FromAddress(address, variableAlignment)` + * **Message**: "Alignment parameter must be a compile-time literal" + +* **E1237: Invalid buffer type for FromAddress** + * **Trigger**: When attempting to create typed buffers with FromAddress + * **Example**: `Buffer buffer = Buffer::FromAddress(address, 4)` // Not supported + * **Message**: "Typed buffers cannot be created using FromAddress. Use ByteAddressBuffer or StructuredBuffer instead" + +* **E1238: FromAddress requires Shader Model X.Y or higher** + * **Trigger**: When using FromAddress with an unsupported shader model + * **Example**: Using FromAddress in a Shader Model 6.0 shader + * **Message**: "FromAddress requires Shader Model X.Y or higher" + +##### Compile-Time Warnings + +* **W1235: Non-uniform FromAddress usage without NonUniformResourceIndex [TBD]** + * **Trigger**: When address parameter is non-uniform but not explicitly marked + * **Example**: Using a non-uniform variable directly as address parameter + * **Message**: "Consider using NonUniformResourceIndex if address parameter varies across threads" + +* **W1236: FromAddress used in potentially divergent control flow** + * **Trigger**: When FromAddress is called in conditional blocks that may diverge + * **Example**: `if (condition) { ByteAddressBuffer::FromAddress(address, 4); }` + * **Message**: "FromAddress in divergent control flow may impact performance" + +##### GBV Runtime Warnings + +* **W1237: Invalid GPUVA provided to FromAddress [GBV]** + * **Trigger**: When runtime validation detects an invalid GPUVA + * **Example**: GPUVA points to unmapped memory or invalid resource + * **Message**: "Invalid GPUVA provided to FromAddress - undefined behavior may occur" + +* **W1238: GPUVA alignment mismatch in 
FromAddress [GBV]** + * **Trigger**: When GPUVA is not aligned to the specified alignment value + * **Example**: GPUVA is 0x1001 but alignment is 4 + * **Message**: "GPUVA alignment does not match specified alignment - undefined behavior may occur" + +* **W1239: ConstantBuffer address not 16-byte aligned [GBV]** + * **Trigger**: When ConstantBuffer GPUVA is not 16-byte aligned + * **Example**: ConstantBuffer GPUVA is 0x1008 (not 16-byte aligned) + * **Message**: "ConstantBuffer GPUVA must be 16-byte aligned - undefined behavior may occur" + +#### Existing Errors and Warnings Removed + +The `FromAddress` feature does not remove any existing diagnostic messages, but it may modify the context or +applicability of some existing warnings: + +##### Modified Existing Warnings + +* **W1001: Non-uniform resource indexing (Modified Context) [TBD]** + * **Previous**: Warned about non-uniform indexing into descriptor heaps + * **Modified**: Now also applies to non-uniform indexing into buffers containing GPUVAs + * **Example**: `MyAddressBuffer[index]` with a non-uniform `index` and no NonUniformResourceIndex + * **Updated Message**: "Non-uniform indexing detected. Consider using NonUniformResourceIndex for resource heap or + GPUVA buffer access" + +* **W1002: Potential bounds violation (Modified Context)** + * **Previous**: Warned about potential out-of-bounds access to resources + * **Modified**: Now also applies to potential out-of-bounds access to buffers created via FromAddress + * **Example**: Accessing beyond the bounds of a buffer created with FromAddress + * **Updated Message**: "Potential bounds violation detected. 
Ensure resource access is within valid range" + +##### Context Extensions + +* **E1001: Invalid resource binding (Extended Scope)** + * **Previous**: Applied only to traditional resource binding + * **Extended**: Now also applies to FromAddress buffer creation + * **Example**: Attempting to use FromAddress with unsupported buffer types + * **Extended Message**: "Invalid resource binding or FromAddress usage detected" + +* **W1003: Performance warning for divergent resource access (Extended Scope)** + * **Previous**: Warned about divergent access to traditional resources + * **Extended**: Now also applies to divergent FromAddress usage + * **Example**: FromAddress called with divergent parameters + * **Extended Message**: "Divergent resource access detected. Consider uniform resource access patterns for better + performance" + +#### Validation Changes + +##### Additional Validation Failures + +The `FromAddress` feature introduces the following new validation failures: + +* **V1234: Missing Resource From Address metadata** + * **Trigger**: When DXIL/SPIRV validation detects missing Resource From Address metadata + * **Example**: Shader uses FromAddress but lacks `D3D_SHADER_FLAG_USES_RESOURCE_FROM_ADDRESS` flag + * **Validation**: DXIL/SPIRV validation layer checks for required metadata presence + +* **V1235: Invalid Resource From Address metadata format** + * **Trigger**: When Resource From Address metadata has incorrect format or values + * **Example**: Shader flag is set but no FromAddress calls are present + * **Validation**: Metadata structure and field value validation + +* **V1236: FromAddress opcode not supported** + * **Trigger**: When runtime encounters FromAddress opcode without support + * **Example**: Using FromAddress on hardware that doesn't support the feature + * **Validation**: Hardware capability checking during shader execution + +* **V1237: Invalid FromAddress alignment validation** + * **Trigger**: When alignment parameter violates compile-time 
or runtime constraints + * **Example**: Alignment value not a power of 2, less than 4, or greater than 4096 for + ByteAddressBuffer/StructuredBuffer; or ConstantBuffer address not 16-byte aligned + * **Validation**: Compile-time literal validation and runtime alignment checking + +* **V1238: FromAddress GPUVA validation failure** + * **Trigger**: When GPUVA is invalid or points to inaccessible memory + * **Example**: GPUVA is null, unmapped, or points to invalid resource + * **Validation**: Runtime GPUVA validity checking + +##### Existing Validation Failures Removed + +The `FromAddress` feature does not remove any existing validation failures, but it may modify the scope or context of +some existing validations: + +##### Modified Existing Validations + +* **V1001: Resource binding validation (Extended Scope)** + * **Previous**: Validated only traditional resource binding patterns + * **Modified**: Now also validates FromAddress buffer creation patterns + * **Example**: Validating that FromAddress creates valid buffer types + * **Extended Validation**: Resource type compatibility checking for FromAddress + +* **V1002: Resource access validation (Extended Scope)** + * **Previous**: Validated access to traditionally bound resources + * **Modified**: Now also validates access to resources created via FromAddress + * **Example**: Bounds checking for buffers created with FromAddress + * **Extended Validation**: Access pattern validation for FromAddress-created resources + +* **V1003: Uniformity validation (Extended Scope)** + * **Previous**: Validated uniformity of traditional resource access + * **Modified**: Now also validates uniformity of FromAddress usage + * **Example**: Checking for proper NonUniformResourceIndex usage with FromAddress + * **Extended Validation**: Uniformity analysis for FromAddress parameters and usage + +--- + +### Runtime Additions + +#### Runtime Information + +The compiler must provide the following information to the runtime for proper 
`FromAddress` support: + +##### Compiler Requirements + +* **Resource From Address Usage Flag**: The compiler must set the `D3D_SHADER_FLAG_USES_RESOURCE_FROM_ADDRESS` shader + flag to indicate that the shader uses the `FromAddress` static methods + * **Format**: Standard shader flag bit (0x00080000) + * **Runtime Usage**: Runtime uses this flag to determine if Resource From Address validation and processing is + required + +* **CreateBufferFromAddress Opcode Information**: The compiler must include the CreateBufferFromAddress opcode (123) in + the shader's opcode list + * **Format**: Standard DXIL opcode metadata + * **Runtime Usage**: Runtime uses this to identify and process CreateBufferFromAddress instructions during shader + execution + +##### Runtime Validation Information + +* **Alignment Validation**: Compiler performs compile-time validation of alignment parameters against the fixed + constraints (power of 2, >= 4, and <= 4096 for ByteAddressBuffer/StructuredBuffer; 16-byte alignment for + ConstantBuffer) + * **Runtime Usage**: Runtime can rely on compile-time validation and doesn't need to re-validate alignment constraints + +* **Resource Type Compatibility**: Compiler validates that only supported buffer types (ByteAddressBuffer, + StructuredBuffer, and ConstantBuffer) are created via FromAddress + * **Runtime Usage**: Runtime can assume all FromAddress calls create valid buffer types + +##### SPIRV Translation Information + +* **SPIRV Extension Requirements**: Compiler must declare the `SPV_KHR_resource_from_address` extension when translating + to SPIRV + * **Format**: `OpExtension "SPV_KHR_resource_from_address"` + * **Runtime Usage**: SPIRV runtime uses this to enable Resource From Address support + +* **SPIRV Capability Declaration**: Compiler must declare the `ResourceFromAddress` capability + * **Format**: `OpCapability ResourceFromAddress` + * **Runtime Usage**: SPIRV runtime uses this to verify hardware support + +#### Device Capability + +##### 
Shader Model Interaction + +* **Shader Model 6.8 Prerequisite**: The bulk of the Dynamic Buffer Objects feature requires Shader Model 6.8 or higher + * **Rationale**: Dynamic Buffer Objects builds upon existing root view infrastructure introduced in earlier shader + models + * **Dependency**: Requires the underlying root view system and GPUVA management capabilities + +* **Interaction with Shader Model 6.6**: Dynamic Buffer Objects complements the descriptor heap indexing features + introduced in Shader Model 6.6 + * **Synergy**: Both features provide dynamic resource access, but Dynamic Buffer Objects operates at a lower level + * **Coexistence**: Shaders can use both descriptor heap indexing and Dynamic Buffer Objects simultaneously + +* **Backward Compatibility**: Dynamic Buffer Objects does not interfere with existing root view functionality in older + shader models + * **Isolation**: Traditional root views continue to work as before + * **No Breaking Changes**: Existing shaders remain unaffected + +##### Emulation and Fallback Support + +* **No Emulation Below Shader Model X.Y**: Dynamic Buffer Objects cannot be emulated in older shader models + * **Rationale**: Lacks fundamental infrastructure for dynamic resource binding + * **Fallback**: Compiler must generate error messages for unsupported shader models + * **User Guidance**: Developers must target appropriate shader model or use alternative approaches + +##### Hardware Capability Requirements + +* **Memory Alignment Support**: Hardware must support the specified alignment requirements for buffer objects + * **Requirement**: Hardware must handle memory access with the requested alignment (power of 2, >= 4, and <= 4096) + * **Validation**: Runtime validates alignment support against hardware capabilities + +* **Dynamic Resource Binding**: Hardware must support dynamic creation of resource views + * **Requirement**: Hardware must be able to create buffer views from GPUVA at runtime + * **Validation**: 
Runtime tests dynamic resource binding capabilities during device creation + +## Testing + +Codegen correctness for both DXIL and SPIRV should be validated through DXC unit-level tests that verify proper +translation of `FromAddress` static methods to the corresponding `createBufferFromAddress` opcodes with correct metadata +generation. + +Diagnostic validation should include comprehensive testing of all alignment constraint violations, invalid buffer type +usage, and shader model compatibility checks through automated compiler test suites. + +An HLK test should verify that Dynamic Buffer Objects perform memory reads and writes to the correct GPU virtual +addresses across all supported buffer types, alignment values, and uniformity scenarios, ensuring hardware-level +functionality matches the specification requirements for all object creation patterns. + +## Acknowledgments (Optional) + +* Anupama Chandrasekhar (NVIDIA) +* Justin Holewinski (NVIDIA) +* Tex Riddell (Microsoft) +* Amar Patel (Microsoft)