@@ -1011,9 +1011,9 @@ supported for the ``amdgcn`` target.
10111011 bounds checking may be disabled, buffer fat pointers may choose to enable
10121012 it or not). The cache swizzle support introduced in gfx942 may be used.
10131013
1014- These pointers can be created by `addrspacecast` from a buffer resource
1015- (`ptr addrspace(8)`) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
1016- `ptr addrspace(7)` directly, which produces a buffer fat pointer with an initial
1014+ These pointers can be created by ``addrspacecast`` from a buffer resource
1015+ (``ptr addrspace(8)``) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
1016+ ``ptr addrspace(7)`` directly, which produces a buffer fat pointer with an initial
10171017 offset of 0 and prevents the address space cast from being rewritten away.
10181018
10191019 The ``align`` attribute on operations from buffer fat pointers is deemed to apply
@@ -1028,26 +1028,33 @@ supported for the ``amdgcn`` target.
10281028**Buffer Resource**
10291029 The buffer resource pointer, in address space 8, is the newer form
10301030 for representing buffer descriptors in AMDGPU IR, replacing their
1031- previous representation as `<4 x i32>`. It is a non-integral pointer
1032- that represents a 128-bit buffer descriptor resource (`V#`).
1031+ previous representation as ``<4 x i32>``. It is a non-integral pointer
1032+ that represents a 128-bit buffer descriptor resource (``V#``).
10331033
10341034 Since, in general, a buffer resource supports complex addressing modes that cannot
10351035 be easily represented in LLVM (such as implicit swizzled access to structured
1036- buffers), it is **illegal** to perform non-trivial address computations, such as
1037- ``getelementptr`` operations, on buffer resources. They may be passed to
1038- AMDGPU buffer intrinsics, and they may be converted to and from ``i128``.
1036+ buffers), performing address computations such as ``getelementptr`` on
1037+ ``ptr addrspace(8)`` values is not recommended. If such computations are
1038+ performed, the offset must be wavefront-uniform. Note that this use of GEP is
1039+ currently **unimplemented** in the backend, as it would require a wrapping
1040+ 48-bit addition. Buffer resources may be passed to AMDGPU buffer intrinsics,
1041+ and they may be converted to and from ``i128``.
10391042
10401043 Casting a buffer resource to a buffer fat pointer is permitted and adds an offset
10411044 of 0.
10421045
10431046 Buffer resources can be created from 64-bit pointers (which should be either
1044- generic or global) using the `llvm.amdgcn.make.buffer.rsrc` intrinsic, which
1047+ generic or global) using the ``llvm.amdgcn.make.buffer.rsrc`` intrinsic, which
10451048 takes the pointer, which becomes the base of the resource,
10461049 the 16-bit stride (and swizzle control) field stored in bits `63:48` of a `V#`,
10471050 the 32-bit NumRecords/extent field (bits `95:64`), and the 32-bit flags field
10481051 (bits `127:96`). The specific interpretation of these fields varies by the
10491052 target architecture and is detailed in the ISA descriptions.
10501053
1054+ On gfx1250, the base pointer is instead truncated to 57 bits and the NumRecords
1055+ field is 45 bits, which necessitated a change to ``make.buffer.rsrc``'s arguments
1056+ in order to make that field an ``i64``.
1057+
10511058 When buffer resources are passed to buffer intrinsics such as
10521059 ``llvm.amdgcn.raw.ptr.buffer.load`` or
10531060 ``llvm.amdgcn.struct.ptr.buffer.store``, the ``align`` attribute on the
@@ -1079,9 +1086,9 @@ supported for the ``amdgcn`` target.
10791086 the stride is the size of a structured element, the "add tid" flag must be 0,
10801087 and the swizzle enable bits must be off.
10811088
1082- These pointers can be created by `addrspacecast` from a buffer resource
1083- (`ptr addrspace(8)`) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
1084- `ptr addrspace(9)` directly, which produces a buffer strided pointer whose initial
1089+ These pointers can be created by ``addrspacecast`` from a buffer resource
1090+ (``ptr addrspace(8)``) or by using ``llvm.amdgcn.make.buffer.rsrc`` to produce a
1091+ ``ptr addrspace(9)`` directly, which produces a buffer strided pointer whose initial
10851092 index and offset values are both 0. This prevents the address space cast from
10861093 being rewritten away.
10871094