Commit 26342e7

krzysz00 authored and github-actions[bot] committed
Automerge: [AMDGPU] Update buffer fat pointer docs for gfx1250, fix formatting (#167818)
2 parents 63091f5 + 0190951 commit 26342e7

1 file changed

+19
-12
lines changed

llvm/docs/AMDGPUUsage.rst

Lines changed: 19 additions & 12 deletions
@@ -1011,9 +1011,9 @@ supported for the ``amdgcn`` target.
 bounds checking may be disabled, buffer fat pointers may choose to enable
 it or not). The cache swizzle support introduced in gfx942 may be used.

-These pointers can be created by `addrspacecast` from a buffer resource
-(`ptr addrspace(8)`) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
-`ptr addrspace(7)` directly, which produces a buffer fat pointer with an initial
+These pointers can be created by ``addrspacecast`` from a buffer resource
+(``ptr addrspace(8)```) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
+``ptr addrspace(7)`` directly, which produces a buffer fat pointer with an initial
 offset of 0 and prevents the address space cast from being rewritten away.

 The ``align`` attribute on operations from buffer fat pointers is deemed to apply
@@ -1028,26 +1028,33 @@ supported for the ``amdgcn`` target.
 **Buffer Resource**
 The buffer resource pointer, in address space 8, is the newer form
 for representing buffer descriptors in AMDGPU IR, replacing their
-previous representation as `<4 x i32>`. It is a non-integral pointer
-that represents a 128-bit buffer descriptor resource (`V#`).
+previous representation as ``<4 x i32>``. It is a non-integral pointer
+that represents a 128-bit buffer descriptor resource (``V#``).

 Since, in general, a buffer resource supports complex addressing modes that cannot
 be easily represented in LLVM (such as implicit swizzled access to structured
-buffers), it is **illegal** to perform non-trivial address computations, such as
-``getelementptr`` operations, on buffer resources. They may be passed to
-AMDGPU buffer intrinsics, and they may be converted to and from ``i128``.
+buffers), performing address computations such as ``getelementptr`` is not
+recommended on ``ptr addrspace(8)``s (if such computations are performed, the
+offset must be wavefront-uniform.) Note that such a usage of GEP is currently
+**unimplemented** in the backend, as it would require a wrapping 48-bit
+addition. Buffer resources may be passed to AMDGPU buffer intrinsics, and they
+may be converted to and from ``i128``.

 Casting a buffer resource to a buffer fat pointer is permitted and adds an offset
 of 0.

 Buffer resources can be created from 64-bit pointers (which should be either
-generic or global) using the `llvm.amdgcn.make.buffer.rsrc` intrinsic, which
+generic or global) using the ``llvm.amdgcn.make.buffer.rsrc`` intrinsic, which
 takes the pointer, which becomes the base of the resource,
 the 16-bit stride (and swzizzle control) field stored in bits `63:48` of a `V#`,
 the 32-bit NumRecords/extent field (bits `95:64`), and the 32-bit flags field
 (bits `127:96`). The specific interpretation of these fields varies by the
 target architecture and is detailed in the ISA descriptions.

+On gfx1250, the base pointer is instead truncated to 57 bits and the NumRecords
+field is 45 bits, which necessitated a change to ``make.buffer.rsrcs``'s arguments
+in order to make that field an ``i64``.
+
 When buffer resources are passed to buffer intrinsics such as
 ``llvm.amdgcn.raw.ptr.buffer.load`` or
 ``llvm.amdgcn.struct.ptr.buffer.store``, the ``align`` attribute on the
@@ -1079,9 +1086,9 @@ supported for the ``amdgcn`` target.
 the stride is the size of a structured element, the "add tid" flag must be 0,
 and the swizzle enable bits must be off.

-These pointers can be created by `addrspacecast` from a buffer resource
-(`ptr addrspace(8)`) or by using `llvm.amdgcn.make.buffer.rsrc` to produce a
-`ptr addrspace(9)` directly, which produces a buffer strided pointer whose initial
+These pointers can be created by ``addrspacecast`` from a buffer resource
+(``ptr addrspace(8)``) or by using ``llvm.amdgcn.make.buffer.rsrc`` to produce a
+``ptr addrspace(9)``` directly, which produces a buffer strided pointer whose initial
 index and offset values are both 0. This prevents the address space cast from
 being rewritten away.

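
Illustrative note (not part of the commit): the field layout described in the **Buffer Resource** hunk can be sketched as plain bit packing. The snippet below is a minimal sketch of the pre-gfx1250 layout only, using the bit ranges quoted in the diff (stride in `63:48`, NumRecords in `95:64`, flags in `127:96`). Placing the base pointer in bits `47:0` and truncating it to 48 bits is an assumption inferred from the "wrapping 48-bit addition" and "instead truncated to 57 bits" wording. The function names `pack_buffer_rsrc` and `unpack_buffer_rsrc` are hypothetical helpers and are not LLVM APIs. The gfx1250 layout (57-bit base, 45-bit NumRecords) is deliberately not modeled because the diff does not give its bit positions.

```python
# Sketch of a 128-bit buffer descriptor (V#) as a Python integer.
# Assumption: the base pointer occupies bits 47:0 (pre-gfx1250 layout).
MASK48 = (1 << 48) - 1


def pack_buffer_rsrc(base: int, stride: int, num_records: int, flags: int) -> int:
    """Pack the four fields into one 128-bit integer."""
    if not 0 <= stride < (1 << 16):
        raise ValueError("stride must fit in 16 bits")
    if not 0 <= num_records < (1 << 32):
        raise ValueError("num_records must fit in 32 bits")
    if not 0 <= flags < (1 << 32):
        raise ValueError("flags must fit in 32 bits")
    # The 64-bit pointer is truncated to its low 48 bits (assumed layout).
    return (base & MASK48) | (stride << 48) | (num_records << 64) | (flags << 96)


def unpack_buffer_rsrc(desc: int) -> dict:
    """Split a 128-bit descriptor back into its fields."""
    return {
        "base": desc & MASK48,
        "stride": (desc >> 48) & 0xFFFF,
        "num_records": (desc >> 64) & 0xFFFFFFFF,
        "flags": (desc >> 96) & 0xFFFFFFFF,
    }


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise each field.
    desc = pack_buffer_rsrc(0x1234_5678_9ABC, stride=4, num_records=1024, flags=0)
    print(hex(desc))
    print(unpack_buffer_rsrc(desc))
```

The real meaning of the stride and flags bits varies by target and is defined in the ISA documents, so the sketch treats them as opaque integers.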