
Commit d4e9982

[AMDGPU] Document meaning of alignment of buffer fat pointers, intrinsics (#167553)
This commit adds documentation clarifying the meaning of `align` on ptr addrspace(7) (buffer fat pointer) and ptr addrspace(9) (buffer strided pointer) operations (specifying that both the base and the offset need to be aligned) and documents the meaning of the `align` attribute when used on the pointer argument of `*.ptr.buffer.*` intrinsics.
1 parent c764ee6 commit d4e9982


1 file changed: +34 -0 lines changed


llvm/docs/AMDGPUUsage.rst

Lines changed: 34 additions & 0 deletions
@@ -1016,6 +1016,15 @@ supported for the ``amdgcn`` target.
 `ptr addrspace(7)` directly, which produces a buffer fat pointer with an initial
 offset of 0 and prevents the address space cast from being rewritten away.

+The ``align`` attribute on operations from buffer fat pointers is deemed to apply
+to all components of the pointer - that is, an ``align 4`` load is expected to
+both have the offset be a multiple of 4 and to have a base pointer with an
+alignment of 4.
+
+This componentwise definition of alignment is needed to allow for promotion of
+aligned loads to ``s_buffer_load``, which requires that both the base pointer and
+offset be appropriately aligned.
+
 **Buffer Resource**
 The buffer resource pointer, in address space 8, is the newer form
 for representing buffer descriptors in AMDGPU IR, replacing their
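
The componentwise rule for buffer fat pointers added in the hunk above can be modelled with a few lines of Python. This is only an illustrative sketch, not code from LLVM: the helper names `known_alignment` and `max_componentwise_align`, and the `MAX_ALIGN` cap, are assumptions made for this example. It computes the largest ``align`` value that could be claimed for an access, given the base address and the offset.

```python
# Illustrative model only; these helpers are hypothetical and not part of LLVM.
MAX_ALIGN = 1 << 32  # arbitrary cap for this sketch: 0 is a multiple of every power of two


def known_alignment(value: int) -> int:
    """Return the largest power of two (up to MAX_ALIGN) that divides value."""
    if value == 0:
        return MAX_ALIGN
    # value & -value isolates the lowest set bit of a positive integer.
    return min(value & -value, MAX_ALIGN)


def max_componentwise_align(base: int, offset: int) -> int:
    """Largest alignment satisfied by BOTH the base and the offset."""
    return min(known_alignment(base), known_alignment(offset))
```

Under this model, a base of `0x1000` with an offset of `8` supports at most ``align 8``, because the offset is the limiting component even though the base is far more strongly aligned.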
@@ -1039,6 +1048,25 @@ supported for the ``amdgcn`` target.
 (bits `127:96`). The specific interpretation of these fields varies by the
 target architecture and is detailed in the ISA descriptions.

+When buffer resources are passed to buffer intrinsics such as
+``llvm.amdgcn.raw.ptr.buffer.load`` or
+``llvm.amdgcn.struct.ptr.buffer.store``, the ``align`` attribute on the
+pointer is assumed to apply to both the offset and the base pointer value.
+That is, ``align 8`` means that both the base address within the ``ptr
+addrspace(8)`` and the ``offset`` argument have their three lowest bits set
+to 0. If the stride of the resource is nonzero, the stride must be a multiple
+of the given alignment.
+
+In other words, the ``align`` attribute specifies the alignment of the effective
+address being loaded from/stored to *and* acts as a guarantee that this is
+not achieved from adding lower-alignment parts (as hardware may not always
+allow for such an addition). For example, if a buffer resource has the base
+address ``0xfffe`` and is accessed with a ``raw.ptr.buffer.load`` with an offset
+of ``2``, the load must **not** be marked ``align 4`` (even though the
+effective address ``0x10000`` is so aligned) as this would permit the compiler
+to make incorrect transformations (such as promotion to ``s_buffer_load``,
+which requires such componentwise alignment).
+
 **Buffer Strided Pointer**
 The buffer index pointer is an experimental address space. It represents
 a 128-bit buffer descriptor and a 32-bit offset, like the **Buffer Fat
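
The `0xfffe` example from the hunk above can be checked numerically. The sketch below is a hedged illustration, not the compiler's actual verification logic: `align_attr_is_valid` is a hypothetical helper that encodes the three stated conditions (base aligned, offset aligned, and a nonzero stride being a multiple of the alignment).

```python
# Illustrative model only; align_attr_is_valid is a hypothetical helper, not an LLVM API.
def align_attr_is_valid(base: int, offset: int, align: int, stride: int = 0) -> bool:
    """Check the componentwise meaning of `align` described in the text."""
    if align <= 0 or align & (align - 1) != 0:
        raise ValueError("align must be a power of two")
    mask = align - 1
    if base & mask or offset & mask:
        return False  # a component has low bits set
    if stride != 0 and stride % align != 0:
        return False  # a nonzero stride must be a multiple of the alignment
    return True


base, offset = 0xFFFE, 2
effective = base + offset

print(hex(effective), effective % 4 == 0)    # 0x10000 True: the sum is 4-byte aligned
print(align_attr_is_valid(base, offset, 4))  # False: neither component is 4-byte aligned
print(align_attr_is_valid(base, offset, 2))  # True: align 2 is the most that may be claimed
```

The point of the model is that checking only `effective % align` is insufficient: the attribute is a statement about each component, not about their sum.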
@@ -1057,6 +1085,12 @@ supported for the ``amdgcn`` target.
 index and offset values are both 0. This prevents the address space cast from
 being rewritten away.

+As with buffer fat pointers, alignment of a buffer strided pointer applies to
+both the base pointer address and the offset. In addition, the alignment also
+constrains the stride of the pointer. That is, if you do an ``align 4`` load from
+a buffer strided pointer, this means that the base pointer is ``align(4)``, that
+the offset is a multiple of 4 bytes, and that the stride is a multiple of 4.
+
 **Streamout Registers**
 Dedicated registers used by the GS NGG Streamout Instructions. The register
 file is modelled as a memory in a distinct address space because it is indexed
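
To see why the stride is constrained for buffer strided pointers (the hunk above), it may help to model the accessed address. The sketch below assumes a simplified linear addressing model, `base + index * stride + offset`; real hardware addressing (for example swizzling) varies by target and is described in the ISA documents. The function names are hypothetical and chosen only for this illustration.

```python
# Illustrative model only. Assumes simplified linear addressing; the names
# element_address and all_indices_aligned are hypothetical, not LLVM APIs.
def element_address(base: int, index: int, stride: int, offset: int) -> int:
    """Address of a field at `offset` within record `index` (linear model)."""
    return base + index * stride + offset


def all_indices_aligned(base: int, stride: int, offset: int,
                        align: int, num_records: int) -> bool:
    """True if the access is `align`-aligned for every record index."""
    return all(
        element_address(base, i, stride, offset) % align == 0
        for i in range(num_records)
    )


# Base, offset and stride are all multiples of 4: every record is aligned.
print(all_indices_aligned(0x1000, 16, 4, 4, 8))  # True

# Stride 6 is not a multiple of 4: record 1 lands at 0x100a, which is misaligned.
print(all_indices_aligned(0x1000, 6, 4, 4, 8))   # False
```

Because the index is generally not known at compile time, requiring the stride itself to be a multiple of the alignment is what lets an ``align 4`` claim hold for every index under this model.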
