Commit c8157f0

Update the doc in the code.
1 parent 4259f63 commit c8157f0

1 file changed: +14 -1 lines changed

mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td

Lines changed: 14 additions & 1 deletion
@@ -921,7 +921,20 @@ def AMDGPU_TransposeLoadOp :
  let summary = "MLIR wrapper for CDNA Transpose Load instructions";
  let description = [{
  The `amdgpu.transpose_load` op is a wrapper around the `ds_read_tr` instructions.
-
+ The transpose load op represents a subgroup load from LDS memory,
+ where the subgroup of threads collectively reads a matrix from the source
+ memref, with each thread reading a vector of the matrix, and obtains the transposed matrix
+ as the result. That is, each thread reads a vector of the column-major matrix at different
+ indices, and the thread's read result is a vector of the corresponding row of the transposed
+ matrix.
+
+ This op is a direct wrapper around the ROCDL `ds_read_tr` family of intrinsics. Please refer
+ to the ROCDL documentation for more details about its exact semantics.
+
+ Format example:
+ ```
+ %0 = amdgpu.transpose_load %src[%srcIndices] : memref<128x256xf16> -> vector<4xf16>
+ ```
  Operands:
  * `$src`: LDS memref to read from.
  * `$srcIndices`: indices into `$src` to read from for this thread.
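
The read-then-transpose behaviour described in the new doc text can be sketched with a small Python model. This is only a conceptual illustration under stated assumptions, not the hardware semantics: the function name `transpose_load_model`, the 4-thread subgroup, and the square 4x4 tile are all hypothetical, and the real `ds_read_tr` instructions use a lane-to-element mapping that is defined in the ROCDL and ISA documentation rather than here.

```python
def transpose_load_model(lds, thread_offsets, vec_len):
    """Toy model of a subgroup transpose load (hypothetical helper).

    Each thread reads `vec_len` contiguous elements starting at its own
    offset; the subgroup then exchanges elements so that thread t ends up
    holding row t of the transposed tile. Assumes a square tile, i.e.
    len(thread_offsets) == vec_len, which is a simplification.
    """
    if len(thread_offsets) != vec_len:
        raise ValueError("toy model only handles a square tile")
    # Step 1: what a plain (non-transposing) per-thread load would return.
    loaded = [lds[off:off + vec_len] for off in thread_offsets]
    # Step 2: transpose across threads; thread t gets element t of every load.
    return [[loaded[i][t] for i in range(vec_len)] for t in range(vec_len)]


# A 4x4 matrix A with A[r][c] = 10 * r + c, stored column-major in a flat
# buffer standing in for LDS: A[r][c] lives at lds[c * 4 + r].
lds = [10 * r + c for c in range(4) for r in range(4)]

# Thread t starts reading at column t of A (offsets are an assumption).
thread_offsets = [0, 4, 8, 12]

result = transpose_load_model(lds, thread_offsets, vec_len=4)
for t, vec in enumerate(result):
    print(f"thread {t}: {vec}")
```

In this model each thread's raw load is one column of `A`, while its returned vector is one row of `A`, which is the sense in which the subgroup receives the transposed matrix.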
