1 file changed: +14 -1 lines changed
mlir/include/mlir/Dialect/AMDGPU/IR
@@ -921,7 +921,20 @@ def AMDGPU_TransposeLoadOp :
  let summary = "MLIR wrapper for CDNA Transpose Load instructions";
  let description = [{
  The `amdgpu.transpose_load` op is a wrapper around the `ds_read_tr` instructions.
-
+ The transpose load op represents a subgroup load from LDS memory,
+ where the subgroup of threads collectively reads a matrix from the source
+ memref, with each thread reading a vector of the matrix, and gets a transposed matrix
+ in as the result. That is, each thread reads a vector of the col-major matrix at different
+ indices, and the thread's read result is a vector of the corresponding row of the transposed
+ matrix.
+
+ This op is a direct wrapper around the ROCDL `ds_read_tr` family intrinsics. Please refer
+ to the ROCDL documentation for more details about its exact semantics.
+
+ Format example:
+ ```
+ %0 = amdgpu.transpose_load %src[%srcIndices] : memref<128x256xf16> -> vector<4xf16>
+ ```
  Operands:
  * `$src`: LDS memref to read from.
  * `$srcIndices`: indices into `$src` to read from for this thread.
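
The description added in this hunk can be illustrated with a small conceptual model. The Python sketch below is not part of the change and does not reproduce the real `ds_read_tr` lane layout: it assumes a hypothetical square N x N tile read by N threads, ignores the column-major storage detail, and uses an invented helper name, `transpose_load`. It only shows the general idea that each thread reads one vector, and after the collective exchange each thread holds one row of the transposed tile.

```python
# Simplified model of a subgroup transpose load (illustrative only).
# Assumption: an N x N tile read by N threads; the actual ds_read_tr
# instructions use hardware-specific lane layouts that are not modeled here.

def transpose_load(tile):
    """Return what each thread of a hypothetical N-thread subgroup ends up holding.

    tile[t] is the vector that thread t reads from memory.
    Entry t of the result is row t of the transposed tile, that is,
    element t taken from every thread's read.
    """
    n = len(tile)
    assert all(len(vec) == n for vec in tile), "sketch assumes a square tile"
    # Step 1: each thread reads its own vector.
    reads = [list(tile[t]) for t in range(n)]
    # Step 2: collective exchange; thread t gathers element t from each read.
    return [[reads[src][t] for src in range(n)] for t in range(n)]


tile = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
]
result = transpose_load(tile)
for thread_id, vec in enumerate(result):
    print(f"thread {thread_id}: {vec}")
```

In this model, applying the exchange twice returns the original tile, which is a quick sanity check that the per-thread results really form the transpose. For the exact per-lane semantics, the ROCDL documentation referenced in the op description remains the authority.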