Commit 96fcae5

add more description about how mma matrix is stored
1 parent b691054 commit 96fcae5

1 file changed: +5 -3 lines changed

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td

Lines changed: 5 additions & 3 deletions
@@ -1933,8 +1933,9 @@ def GPU_SubgroupMmaExtractOp : GPU_Op<"subgroup_mma_extract",
 
 This operation takes `!gpu.mma_matrix` as its first operand. It is the source
 matrix across a subgroup. The op returns a scalar value stored in the invocation
-in the subgroup. If there are multiple values packed in an invocation, use
-`indices` to specify the element to extract.
+in the subgroup. The values of !gpu.mma_matrix are stored across multiple
+threads in the subgroup. If there are multiple values packed in a thread, use
+`indices` to specify the element in the local thread to extract.
 
 Example:
 
@@ -1967,7 +1968,8 @@ def GPU_SubgroupMmaInsertOp : GPU_Op<"subgroup_mma_insert",
 
 This operation takes scalar value as its first operand and `!gpu.mma_matrix`
 as its second operand. It is the matrix across a subgroup. The op inserts the
-scalar value stored in the invocation in the subgroup to the matrix. If there
+scalar value stored in the invocation in the subgroup to the matrix. The values
+of !gpu.mma_matrix are stored across multiple threads in the subgroup. If there
 are multiple values packed in an invocation, use `indices` to specify the
 location to insert in the packing.
 
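
The storage model these descriptions refer to can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how MLIR or any GPU actually lays out `!gpu.mma_matrix`: the subgroup width of 32, the strided element-to-thread mapping, and the helper names `distribute`, `extract`, and `insert` are all hypothetical. It shows a matrix whose values are spread across the threads of a subgroup, with each thread holding several packed values that are addressed by a thread-local index.

```python
SUBGROUP_SIZE = 32  # assumed subgroup width; real hardware may differ


def distribute(matrix, subgroup_size=SUBGROUP_SIZE):
    """Split a 2-D matrix into per-thread fragments.

    Hypothetical layout: element i of the row-major flattening goes to
    thread i % subgroup_size. Real layouts are target-defined.
    """
    flat = [value for row in matrix for value in row]
    if len(flat) % subgroup_size != 0:
        raise ValueError("matrix size must be a multiple of the subgroup size")
    return [flat[t::subgroup_size] for t in range(subgroup_size)]


def extract(fragments, thread_id, index):
    """Return the scalar at the thread-local position `index`."""
    return fragments[thread_id][index]


def insert(fragments, thread_id, index, value):
    """Return new fragments with `value` written at the thread-local `index`."""
    updated = [list(fragment) for fragment in fragments]
    updated[thread_id][index] = value
    return updated


# A 16x16 matrix whose entries equal their row-major position.
matrix = [[r * 16 + c for c in range(16)] for r in range(16)]
fragments = distribute(matrix)

# Thread 3 holds 8 packed values; the local index selects one of them.
picked = extract(fragments, thread_id=3, index=2)
patched = insert(fragments, thread_id=3, index=2, value=-1)
```

Under this assumed layout, each of the 32 threads holds 8 of the 256 values, so an index only makes sense relative to the thread that owns the fragment, which is the point the revised wording makes.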
