mlir/include/mlir/Dialect/AMDGPU/IR
1 file changed, +0 -11 lines changed
@@ -1285,17 +1285,6 @@ def AMDGPU_ScaledWMMAOp
  first_scale_lane of 0 or 16 will decide which lanes are used for this. When
  num_scales / scales_per_lane == 64 (num_lanes), then first_scale_lane must
  be set to 0.
-
- For tile size 16x16x128, each matrix gets 64 scales stored
- 16 lanes, with `a_first_scale_lane`/`b_first_scale_lane` selecting lanes
- 0-15 (index=0) or lanes 16-31 (index=16). For a tile size of 32x16x128,
- matrix A gets 128 scales in a full VGPR (`a_first_scale_lane` is unused),
- while matrix B gets 64 scales in half a VGPR.
- - Block size 16: For a tile size of 16x16x128, each matrix gets
- 128 scales stored in half of two VGPRs, with `a_first_scale_lane`/`b_first_scale_lane`
- selecting lanes 0-15 (index=0) or 16-31 (index=1) for each of the VGPRs.
- For 32x16x128, matrix A gets 256 scales in two VGPRs (`a_first_scale_lane` is unused),
- while matrix B gets 128 scales stored in half of two VGPRs.

  Example:
  ```mlir