Commit 698bc5f
[AMD] Implement Scale Preshuffling and opSel on GFX1250 (#8576)
Following triton-lang/triton#7603, this PR implements scale
preshuffling on gfx1250 for more efficient memory access and uses
`opSel` for better wmma codegen.

As an example, in an mxfp GEMM kernel with a `BLOCK_M x BLOCK_N x BLOCK_K`
tile, scaleA has shape `BLOCK_M x (BLOCK_K // 32)`. We preshuffle it to
`(BLOCK_M // 128) x (BLOCK_K x 4)` outside the kernel for better
vectorization, and 'unshuffle' it inside the kernel to recover the
canonical input to the `wmma_scaled` op. The same applies to scaleB.
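
The shape bookkeeping above can be sketched in plain Python. This is only an illustration: the commit message gives the shapes but not the element order, so the K-major order inside each 128-row chunk is an assumption, and the names `preshuffle`, `unshuffle`, `SCALE_GROUP` and `ROWS_PER_CHUNK` are hypothetical rather than taken from the PR.

```python
SCALE_GROUP = 32       # one scale per 32 elements along K (from the shapes above)
ROWS_PER_CHUNK = 128   # rows folded into one preshuffled row (from the shapes above)


def preshuffle(scale, block_m, block_k):
    """Fold a BLOCK_M x (BLOCK_K // 32) scale into (BLOCK_M // 128) x (BLOCK_K * 4).

    ASSUMPTION: elements inside a chunk are stored K-major. The real
    permutation used on gfx1250 is not described in the commit message.
    """
    k_scales = block_k // SCALE_GROUP
    assert block_m % ROWS_PER_CHUNK == 0
    assert len(scale) == block_m and all(len(r) == k_scales for r in scale)
    shuffled = []
    for chunk in range(block_m // ROWS_PER_CHUNK):
        base = chunk * ROWS_PER_CHUNK
        row = [scale[base + r][k]
               for k in range(k_scales)
               for r in range(ROWS_PER_CHUNK)]
        shuffled.append(row)
    return shuffled


def unshuffle(shuffled, block_m, block_k):
    """Inverse of preshuffle: recover the canonical BLOCK_M x (BLOCK_K // 32) layout."""
    k_scales = block_k // SCALE_GROUP
    scale = [[None] * k_scales for _ in range(block_m)]
    for chunk, row in enumerate(shuffled):
        base = chunk * ROWS_PER_CHUNK
        for idx, value in enumerate(row):
            k, r = divmod(idx, ROWS_PER_CHUNK)
            scale[base + r][k] = value
    return scale


# Example tile: BLOCK_M = 256, BLOCK_K = 128, so scaleA is 256 x 4.
BLOCK_M, BLOCK_K = 256, 128
scale_a = [[m * (BLOCK_K // SCALE_GROUP) + k for k in range(BLOCK_K // SCALE_GROUP)]
           for m in range(BLOCK_M)]
shuffled_a = preshuffle(scale_a, BLOCK_M, BLOCK_K)
restored_a = unshuffle(shuffled_a, BLOCK_M, BLOCK_K)
```

Whatever the real permutation is, the two properties shown here should hold: the element count is unchanged, and unshuffling inside the kernel restores the canonical layout exactly.
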

In addition, the 16x16x128 scaled wmma instruction reads scales only
from the first 16 lanes in a wave, which wastes read capacity. We
therefore use `opSel` to control whether a wmma instruction reads
scales from the first or the last 16 lanes in a wave, so that all
lanes in a wave can be used to read scales.

To issue wmma instructions with `opSel` correctly, we need to group 2
consecutive wmma instruction tiles in a wave. This is done by
introducing `tilesPerWarp` to `AMDWmmaEncodingAttr`, which avoids
having to compose a linear layout in the gluon kernel every time.
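
The lane selection described above can be modelled with a small Python sketch. It assumes a 32-lane wave (implied by the "first or last 16 lanes" wording) and treats `opSel` as a 0/1 selector. The helper names are hypothetical and the model ignores everything else about the instruction, so read it as a picture of the idea rather than the ISA semantics.

```python
WAVE_SIZE = 32        # ASSUMPTION: 32 lanes per wave
LANES_PER_READ = 16   # a scaled wmma reads scales from 16 lanes (per the text above)


def scales_for_tile(lane_values, op_sel):
    """Return the scales one wmma tile would see for a given opSel (0 or 1)."""
    assert len(lane_values) == WAVE_SIZE
    assert op_sel in (0, 1)
    start = op_sel * LANES_PER_READ
    return lane_values[start:start + LANES_PER_READ]


def issue_tile_pair(lane_values):
    """Two consecutive tiles share one wave-wide scale read.

    The first tile uses opSel = 0 (lanes 0..15), the second uses
    opSel = 1 (lanes 16..31), so no lane's value goes unused.
    """
    return [scales_for_tile(lane_values, op_sel) for op_sel in (0, 1)]


# Each lane holds one value; a lane index stands in for its scale here.
lane_values = list(range(WAVE_SIZE))
first_tile, second_tile = issue_tile_pair(lane_values)
```

Pairing tiles this way is why two consecutive wmma tiles have to live in the same wave: both halves of a single wave-wide read must be consumed by instructions issued from that wave.
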

This PR also adds support for inferring the padded shared layout for
MemDescReshapeOp, because with async/tensor loads the 'unshuffling'
has to be done on a memory subview.

1 parent 4e6c423, commit 698bc5f
File tree
12 files changed, +432 -107 lines changed
- include/triton/Dialect/TritonGPU/IR
- lib/Dialect/TritonGPU/IR
- python
  - src
  - triton/experimental/gluon/language/amd
    - gfx1250
- third_party/amd
  - lib
    - TritonAMDGPUToLLVM
      - DotOpToLLVM
    - TritonAMDGPUTransforms
  - python/test