Commit caab92e

[HGEMM] Update HGEMM/SGEMM Supported Matrix (#112)
* Update README.md
* Update README.md
* Update README.md
* Update README.md
1 parent 2acf37e commit caab92e

1 file changed: +8 -8 lines changed


README.md

Lines changed: 8 additions & 8 deletions
@@ -13,21 +13,21 @@
 
 <img width="1438" alt="image" src="https://github.com/user-attachments/assets/0c5e5125-586f-43fa-8e8b-e2c61c1afbbe">
 
-## HGEMM Supported Matrix
+## 🎉🎉 HGEMM/SGEMM Supported Matrix
 
 |CUDA Cores|Sliced K(Loop over K)|Tile Block|Tile Thread|
 |:---:|:---:|:---:|:---:|
-|||||
+|✔️|✔️|✔️|✔️|
 |**WMMA(m16n16k16)**|**MMA(m16n8k16)**|**Pack LDST**|**SMEM Padding**|
-|||||
+|✔️|✔️|✔️|✔️|
 |**Copy Async**|**Tile MMA(More Threads)**|**Tile Warp(More Values)**|**Multi Stages**|
-|||||
+|✔️|✔️|✔️|✔️|
 |**Reg Double Buffers**|**Block Swizzle**|**Warp Swizzle**|**Collective Store(Shuffle)**|
-|||||
-|**Row Major(NN)**|**Col Major(TN)**|**SMEM Swizzle**|...|
-||||...|
+|✔️|✔️|✔️|✔️|
+|**Row Major(NN)**|**Col Major(TN)**|**SGEMM TF32**|**SMEM Swizzle**|
+|✔️|✔️|✔️||
 
-Welcome to 🌟👆🏻star & submit a PR to this repo to support me!
+🎉 Welcome to 🌟👆🏻star & submit a PR to this repo, as it is the simplest way to support me.
 
 ## 0x00 📖 CUDA Kernel目录 (面试常考题目)
 - / = not supported now.
