1 file changed: +8 -8 lines changed
<img width="1438" alt="image" src="https://github.com/user-attachments/assets/0c5e5125-586f-43fa-8e8b-e2c61c1afbbe">
- ## HGEMM Supported Matrix
+ ## 🎉🎉 HGEMM/SGEMM Supported Matrix
| CUDA Cores| Sliced K(Loop over K)| Tile Block| Tile Thread|
| :---:| :---:| :---:| :---:|
- | ✅ | ✅ | ✅ | ✅ |
+ | ✔️ | ✔️ | ✔️ | ✔️ |
| **WMMA(m16n16k16)** | **MMA(m16n8k16)** | **Pack LDST** | **SMEM Padding** |
- | ✅ | ✅ | ✅ | ✅ |
+ | ✔️ | ✔️ | ✔️ | ✔️ |
| **Copy Async** | **Tile MMA(More Threads)** | **Tile Warp(More Values)** | **Multi Stages** |
- | ✅ | ✅ | ✅ | ✅ |
+ | ✔️ | ✔️ | ✔️ | ✔️ |
| **Reg Double Buffers** | **Block Swizzle** | **Warp Swizzle** | **Collective Store(Shuffle)** |
- | ✅ | ✅ | ✅ | ✅ |
- | **Row Major(NN)** | **Col Major(TN)** | **SMEM Swizzle** | ... |
- | ✅ | ✅ | ❔ | ... |
+ | ✔️ | ✔️ | ✔️ | ✔️ |
+ | **Row Major(NN)** | **Col Major(TN)** | **SGEMM TF32** | **SMEM Swizzle** |
+ | ✔️ | ✔️ | ✔️ | ❔ |
- Welcome to 🌟👆🏻star & submit a PR to this repo to support me!
+ 🎉 Welcome to 🌟👆🏻star & submit a PR to this repo, as it is the simplest way to support me.
## 0x00 📖 CUDA Kernel Index (Common Interview Questions)
- / = not supported now.