1 parent 2f33080 commit a65f1f6
README.md
@@ -13,7 +13,6 @@
<img width="1438" alt="image" src="https://github.com/user-attachments/assets/0c5e5125-586f-43fa-8e8b-e2c61c1afbbe">
-----
<h3 align="center">📖 HGEMM/SGEMM Supported Matrix </h3>
|CUDA Cores|Sliced K(Loop over K)|Tile Block|Tile Thread|
@@ -28,11 +27,8 @@
|**Row Major(NN)**|**Col Major(TN)**|**SGEMM TF32**|**SMEM Swizzle**|
|✔️|✔️|✔️|❔|
-
<p align="center">🎉 Welcome to 🌟👆🏻star & submit a PR to this repo, as it is the simplest way to support me. </p>
## 📖 CUDA Kernel Index (Common Interview Questions)
- / = not supported now.
- ✔️ = known to work and already supported.