Commit 7e22265

[FA2] tiling-qkv + MMA F32F16F16F32✔️ (#210)
* Update flash_attn_mma.py
* Update README.md
1 parent e43172b commit 7e22265

2 files changed (+7, -1 lines changed)

README.md

Lines changed: 6 additions & 0 deletions
@@ -205,6 +205,12 @@ The kernels listed here will guide you through a step-by-step progression, rangi
 | ✔️ [hardswish_f16x2](./kernels/hardswish/hardswish.cu)|f16|/|[link](./kernels/hardswish/)|⭐️|
 | ✔️ [hardswish_f16x8](./kernels/hardswish/hardswish.cu)|f16|/|[link](./kernels/hardswish/)|⭐️|
 | ✔️ [hardswish_f16x8_pack](./kernels/hardswish/hardswish.cu)|f16|/|[link](./kernels/hardswish/)|⭐️⭐️|
+| ✔️ [hardshrink_f32](./kernels/hardshrink/hardshrink.cu)|f32|/|[link](./kernels/hardshrink/)|⭐️|
+| ✔️ [hardshrink_f32x4](./kernels/hardshrink/hardshrink.cu)|f32|/|[link](./kernels/hardshrink/)|⭐️|
+| ✔️ [hardshrink_f16](./kernels/hardshrink/hardshrink.cu)|f16|/|[link](./kernels/hardshrink/)|⭐️|
+| ✔️ [hardshrink_f16x2](./kernels/hardshrink/hardshrink.cu)|f16|/|[link](./kernels/hardshrink/)|⭐️|
+| ✔️ [hardshrink_f16x8](./kernels/hardshrink/hardshrink.cu)|f16|/|[link](./kernels/hardshrink/)|⭐️|
+| ✔️ [hardshrink_f16x8_pack](./kernels/hardshrink/hardshrink.cu)|f16|/|[link](./kernels/hardshrink/)|⭐️⭐️|
 | ✔️ [embedding_f32](./kernels/embedding/embedding.cu)|f32|/|[link](./kernels/embedding/)|⭐️|
 | ✔️ [embedding_f32x4](./kernels/embedding/embedding.cu)|f32|/|[link](./kernels/embedding/)|⭐️|
 | ✔️ [embedding_f32x4_pack](./kernels/embedding/embedding.cu)|f32|/|[link](./kernels/embedding/)|⭐️|

kernels/flash-attn/flash_attn_mma.py

Lines changed: 1 addition & 1 deletion
@@ -250,7 +250,7 @@ def run_benchmark(perf_func: callable,
         return None, None

     if not args.build_others:
-        others_tags = ["s2g-o", "rr"]
+        others_tags = ["s2g", "rr"]
         for o_tag in others_tags:
             if o_tag in tag:
                 return None, None
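
Note on the hunk above: `o_tag in tag` is a substring test, so shortening the filter from `"s2g-o"` to `"s2g"` appears to widen the set of skipped kernels to every tag containing `s2g`, not only those containing `s2g-o`. The following is a minimal sketch of that effect, not code from the repository; `should_skip` and the sample tag strings are hypothetical names chosen for illustration.

```python
from typing import Sequence


def should_skip(tag: str, others_tags: Sequence[str]) -> bool:
    """Return True if any filter string occurs as a substring of `tag`.

    Hypothetical helper mirroring the `if o_tag in tag` loop in the diff.
    """
    return any(o_tag in tag for o_tag in others_tags)


# Hypothetical tags, for illustration only (not taken from the repository).
tags = ["example+s2g-o", "example+s2g-x", "example+rr", "example"]

old_filter = ["s2g-o", "rr"]  # filter before this commit
new_filter = ["s2g", "rr"]    # filter after this commit

skipped_old = [t for t in tags if should_skip(t, old_filter)]
skipped_new = [t for t in tags if should_skip(t, new_filter)]

print(skipped_old)
print(skipped_new)
```

Under these assumed tags, the old filter leaves `example+s2g-x` in the benchmark run, while the new filter skips it along with `example+s2g-o`.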
