Commit d31692c

Merge pull request #670 from ROCm/jukorhon/atomic-counter
persistent kernel version of the flash attention forward + FLOP calculation fix when seqlen_q != seqlen_k
2 parents e1245da + b0633a4 commit d31692c
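
The FLOP-calculation fix for the seqlen_q != seqlen_k case lives in the second changed file (not shown in the diff below). As a rough sketch of why such a fix is needed, assuming the bottom-right-aligned causal mask used by flash attention: halving the dense FLOP count is only correct for square attention, because when the sequence lengths differ the valid score region is a trapezoid rather than a triangle. The function below is illustrative only, not the benchmark's actual code.

```python
def attn_fwd_flops(batch, nheads, seqlen_q, seqlen_k, head_dim, causal):
    # Two GEMMs per forward pass (Q @ K^T and P @ V), each costing
    # 2 * M * N * K floating-point operations.
    flops = 2 * 2.0 * batch * nheads * seqlen_q * seqlen_k * head_dim
    if causal:
        # Hypothetical accounting, assuming a bottom-right-aligned mask:
        # query i attends to seqlen_k - seqlen_q + i + 1 keys, so the
        # valid region of the score matrix is a trapezoid.
        valid = seqlen_q * seqlen_k - seqlen_q * (seqlen_q - 1) / 2
        flops *= valid / (seqlen_q * seqlen_k)
    return flops
```

For seqlen_q == seqlen_k == n, the causal factor reduces to (n + 1) / (2n), i.e. the familiar ~0.5.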

File tree

2 files changed: +430 -286 lines

python/perf-kernels/README.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ This script contains the Flash Attention kernel with the following support
 - Multi and Grouped Query attention
 - ALiBi bias
 - Matrix bias
+- Persistent kernels. Useful when the sequence lengths are up to a moderate length and especially when doing causal attention.
 - Int8 quantization
 
 These are currently supported for the forward kernel only.
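
The "Persistent kernels" bullet added above, together with the branch name jukorhon/atomic-counter, points at a common scheduling pattern: launch only as many programs as the device has compute units and have each one pull tile indices from a global atomic counter until the work runs out, instead of launching one program per tile. Below is a minimal, runnable Triton sketch of that pattern using a trivial elementwise kernel as a stand-in; every name in it is illustrative, and none of it is the attention kernel's actual code.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def persistent_scale_kernel(x_ptr, out_ptr, n_elements, counter_ptr,
                            num_tiles, BLOCK: tl.constexpr):
    # Claim tiles from a global atomic counter until none remain,
    # rather than mapping exactly one program to one tile.
    tile = tl.atomic_add(counter_ptr, 1)
    while tile < num_tiles:
        offs = tile * BLOCK + tl.arange(0, BLOCK)
        mask = offs < n_elements
        x = tl.load(x_ptr + offs, mask=mask)
        tl.store(out_ptr + offs, x * 2.0, mask=mask)
        tile = tl.atomic_add(counter_ptr, 1)


x = torch.randn(1 << 20, device="cuda")
out = torch.empty_like(x)
BLOCK = 1024
num_tiles = triton.cdiv(x.numel(), BLOCK)
counter = torch.zeros(1, dtype=torch.int32, device="cuda")
# A fixed, hardware-sized grid: one program per compute unit.
grid = (torch.cuda.get_device_properties(0).multi_processor_count,)
persistent_scale_kernel[grid](x, out, x.numel(), counter, num_tiles, BLOCK=BLOCK)
assert torch.allclose(out, 2.0 * x)
```

Because every program keeps grabbing work until the counter is exhausted, cheap tiles (e.g. mostly-masked causal blocks) and expensive ones balance out dynamically across compute units, which is plausibly why the README calls the mode out as especially helpful for causal attention.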
