Commit fcfa1a5
[FA] Add tl.assume to flash_attention.py
With these assumptions, the attn_fwd kernel can use buffer ops.
This doesn't give any noticeable boost, but it may be helpful on newer architectures.

1 parent eb7e015 commit fcfa1a5
1 file changed: +17 -0 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 443-445 | 443-445 | unchanged context |
| | 446-462 | + (17 added lines) |
| 446-448 | 463-465 | unchanged context |