Commit 6f0b7c1

Standardizes mask API; removes ZOH-based path
Reworks dynamic masking to consume a precomputed attention bias plus an optional boolean causal mask and a window size, using top-k selection within the window and honoring causality. Removes the ZOH/dt_proj/A dependency to simplify masking and reduce coupling.

Aligns the CUDA, Triton, Flex, and SDPA wrappers to a unified interface, adds GQA support via KV repetition, and ensures a consistent tensor layout. Detaches top-k selection to avoid unintended gradients.

Updates benchmarks to generate attention bias and boolean causal masks, renames keep_window_size to window_size, and adjusts configs/loops accordingly for consistent evaluation across backends. Improves clarity, consistency, and extensibility of the attention backward benchmarks.
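The masking flow described above can be sketched in PyTorch as follows. This is a minimal illustration, not the code in this commit: the function names `dynamic_mask_sketch` and `repeat_kv_sketch`, the `(batch, heads, q_len, k_len)` bias layout, the `(batch, kv_heads, seq, dim)` KV layout, and the convention that `True` in the causal mask means "may attend" are all assumptions made for the example.

```python
import torch


def dynamic_mask_sketch(attn_bias, causal_mask=None, window_size=None):
    """Hypothetical sketch: keep the top-`window_size` keys per query row.

    attn_bias:   float tensor, assumed shape (batch, heads, q_len, k_len)
    causal_mask: optional bool tensor broadcastable to attn_bias,
                 assumed convention: True = position may be attended
    """
    min_value = torch.finfo(attn_bias.dtype).min

    # Give causally forbidden positions the lowest score so top-k avoids them.
    scores = attn_bias
    if causal_mask is not None:
        scores = scores.masked_fill(~causal_mask, min_value)

    k_len = scores.shape[-1]
    if window_size is not None and window_size < k_len:
        # Detach before selecting so index selection carries no gradient.
        topk_idx = torch.topk(scores.detach(), window_size, dim=-1).indices
        keep = torch.zeros_like(scores).scatter_(-1, topk_idx, 1.0).bool()
    else:
        keep = torch.ones_like(scores, dtype=torch.bool)

    # Rows with fewer allowed keys than window_size may have selected
    # forbidden positions; intersect with the causal mask to drop them.
    if causal_mask is not None:
        keep = keep & causal_mask

    masked_bias = attn_bias.masked_fill(~keep, min_value)
    return masked_bias, keep


def repeat_kv_sketch(x, n_rep):
    """Hypothetical GQA helper: repeat each KV head n_rep times.

    x: assumed shape (batch, kv_heads, seq, dim)
    """
    return x.repeat_interleave(n_rep, dim=1)


if __name__ == "__main__":
    bias = torch.randn(1, 2, 8, 8, requires_grad=True)
    causal = torch.tril(torch.ones(8, 8, dtype=torch.bool))
    masked, keep = dynamic_mask_sketch(bias, causal, window_size=4)
    print(keep.sum(-1))  # kept keys per query row, never more than window_size
```

Because the selection runs on a detached copy, gradients reach `attn_bias` only through the kept positions of `masked_bias`, which is one plausible reading of "detaches top-k selection to avoid unintended gradients".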
1 parent 53e1aa4 commit 6f0b7c1

1 file changed

+217
-267
lines changed

0 commit comments