Commit bcdf7ad

Refactors attention to explicit bias and mask
Simplifies dynamic masking by accepting precomputed attention bias and an optional causal mask, removing dependence on internal ZOH/dt projection parameters and unifying the API across Python, CUDA, Triton, and Flex backends. Applies masking explicitly via a boolean mask with -inf before softmax and selects a top-k window per query (optionally respecting the causal mask), improving correctness and consistency across implementations. Aligns function signatures, renames keep_window_size to window_size, removes unused return flags, and fixes tensor layouts/contiguity where needed. Updates tests to generate attention bias and derive causal masks, improving forward-equivalence coverage and determinism while reducing coupling to value-state-derived features.
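
The masking flow described above (add the precomputed bias, pick a top-k window per query, optionally restricted by the causal mask, then fill the rest with -inf before softmax) can be sketched in plain Python for a single head. This is an illustrative sketch, not the repository's implementation: the function names `topk_window_mask`, `masked_softmax`, and `attention` are hypothetical, and it assumes the top-k is taken over the bias values, with ties broken toward the lower key index. The actual backends operate on batched tensors.

```python
import math

def topk_window_mask(bias_row, window_size, causal_row=None):
    """Boolean keep-mask for one query: the window_size largest bias
    entries among the keys the (optional) causal mask allows.
    Assumption: top-k is ranked by bias value; ties go to the lower index."""
    n = len(bias_row)
    allowed = [True] * n if causal_row is None else list(causal_row)
    candidates = [j for j in range(n) if allowed[j]]
    candidates.sort(key=lambda j: (-bias_row[j], j))
    keep = set(candidates[:window_size])
    return [j in keep for j in range(n)]

def masked_softmax(scores_row, bias_row, mask_row):
    """Add the bias, set masked-out positions to -inf, then softmax."""
    logits = [s + b if m else -math.inf
              for s, b, m in zip(scores_row, bias_row, mask_row)]
    peak = max(logits)
    if peak == -math.inf:          # every key masked: return all zeros
        return [0.0] * len(logits)
    exps = [math.exp(x - peak) for x in logits]   # exp(-inf) is 0.0
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v, bias, window_size, causal=True):
    """Single-head attention over nested lists.
    q: [q_len][d], k: [k_len][d], v: [k_len][d_v], bias: [q_len][k_len]."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, q_i in enumerate(q):
        scores = [scale * sum(a * b for a, b in zip(q_i, k_j)) for k_j in k]
        causal_row = [j <= i for j in range(len(k))] if causal else None
        mask = topk_window_mask(bias[i], window_size, causal_row)
        weights = masked_softmax(scores, bias[i], mask)
        out.append([sum(weights[j] * v[j][d] for j in range(len(v)))
                    for d in range(len(v[0]))])
    return out
```

Because the mask is an explicit boolean and the bias is passed in by the caller, the same inputs can be handed to every backend and the outputs compared directly, which is the kind of forward-equivalence check the updated tests aim for.
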
Parent commit: 71f8631

1 file changed: 163 additions, 223 deletions.