Simplifies dynamic masking by accepting precomputed attention bias and an optional causal mask, removing dependence on internal ZOH/dt projection parameters and unifying the API across Python, CUDA, Triton, and Flex backends.
Applies masking explicitly via a boolean mask with -inf before softmax and selects a top-k window per query (optionally respecting the causal mask), improving correctness and consistency across implementations.
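The masking path described above can be sketched in plain NumPy. This is an illustrative reference only, not the project's actual API or its CUDA/Triton kernels; the function name, argument names, and shapes are assumptions. It shows the two steps the PR unifies: restrict each query row to its top-`window_size` scores, then apply the boolean mask as `-inf` before a numerically stable softmax (a causal mask, if present, is assumed to be AND-ed into `attn_mask`).

```python
import numpy as np

def masked_topk_softmax(scores, attn_mask, window_size):
    # scores:    (q_len, k_len) logits, precomputed attention bias already added
    # attn_mask: boolean (q_len, k_len); True where attention is allowed
    # window_size: keep at most this many keys per query before softmax
    masked = np.where(attn_mask, scores, -np.inf)
    if window_size < masked.shape[-1]:
        # threshold at the window_size-th largest score in each row
        kth = np.partition(masked, -window_size, axis=-1)[:, -window_size][:, None]
        masked = np.where(masked >= kth, masked, -np.inf)
    # numerically stable softmax; fully masked rows come out as all zeros
    m = np.max(masked, axis=-1, keepdims=True)
    m = np.where(np.isfinite(m), m, 0.0)
    e = np.exp(masked - m)
    denom = np.sum(e, axis=-1, keepdims=True)
    return np.where(denom > 0, e / np.maximum(denom, 1e-12), 0.0)
```

Selecting the window *after* adding the bias (rather than from value-state-derived features) is what lets the Python, CUDA, Triton, and Flex backends share one definition of the mask.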
Aligns function signatures, renames keep_window_size to window_size, removes unused return flags, and fixes tensor layouts/contiguity where needed. Updates tests to generate attention bias and derive causal masks, improving forward-equivalence coverage and determinism while reducing coupling to value-state-derived features.
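For the test-side change above, a common way to derive a boolean causal mask for possibly unequal query/key lengths looks like the following. This is a sketch of the standard construction, not the repository's actual helper; the function name and the right-alignment convention (last query aligned with last key, as in cached decoding) are assumptions.

```python
import numpy as np

def make_causal_mask(q_len, k_len):
    # True where query i may attend key j; the diagonal offset k_len - q_len
    # right-aligns the queries against the keys when k_len > q_len
    return np.tril(np.ones((q_len, k_len), dtype=bool), k=k_len - q_len)
```

Such a mask can be AND-ed with any other boolean mask and passed alongside the precomputed attention bias, matching the unified signature described above.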