Commit bf26a68

Adds GQA forward and boolean mask/bias
Adds forward support for GQA/MQA (different Q vs KV heads) with optional boolean mask and bias, including broadcasting across batch/head/seq dims and per-head routing. Switches to compile-time mask/bias flags, removes the scratchpad workaround, simplifies scaling, and indexes LSE/Out by Q heads. Skips masked-out tiles, tightens mask semantics (True = keep), and fixes backward mask handling. Introduces a contiguity helper, bumps pipeline stages, and errors out on Triton backward for GQA/MQA until implemented.
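The routing and mask semantics described above can be sketched as a slow, pure-Python reference. This is an illustrative assumption, not the kernel from this commit: the function name `gqa_attention_reference`, the nested-list layout, the restriction of mask/bias head broadcasting to size 1 or one entry per Q head, and the convention that fully masked rows yield zeros with an LSE of negative infinity are all hypothetical choices made for the example. It shows each Q head reading from KV head `hq // group_size`, a boolean mask where True means keep, an additive bias, and Out/LSE indexed by Q heads.

```python
import math


def gqa_attention_reference(q, k, v, mask=None, bias=None):
    """Reference GQA/MQA forward pass on nested lists (no batch dimension).

    q:    [num_q_heads][seq_q][head_dim]
    k, v: [num_kv_heads][seq_k][head_dim]
    mask: [1 or num_q_heads][seq_q][seq_k] of bool, True = keep (assumed layout)
    bias: [1 or num_q_heads][seq_q][seq_k] of float, added to the scaled scores
    Returns (out, lse), both indexed by Q head.
    """
    num_q_heads, num_kv_heads = len(q), len(k)
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    group_size = num_q_heads // num_kv_heads
    head_dim = len(q[0][0])
    scale = 1.0 / math.sqrt(head_dim)

    out, lse = [], []
    for hq in range(num_q_heads):
        hkv = hq // group_size  # per-head routing: a group of Q heads shares one KV head
        # Broadcast over the head dimension when mask/bias has a single head entry.
        m = None if mask is None else mask[hq if len(mask) > 1 else 0]
        b = None if bias is None else bias[hq if len(bias) > 1 else 0]

        head_out, head_lse = [], []
        for i, q_row in enumerate(q[hq]):
            scores = []
            for j, k_row in enumerate(k[hkv]):
                if m is not None and not m[i][j]:
                    scores.append(-math.inf)  # False = masked out
                    continue
                s = scale * sum(a * c for a, c in zip(q_row, k_row))
                if b is not None:
                    s += b[i][j]
                scores.append(s)

            row_max = max(scores)
            if row_max == -math.inf:
                # Assumed convention for a fully masked row: zero output, LSE = -inf.
                head_out.append([0.0] * head_dim)
                head_lse.append(-math.inf)
                continue

            # Numerically stable softmax: subtract the row maximum before exponentiating.
            weights = [math.exp(s - row_max) for s in scores]
            denom = sum(weights)
            head_out.append([
                sum(w * v_row[d] for w, v_row in zip(weights, v[hkv])) / denom
                for d in range(head_dim)
            ])
            head_lse.append(row_max + math.log(denom))
        out.append(head_out)
        lse.append(head_lse)
    return out, lse
```

With 4 Q heads and 2 KV heads, Q heads 0 and 1 read KV head 0 and Q heads 2 and 3 read KV head 1; passing a single KV head gives the MQA case. A reference like this is only practical for tiny shapes, but it can serve as a check on a fused kernel's outputs.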
1 parent 424b733 commit bf26a68

1 file changed: +219 -177 lines changed

