Commit 406f673

Add varlen MHA fwd/bwd; enable paged KV (no mask)
Re-enables the variable-length (varlen) attention forward and backward passes and registers both with the extension. Simplifies the varlen API by removing the mask and bias arguments: empty placeholders and flags are used in their place, and dbias is dropped from the outputs. Enables the paged KV cache for varlen forward, validates left padding, preserves zero_tensors and deterministic handling, and applies minor formatting cleanups.
1 parent 413622f commit 406f673

1 file changed: +461 -467 lines changed
