Adds support for mask and bias tensors whose head dimension is 1, num_heads_k, or num_heads, instead of only num_heads_k.
Enables more flexible attention patterns by allowing masks and biases to be broadcast across different head configurations. Updates parameter passing to track separate head counts for masks and biases, and adds appropriate validation checks.
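The broadcasting rule described above can be sketched as follows. This is a hypothetical NumPy illustration, not the PR's actual code: the function names (`check_bias_heads`, `expand_bias`) and the assumed `(batch, heads, seqlen_q, seqlen_k)` layout are illustrative, and the validation mirrors the check that a mask/bias head dimension must be 1 (shared by all heads), num_heads_k (one per KV head, as in grouped-query attention), or num_heads (one per query head).

```python
import numpy as np

def check_bias_heads(bias_heads: int, num_heads: int, num_heads_k: int) -> int:
    # Hypothetical validation: the head dim of a mask/bias must be
    # 1, num_heads_k, or num_heads to be broadcastable.
    if bias_heads not in (1, num_heads_k, num_heads):
        raise ValueError(
            f"mask/bias head dim {bias_heads} must be 1, "
            f"{num_heads_k} (num_heads_k), or {num_heads} (num_heads)"
        )
    return bias_heads

def expand_bias(bias: np.ndarray, num_heads: int, num_heads_k: int) -> np.ndarray:
    # Broadcast a (batch, h_bias, seqlen_q, seqlen_k) mask/bias up to
    # (batch, num_heads, seqlen_q, seqlen_k).
    h = check_bias_heads(bias.shape[1], num_heads, num_heads_k)
    if h == num_heads:
        return bias                      # already one slice per query head
    if h == 1:
        # Single slice shared by every head.
        return np.broadcast_to(bias, (bias.shape[0], num_heads) + bias.shape[2:])
    # h == num_heads_k: repeat each KV-head slice across its query-head group.
    group_size = num_heads // num_heads_k
    return np.repeat(bias, group_size, axis=1)
```

In practice a fused kernel would index into the smaller tensor directly rather than materializing the expanded copy; the explicit `broadcast_to`/`repeat` here just makes the three accepted head configurations concrete.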
Temporarily disables variable-length attention variants to focus on core functionality improvements.