Updates attention mask and bias documentation for MQA/GQA
Clarifies that attention mask and bias parameters support multiple tensor shapes
to accommodate Multi-Query Attention (MQA) and Grouped Query Attention (GQA)
patterns, in addition to the standard multi-head attention format.
Adds explicit documentation for the supported shapes, including broadcast-compatible
dimensions, so flexible attention implementations are covered.
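
A minimal sketch of the idea (hand-rolled attention with assumed shapes, not the library's actual API): a single bias tensor with broadcast-compatible dimensions works unchanged across MHA, MQA, and GQA, since it only needs to broadcast against the attention-score tensor.

```python
import torch

def attention_with_bias(q, k, v, bias=None):
    # Illustrative shapes (assumptions, not the documented ones):
    # q:    (batch, num_q_heads, seq_q, head_dim)
    # k, v: (batch, num_kv_heads, seq_k, head_dim)
    # num_kv_heads may be 1 (MQA) or any divisor of num_q_heads (GQA).
    num_q_heads, num_kv_heads = q.shape[1], k.shape[1]
    if num_kv_heads != num_q_heads:
        # Expand KV heads so each group of query heads shares one KV head.
        k = k.repeat_interleave(num_q_heads // num_kv_heads, dim=1)
        v = v.repeat_interleave(num_q_heads // num_kv_heads, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    if bias is not None:
        # bias may be the full (batch, num_q_heads, seq_q, seq_k) or any
        # broadcast-compatible shape, e.g. (1, 1, seq_q, seq_k).
        scores = scores + bias
    return torch.softmax(scores, dim=-1) @ v

# Usage: one bias tensor broadcasts across both batch and head dimensions.
q = torch.randn(2, 8, 4, 16)      # 8 query heads
k = v = torch.randn(2, 2, 4, 16)  # 2 KV heads -> GQA with groups of 4
bias = torch.randn(1, 1, 4, 4)    # broadcasts over batch and heads
out = attention_with_bias(q, k, v, bias)
print(out.shape)  # torch.Size([2, 8, 4, 16])
```

Because the bias and mask only need to be broadcast-compatible with the score tensor, one shape table in the docs can describe all three attention variants.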