Hi,
Is head dim 256 supported in flexible_flash_attention/magi_attention?
I noticed that get_max_headdim() can return 256 if FLASHATTENTION_DISABLE_HDIM256 is not defined, as shown here.
However, when I actually run the attention with head_dim = 256 inputs, I hit this sanity_check failure (code here):
assert head_dim <= 128, "head_dim must be <= 128 for now"
# Sample code:
out_, lse_ = flex_flash_attn_func(
    q_[i],
    k_[i],
    v_[i],
    q_ranges,
    k_ranges,
    attn_type_map=attn_type_map,
    softmax_scale=softmax_scale,
    disable_fwd_atomic_reduction=False,
)
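For context, here is a minimal standalone sketch of what I believe is happening (the tensor shape and dtype below are assumptions for illustration, not my actual setup): even if the FlashAttention build's get_max_headdim() reports 256, the Python-side sanity_check caps head_dim at 128, so the call fails before reaching the kernel.

```python
import torch

# Hypothetical query tensor; the shape and dtype are assumptions for illustration only.
q = torch.randn(1024, 8, 256, dtype=torch.bfloat16, device="cuda")

head_dim = q.shape[-1]
if head_dim > 128:
    # Mirrors the sanity_check assertion quoted above; flex_flash_attn_func
    # fails the same way for head_dim = 256.
    raise ValueError(f"head_dim={head_dim} exceeds the current 128 limit in sanity_check")
```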