https://github.com/epfml/dynamic-sparse-flash-attention/blob/266f36a914216c439f5aa63e12b00fa8489fde95/openwebtext2-experiments/src/models/qksparse_attn.py#L264