Releases · lucidrains/native-sparse-attention-pytorch
0.0.34
oops
0.0.33
make the fine flex attention block mask also aware of GQA (grouped query attention)
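For context, a minimal sketch (not the repository's actual code) of what a GQA-aware fine block mask can look like with flex attention; the `selected` lookup, all shapes, and the random top-k stand-in are illustrative assumptions:

```python
# a minimal sketch, assuming PyTorch 2.5+ flex attention; `selected`, the
# shapes, and the random top-k stand-in are illustrative, not the repo's API
import torch
from torch.nn.attention.flex_attention import create_block_mask

batch, q_heads, kv_heads = 2, 8, 2
seq_len, block_size, sel_k = 512, 64, 2
num_blocks = seq_len // block_size
group = q_heads // kv_heads  # query heads per kv head
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# stand-in for the top-k fine block selection, per (batch, kv head, query position)
topk = torch.randint(0, num_blocks, (batch, kv_heads, seq_len, sel_k), device = device)
selected = torch.zeros(batch, kv_heads, seq_len, num_blocks, dtype = torch.bool, device = device)
selected.scatter_(-1, topk, True)

def fine_mask_mod(b, h, q_idx, kv_idx):
    kv_head = h // group          # GQA: a contiguous group of query heads shares one kv head
    block = kv_idx // block_size  # which fine block this kv position falls in
    return selected[b, kv_head, q_idx, block] & (kv_idx <= q_idx)  # keep it causal

fine_block_mask = create_block_mask(
    fine_mask_mod, B = batch, H = q_heads, Q_LEN = seq_len, KV_LEN = seq_len, device = device
)
```

The GQA awareness is the `h // group` line: the mask_mod receives the query head index, so it has to be folded down to the kv head that owns the selection state.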
0.0.31
oops, it looks like the query heads are paired with kv heads differently in GQA than assumed
0.0.30
use enable_gqa in flex attention for the sliding windows branch
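A minimal sketch of what the sliding windows branch can look like with `enable_gqa = True`, which lets flex attention handle fewer kv heads than query heads without manually repeating k and v (shapes and window size are illustrative):

```python
# a minimal sketch, assuming PyTorch 2.5+ flex attention; shapes and the
# window size are illustrative
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

batch, q_heads, kv_heads, seq_len, dim_head = 2, 8, 2, 512, 64
window = 128  # illustrative sliding window size
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def sliding_window_mask(b, h, q_idx, kv_idx):
    # causal, and restricted to the trailing window
    return (q_idx >= kv_idx) & ((q_idx - kv_idx) < window)

block_mask = create_block_mask(
    sliding_window_mask, B = None, H = None, Q_LEN = seq_len, KV_LEN = seq_len, device = device
)

q = torch.randn(batch, q_heads, seq_len, dim_head, device = device)
k = torch.randn(batch, kv_heads, seq_len, dim_head, device = device)
v = torch.randn(batch, kv_heads, seq_len, dim_head, device = device)

out = flex_attention(q, k, v, block_mask = block_mask, enable_gqa = True)
```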
0.0.29
wire up flex fine selected attention and make sure it runs
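Putting the two sketches above together, the wiring itself reduces to one call (again illustrative, reusing `q`, `k`, `v` from the sliding window sketch and `fine_block_mask` from the 0.0.33 sketch):

```python
# reuses q / k / v and fine_block_mask from the sketches above
fine_out = flex_attention(q, k, v, block_mask = fine_block_mask, enable_gqa = True)
assert fine_out.shape == q.shape  # "make sure it runs"
```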
0.0.28
Full Changelog: 0.0.27...0.0.28
0.0.27
Full Changelog: 0.0.26...0.0.27
0.0.26
Full Changelog: 0.0.24...0.0.26
0.0.25
last commit for the day, should be ready for experiments tomorrow
0.0.24
complete the fine attention masking with flex attention (not wired up yet)