Releases: lucidrains/native-sparse-attention-pytorch

0.0.23

20 Feb 16:26

first take care of the block diagonal causal in fine attention
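The note above refers to restricting the fine attention branch to a block-diagonal causal pattern: a query may only attend keys in its own block, and only at or before its own position. A minimal plain-Python sketch of such a mask (the function name and shape are illustrative, not the repo's implementation):

```python
def block_diagonal_causal_mask(seq_len, block_size):
    # mask[i][j] is True when query position i may attend key position j:
    # both positions fall in the same block (block diagonal constraint)
    # and j does not lie in the future relative to i (causal constraint).
    return [
        [(i // block_size == j // block_size) and j <= i
         for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = block_diagonal_causal_mask(6, 3)
```

With `seq_len=6, block_size=3`, position 4 can see positions 3 and 4 (same block, non-future) but not 0-2 (earlier block) or 5 (future).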

0.0.22

20 Feb 16:16

make a decision to deviate from their diagram, where the last token o…

0.0.21

20 Feb 15:55

start accumulating some different compression network ideas

0.0.20

20 Feb 15:31

just improvise a solution for compress and selection block sizes not …

0.0.19

20 Feb 15:00

make sure deepseek proposal can be compared to attention with gqa
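For the comparison against grouped-query attention (GQA) mentioned above: in GQA, each key/value head is shared by a contiguous group of query heads. A minimal sketch of that head mapping (hypothetical helper, not the repo's code):

```python
def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    # In grouped-query attention, the query heads are split into
    # num_kv_heads contiguous groups, and every query head in a
    # group shares the same key/value head.
    assert num_q_heads % num_kv_heads == 0
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size
```

For example, with 8 query heads and 2 KV heads, query heads 0-3 share KV head 0 and query heads 4-7 share KV head 1.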

0.0.18

20 Feb 14:51

Full Changelog: 0.0.17...0.0.18

0.0.17

20 Feb 14:30

Full Changelog: 0.0.16...0.0.17

0.0.16

20 Feb 14:18

Full Changelog: 0.0.15...0.0.16

0.0.15

20 Feb 13:52

handle rotary embeddings for sliding windows explicitly
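As a reminder of what rotary position embeddings do for the sliding-window branch mentioned above: each feature pair is rotated by an angle proportional to its token position, so attention scores depend on relative offsets. A minimal stdlib-only sketch of the per-pair rotation (illustrative only; the repo applies this across full tensors):

```python
import math

def rotate_pair(x0, x1, pos, theta):
    # Rotary embedding rotates each 2D feature pair (x0, x1)
    # by an angle proportional to the token position `pos`.
    # The rotation preserves the vector's norm, and dot products
    # between rotated queries and keys depend only on the
    # relative position difference.
    angle = pos * theta
    c, s = math.cos(angle), math.sin(angle)
    return x0 * c - x1 * s, x0 * s + x1 * c
```

Position 0 leaves the pair unchanged; any other position applies a pure rotation, so vector norms are preserved.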