Releases · lucidrains/native-sparse-attention-pytorch
0.0.23
first take care of the block diagonal causal mask in fine attention
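For context, a block diagonal causal mask restricts each query to attend only to keys within its own fine block, and only at positions at or before its own. A minimal sketch of such a mask (the function name and parameters are illustrative, not this repository's API):

```python
import torch

def block_diagonal_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
    # position index for every token in the sequence
    idx = torch.arange(seq_len)
    # block-diagonal structure: query and key must fall in the same block
    same_block = (idx[:, None] // block_size) == (idx[None, :] // block_size)
    # causal structure: key position must not exceed query position
    causal = idx[None, :] <= idx[:, None]
    return same_block & causal

mask = block_diagonal_causal_mask(seq_len = 8, block_size = 4)
print(mask.int())  # two 4x4 lower-triangular blocks along the diagonal
```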
0.0.22
make a decision to deviate from their diagram, where the last token o…
0.0.21
start accumulating some different compression network ideas
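In the NSA design, the compression branch reduces each block of tokens to a single compressed token. One simple illustrative sketch of what such a compression network could look like (a learned projection over a flattened block; the class name and the projection choice are assumptions, not necessarily one of the ideas collected in this release):

```python
import torch
from torch import nn
from einops import rearrange

class BlockCompress(nn.Module):
    def __init__(self, dim: int, block_size: int):
        super().__init__()
        self.block_size = block_size
        # project a flattened block of tokens down to one compressed token
        self.to_compressed = nn.Linear(dim * block_size, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq, dim), with seq divisible by block_size
        blocks = rearrange(tokens, 'b (n bs) d -> b n (bs d)', bs = self.block_size)
        return self.to_compressed(blocks)

compress = BlockCompress(dim = 64, block_size = 4)
out = compress(torch.randn(2, 16, 64))  # -> (2, 4, 64)
```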
0.0.20
just improvise a solution for compress and selection block sizes not …
0.0.19
make sure the DeepSeek proposal can be compared to attention with GQA (grouped query attention)
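GQA shares each key/value head across a group of query heads, shrinking the KV cache while keeping full query-head capacity. A minimal sketch of a GQA attention step for comparison purposes (the function name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def gqa_attention(q, k, v):
    # q: (batch, q_heads, seq, dim_head); k, v: (batch, kv_heads, seq, dim_head)
    groups = q.shape[1] // k.shape[1]
    # each kv head is shared across its group of query heads
    k = k.repeat_interleave(groups, dim = 1)
    v = v.repeat_interleave(groups, dim = 1)
    return F.scaled_dot_product_attention(q, k, v, is_causal = True)

q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)  # 2 kv heads shared across 8 query heads
v = torch.randn(1, 2, 16, 64)
out = gqa_attention(q, k, v)   # -> (1, 8, 16, 64)
```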
0.0.18
Full Changelog: 0.0.17...0.0.18
0.0.17
Full Changelog: 0.0.16...0.0.17
0.0.16
Full Changelog: 0.0.15...0.0.16
0.0.15
handle rotary embeddings for sliding windows explicitly
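Rotating queries and keys by their absolute positions makes attention scores depend only on relative position, which sliding window attention preserves. A sketch of applying rotary embeddings explicitly before windowed attention, using lucidrains' rotary-embedding-torch package (the exact integration in this release may differ):

```python
import torch
from rotary_embedding_torch import RotaryEmbedding

rotary = RotaryEmbedding(dim = 32)  # rotate half of a 64-dim head

q = torch.randn(1, 8, 16, 64)  # (batch, heads, seq, dim_head)
k = torch.randn(1, 8, 16, 64)

# apply the rotations explicitly to both queries and keys, so the
# sliding window branch sees consistent relative positions
q = rotary.rotate_queries_or_keys(q)
k = rotary.rotate_queries_or_keys(k)
```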