What's Changed
- Add selectable masking strategies for attention by @LoserCheems in #204
- Refactor attention block smoothing for consistency by @LoserCheems in #205
- Optimize triton version: GQA, mask/bias broadcasting, skip inactive tiles, and stability fixes by @LoserCheems in #200
- [FEATURE SUPPORT] Triton special compact dynamic-mask attention: 1.6× faster fwd+bwd, numerically equivalent by @LoserCheems in #206
- Fix documentation and references for Flash Sparse Attention by @LoserCheems in #207
Full Changelog: v1.2.2...v1.2.3