v1.2.3 (Latest)

@LoserCheems released this 09 Nov 15:55 · 14 commits to main since this release · b746952

What's Changed

  • Add selectable masking strategies for attention by @LoserCheems in #204
  • Refactor attention block smoothing for consistency by @LoserCheems in #205
  • Optimize the Triton version: GQA, mask/bias broadcasting, skipping of inactive tiles, and stability fixes by @LoserCheems in #200
  • [FEATURE SUPPORT] Triton special compact dynamic-mask attention: 1.6× faster fwd+bwd, numerically equivalent by @LoserCheems in #206
  • Fix documentation and references for Flash Sparse Attention by @LoserCheems in #207

Full Changelog: v1.2.2...v1.2.3
