v2.6.14

Released by @DefTruth on 31 Mar 04:56 · commit ea4aa30

What's Changed

  • [feat] Add DeepSeek FlashMLA by @shaoyuyoung in #120
  • Add our ICLR2025 work Dynamic-LLaVA by @Blank-z0 in #121
  • 🔥[MHA2MLA] Towards Economical Inference: Enabling DeepSeek’s Multi-Head Latent Attention in Any Transformer-based LLMs by @DefTruth in #122
  • Update the title of SageAttention2 and add SpargeAttn by @jt-zhang in #123
  • Add DeepSeek open-source modules by @DefTruth in #124
  • Update DeepSeek/MLA Topics by @DefTruth in #125
  • Add CacheCraft: A Relevant Work on Chunk-Aware KV Cache Reuse by @skejriwal44 in #126
  • 🔥[X-EcoMLA] Upcycling Pre-Trained Attention into MLA for Efficient and Extreme KV Compression by @DefTruth in #127
  • Add download_pdfs.py by @DefTruth in #128 (see the sketch after this list)
  • Update README.md by @DefTruth in #129
  • Update Mooncake-v3 paper link by @DefTruth in #130
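
PR #128 adds a download_pdfs.py helper, but the script itself is not reproduced in these release notes. As a minimal sketch only, assuming the helper collects arXiv links from the repository README and fetches the corresponding PDFs, it might look like the following; every name, pattern, and path below is an assumption for illustration, not the repository's actual code:

```python
# Hypothetical sketch only: the real download_pdfs.py from PR #128 is not
# reproduced in these notes. This shows one plausible approach: scan the
# repository README for arXiv links and download each paper's PDF.
import re
import urllib.request
from pathlib import Path

# Matches arXiv abs/pdf URLs and captures the paper ID, e.g. 2502.14837.
ARXIV_RE = re.compile(r"https?://arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})")

def download_pdfs(readme_path: str = "README.md", out_dir: str = "pdfs") -> None:
    """Find arXiv IDs in the README and save each paper as <id>.pdf."""
    Path(out_dir).mkdir(exist_ok=True)
    text = Path(readme_path).read_text(encoding="utf-8")
    for arxiv_id in sorted(set(ARXIV_RE.findall(text))):
        dest = Path(out_dir) / f"{arxiv_id}.pdf"
        if dest.exists():
            continue  # skip PDFs that were already downloaded
        url = f"https://arxiv.org/pdf/{arxiv_id}"
        print(f"Downloading {url} -> {dest}")
        urllib.request.urlretrieve(url, str(dest))

if __name__ == "__main__":
    download_pdfs()
```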

Full Changelog: v2.6.13...v2.6.14