v2.6.14
What's Changed
- [feat] Add DeepSeek FlashMLA by @shaoyuyoung in #120
- Add our ICLR2025 work Dynamic-LLaVA by @Blank-z0 in #121
- 🔥[MHA2MLA] Towards Economical Inference: Enabling DeepSeek’s Multi-Head Latent Attention in Any Transformer-based LLMs by @DefTruth in #122
- Update the title of SageAttention2 and add SpargeAttn by @jt-zhang in #123
- Add DeepSeek open-source modules by @DefTruth in #124
- Update DeepSeek/MLA Topics by @DefTruth in #125
- Add CacheCraft: A Relevant Work on Chunk-Aware KV Cache Reuse by @skejriwal44 in #126
- 🔥[X-EcoMLA] Upcycling Pre-Trained Attention into MLA for Efficient and Extreme KV Compression by @DefTruth in #127
- Add download_pdfs.py by @DefTruth in #128
- Update README.md by @DefTruth in #129
- Update Mooncake-v3 paper link by @DefTruth in #130
New Contributors
- @Blank-z0 made their first contribution in #121
- @jt-zhang made their first contribution in #123
- @skejriwal44 made their first contribution in #126
Full Changelog: v2.6.13...v2.6.14