v2.6.7
What's Changed
- 🔥[Star-Attention: ~11x speedup] Star Attention: Efficient LLM Inference over Long Sequences by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/101 (see the first sketch below)
- 🔥[KV Cache Recomputation] Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/102 (see the second sketch below)
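
For context on the first entry: Star Attention (per the paper) shards a long context across hosts, runs blockwise local attention in phase one, and in phase two lets the query attend to every host's cache by merging locally normalized partial results via their log-sum-exp statistics. The NumPy sketch below shows only that merge step, in a simplified single-query, single-head setup of my own; `local_attention` and `merge_hosts` are hypothetical names, not code from the paper or this repo.

```python
import numpy as np

def local_attention(q, K, V):
    # Hypothetical per-host step: locally normalized attention for one
    # query vector, plus the log-sum-exp (lse) of its scores so that
    # partial results can later be merged exactly.
    scores = K @ q / np.sqrt(q.shape[-1])                      # (n,)
    lse = np.log(np.exp(scores - scores.max()).sum()) + scores.max()
    return np.exp(scores - lse) @ V, lse                       # (d,), scalar

def merge_hosts(partials):
    # Distributed-softmax merge: rescale each host's locally normalized
    # output by its share of the global attention mass.
    outs, lses = zip(*partials)
    lses = np.array(lses)
    global_lse = np.log(np.exp(lses - lses.max()).sum()) + lses.max()
    weights = np.exp(lses - global_lse)                        # one per host
    return sum(w * o for w, o in zip(weights, outs))

# Toy check: merging per-block partials matches full-context attention.
rng = np.random.default_rng(0)
d = 64
q = rng.standard_normal(d)
Ks = [rng.standard_normal((128, d)) for _ in range(4)]
Vs = [rng.standard_normal((128, d)) for _ in range(4)]
merged = merge_hosts([local_attention(q, K, V) for K, V in zip(Ks, Vs)])
reference, _ = local_attention(q, np.concatenate(Ks), np.concatenate(Vs))
assert np.allclose(merged, reference)
```

The assert at the end checks the key property: merging per-host partials with their log-sum-exp weights reproduces full-context attention exactly, which is why the distributed phase costs no accuracy.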
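
For the second entry: judging from the title, the idea is to hide CPU-to-GPU I/O by recomputing part of an offloaded KV cache on the GPU while the remainder streams back over PCIe. The PyTorch sketch below illustrates only that generic compute/transfer overlap with CUDA streams, not the paper's actual partitioning or scheduling policy; every name in it (`load_kv_overlapped`, `wk`, `wv`, and so on) is hypothetical.

```python
import torch

def load_kv_overlapped(kv_rest_cpu, x_prefix, wk, wv):
    # Hypothetical overlap pattern: stream the non-recomputed part of the
    # KV cache from CPU memory on a side stream, while the default stream
    # recomputes the prefix's K/V projections on the GPU. The copy is only
    # truly asynchronous if kv_rest_cpu lives in pinned memory.
    copy_stream = torch.cuda.Stream()
    with torch.cuda.stream(copy_stream):
        kv_rest_gpu = kv_rest_cpu.to("cuda", non_blocking=True)
    k_prefix = x_prefix @ wk   # recomputation runs concurrently
    v_prefix = x_prefix @ wv   # with the PCIe transfer above
    torch.cuda.current_stream().wait_stream(copy_stream)
    return k_prefix, v_prefix, kv_rest_gpu
```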
Full Changelog: DefTruth/Awesome-LLM-Inference@v2.6.6...v2.6.7