v2.6.8

DefTruth released this 09 Dec 01:22


32fdb84

What's Changed

🔥[ClusterKV] ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/103
🔥[BatchLLM] BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/104

Full Changelog: DefTruth/Awesome-LLM-Inference@v2.6.7...v2.6.8

Contributors

DefTruth
