v2.6.8
What's Changed
- 🔥[ClusterKV] ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/103
- 🔥[BatchLLM] BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/104
Full Changelog: DefTruth/Awesome-LLM-Inference@v2.6.7...v2.6.8