Release v2.0 · xlite-dev/Awesome-LLM-Inference

What's Changed

🔥🔥[LUT TENSOR CORE] Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/33
🔥🔥[Eigen Attention] Attention in Low-Rank Space for KV Cache Compression by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/34
KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/35
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/36
🔥[ABQ-LLM] Arbitrary-Bit Quantized Inference Acceleration for Large Language Models by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/37
[Token Recycling] Turning Trash into Treasure: Accelerating Inference… by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/38
Bump up to v2.0 by @DefTruth in https://github.com/DefTruth/Awesome-LLM-Inference/pull/39

Full Changelog: DefTruth/Awesome-LLM-Inference@v1.9...v2.0