Commit 4fa6d91

Add paper AdaKV (#86)
1 parent 53ca177 commit 4fa6d91

1 file changed: +1 -0 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -253,6 +253,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 |2024.08|🔥[Zero-Delay QKV Compression] Zero-Delay QKV Compression for Mitigating KV Cache and Network Bottlenecks in LLM Inference(@University of Virginia)|[[pdf]](https://arxiv.org/pdf/2408.04107)|⚠️|⭐️⭐️ |
 |2024.09|🔥[**AlignedKV**] AlignedKV: Reducing Memory Access of KV-Cache with Precision-Aligned Quantization(@Tsinghua University)|[[pdf]](https://arxiv.org/pdf/2409.16546)|[[AlignedKV]](https://github.com/AlignedQuant/AlignedKV) ![](https://img.shields.io/github/stars/AlignedQuant/AlignedKV.svg?style=social)|⭐️ |
 |2024.10|🔥[**LayerKV**] Optimizing Large Language Model Serving with Layer-wise KV Cache Management(@Ant Group)|[[pdf]](https://arxiv.org/pdf/2410.00428)|⚠️|⭐️⭐️ |
+|2024.10|🔥[**AdaKV**] Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference (@USTC)|[[pdf]](https://arxiv.org/abs/2407.11550)|[[AdaKV]](https://github.com/FFY0/AdaKV) ![](https://img.shields.io/github/stars/FFY0/AdaKV.svg?style=social&label=Star)|⭐️⭐️|


 ### 📖Prompt/Context/KV Compression ([©️back👆🏻](#paperlist))
