Commit 8377c78

CStanKonrad and Nvidia authored
Update README.md - add KVTC (#156)
Co-authored-by: Nvidia <[email protected]>
1 parent c0fb761 commit 8377c78

File tree: 1 file changed, +1 −0 lines changed


README.md

Lines changed: 1 addition & 0 deletions
@@ -371,6 +371,7 @@ python3 download_pdfs.py # The code is generated by Doubao AI
 |2024.09|🔥🔥[**CRITIPREFILL**] CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs (@OPPO) | [[pdf]](https://arxiv.org/pdf/2409.12490) | [CritiPrefill](https://github.com/66RING/CritiPrefill) ![](https://img.shields.io/github/stars/66RING/CritiPrefill.svg?style=social)|⭐️ |
 |2024.10|🔥🔥[**KV-COMPRESS**] Paged KV-Cache Compression with Variable Compression Rates per Attention Head (@Cloudflare, Inc.)| [[pdf]](https://arxiv.org/pdf/2410.00161) | [vllm-kvcompress](https://github.com/IsaacRe/vllm-kvcompress) ![](https://img.shields.io/github/stars/IsaacRe/vllm-kvcompress.svg?style=social)|⭐️⭐️ |
 |2024.10|🔥🔥[**LORC**] Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy (@gatech.edu)|[[pdf]](https://arxiv.org/pdf/2410.03111)|⚠️ |⭐️⭐️ |
+|2025.11|🔥🔥[**KVTC**] KV Cache Transform Coding for Compact Storage in LLM Inference (@NVIDIA)|[[pdf]](https://arxiv.org/pdf/2511.01815)|⚠️|⭐️⭐️ |

 ### 📖Long Context Attention/KV Cache Optimization ([©️back👆🏻](#paperlist))
 <div id="Long-Context-Attention-KVCache"></div>
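The added KVTC entry applies transform coding to KV-cache storage. As a generic illustration only, and not the paper's actual algorithm, the classic transform-coding pipeline (a decorrelating orthonormal transform followed by uniform scalar quantization) applied to a toy KV tensor could be sketched as:

```python
# Illustrative sketch of transform coding on a KV-cache block.
# This is NOT KVTC itself: the transform (a DCT-II basis), the 8-bit
# uniform quantizer, and the function names are all assumptions made
# for demonstration.
import numpy as np

def compress_kv_block(kv: np.ndarray, n_bits: int = 8):
    """Transform a (tokens, head_dim) block and quantize the coefficients."""
    d = kv.shape[-1]
    # Orthonormal DCT-II basis, a standard decorrelating transform.
    n = np.arange(d)
    basis = np.sqrt(2.0 / d) * np.cos(np.pi * (n[:, None] + 0.5) * n[None, :] / d)
    basis[:, 0] *= np.sqrt(0.5)          # orthonormalize the DC column
    coeffs = kv @ basis                  # decorrelate along head_dim
    # Uniform scalar quantization to signed n_bits integers.
    scale = float(np.abs(coeffs).max()) / (2 ** (n_bits - 1) - 1)
    if scale == 0.0:
        scale = 1.0
    q = np.round(coeffs / scale).astype(np.int8)
    return q, scale, basis

def decompress_kv_block(q: np.ndarray, scale: float, basis: np.ndarray):
    """Dequantize and invert the orthonormal transform."""
    return (q.astype(np.float32) * scale) @ basis.T
```

Because the transform is orthonormal, reconstruction error is bounded by the quantization step, while the int8 coefficients take 4x less storage than float32 values.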
