Commit 97bbf18

add two papers (#154)
Added the following two ICLR 2025 papers: https://arxiv.org/pdf/2502.05431 and https://arxiv.org/pdf/2409.15355
1 parent f2278b3 commit 97bbf18

File tree

1 file changed: +2 -2 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -292,8 +292,8 @@ python3 download_pdfs.py # The code is generated by Doubao AI
|2025.02| 🔥🔥🔥[**SeerAttention**] SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs(@microsoft) | [[pdf]](https://arxiv.org/abs/2410.13276) | [[SeerAttention]](https://github.com/microsoft/SeerAttention) ![](https://img.shields.io/github/stars/microsoft/SeerAttention.svg?style=social) | ⭐️⭐️⭐️ |
|2025.03| [**Slim attention**] Slim attention: cut your context memory in half without loss of accuracy, K-cache is all you need for MHA(@OpenMachine.ai) | [[pdf]](https://arxiv.org/pdf/2503.05840) | [[OpenMachine]](https://github.com/OpenMachine-ai/transformer-tricks) ![](https://img.shields.io/github/stars/OpenMachine-ai/transformer-tricks.svg?style=social) | ⭐️⭐️⭐️ |
|2025.05|🔥🔥[**SageAttention-3**] SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2505.11594)|[[SageAttention]](https://github.com/thu-ml/SageAttention) ![](https://img.shields.io/github/stars/thu-ml/SageAttention) | ⭐️⭐️ |
-
-
+|2025.04|🔥🔥[**Parallel Encoding**] APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding(@cmu.edu & NVIDIA)|[[pdf]](https://arxiv.org/pdf/2502.05431)|[[APE]](https://github.com/Infini-AI-Lab/APE) ![](https://img.shields.io/github/stars/Infini-AI-Lab/APE) | ⭐️⭐️ |
+|2025.04|🔥🔥[**Parallel Encoding**] Block-Attention for Efficient Prefilling(@Tencent et al.)|[[pdf]](https://arxiv.org/pdf/2409.15355)|[[Block-attention]](https://github.com/TemporaryLoRA/Block-attention) ![](https://img.shields.io/github/stars/TemporaryLoRA/Block-attention) | ⭐️⭐️ |

### 📖KV Cache Scheduling/Quantize/Dropping ([©️back👆🏻](#paperlist))
<div id="KV-Cache-Scheduling-Quantize-Dropping"></div>
