
Commit 7d153bd

🔥[SageAttention-3] Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training (#147)
1 parent c051564 commit 7d153bd

File tree

1 file changed: +2 -0 lines changed


README.md

Lines changed: 2 additions & 0 deletions
@@ -281,6 +281,8 @@ python3 download_pdfs.py # The code is generated by Doubao AI
|2024.12|🔥🔥[**Flex Attention**] FLEX ATTENTION: A PROGRAMMING MODEL FOR GENERATING OPTIMIZED ATTENTION KERNELS(@pytorch) | [[pdf]](https://arxiv.org/pdf/2412.05496)|[[attention-gym]](https://github.com/pytorch-labs/attention-gym) ![](https://img.shields.io/github/stars/pytorch-labs/attention-gym) | ⭐️⭐️ |
|2025.02| 🔥🔥🔥[**SeerAttention**] SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs(@microsoft) | [[pdf]](https://arxiv.org/abs/2410.13276) | [[SeerAttention]](https://github.com/microsoft/SeerAttention) ![](https://img.shields.io/github/stars/microsoft/SeerAttention.svg?style=social) | ⭐️⭐️⭐️ |
|2025.03| [**Slim attention**] Slim attention: cut your context memory in half without loss of accuracy, K-cache is all you need for MHA(@OpenMachine.ai) | [[pdf]](https://arxiv.org/pdf/2503.05840) | [[OpenMachine]](https://github.com/OpenMachine-ai/transformer-tricks) ![](https://img.shields.io/github/stars/OpenMachine-ai/transformer-tricks.svg?style=social) | ⭐️⭐️⭐️ |
+|2025.05|🔥🔥[**SageAttention-3**] SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2505.11594)|[[SageAttention]](https://github.com/thu-ml/SageAttention) ![](https://img.shields.io/github/stars/thu-ml/SageAttention) | ⭐️⭐️ |
+

### 📖KV Cache Scheduling/Quantize/Dropping ([©️back👆🏻](#paperlist))
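
For readers landing on the new SageAttention-3 row above: the linked thu-ml/SageAttention repo provides quantized attention kernels that are typically used as a drop-in replacement for PyTorch's `scaled_dot_product_attention`. The snippet below is a minimal sketch of that usage, not part of this commit; the `sageattn` import, its `is_causal` argument, and the tensor shapes/dtypes are assumptions based on the linked repository.

```python
# Minimal usage sketch (NOT part of this commit): calling a SageAttention kernel
# in place of torch.nn.functional.scaled_dot_product_attention.
# The `sageattn` entry point and its arguments are assumed from the linked
# thu-ml/SageAttention repo; shapes and dtypes below are illustrative only.
import torch
from sageattention import sageattn  # assumed import path from the linked repo

# (batch, heads, seq_len, head_dim) tensors in fp16 on GPU (assumed layout)
q = torch.randn(1, 32, 1024, 128, dtype=torch.float16, device="cuda")
k = torch.randn(1, 32, 1024, 128, dtype=torch.float16, device="cuda")
v = torch.randn(1, 32, 1024, 128, dtype=torch.float16, device="cuda")

# Quantized attention; output has the same shape as q
out = sageattn(q, k, v, is_causal=True)
```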

0 commit comments
