Add The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs (#145)

PiotrNawrot · web-flow · commit 6d4ed04ce447 · 2025-05-05T19:06:57.000+08:00
diff --git a/README.md b/README.md
@@ -279,6 +279,7 @@ python3 download_pdfs.py # The code is generated by Doubao AI
 |2025.01|🔥🔥[**FFPA**] FFPA: Yet another Faster Flash Prefill Attention with O(1) SRAM complexity for headdim > 256, ~1.5x faster than SDPA EA(@xlite-dev)|[[docs]](https://github.com/xlite-dev/ffpa-attn-mma)| [[ffpa-attn-mma]](https://github.com/xlite-dev/ffpa-attn-mma) ![](https://img.shields.io/github/stars/xlite-dev/ffpa-attn-mma)|⭐️⭐️ |
 |2025.03|🔥🔥[**SpargeAttention**] SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2502.18137)|[[SpargeAttn]](https://github.com/thu-ml/SpargeAttn) ![](https://img.shields.io/github/stars/thu-ml/SpargeAttn) | ⭐️⭐️ |
 |2025.04|🔥🔥[**MMInference**] MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention(@microsoft) | [[pdf]](https://arxiv.org/pdf/2504.16083)|[[MInference]](https://github.com/microsoft/MInference/) ![](https://img.shields.io/github/stars/microsoft/MInference) | ⭐️⭐️ |
+|2025.04|🔥🔥[**Sparse Frontier**] The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs (@Cohere) | [[pdf]](https://arxiv.org/pdf/2504.17768)|[[SparseFrontier]](https://github.com/PiotrNawrot/sparse-frontier) ![](https://img.shields.io/github/stars/PiotrNawrot/sparse-frontier) | ⭐️⭐️ |