Commit 7866762

Flex Attention: a Programming Model for Generating Optimized Attention Kernels (#146)
1 parent: 6d4ed04

File tree: 1 file changed, +1 −1 lines changed


README.md

Lines changed: 1 addition & 1 deletion
@@ -280,7 +280,7 @@ python3 download_pdfs.py # The code is generated by Doubao AI
 |2025.03|🔥🔥[**SpargeAttention**] SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2502.18137)|[[SpargeAttn]](https://github.com/thu-ml/SpargeAttn) ![](https://img.shields.io/github/stars/thu-ml/SpargeAttn) | ⭐️⭐️ |
 |2025.04|🔥🔥[**MMInference**] MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention(@microsoft) | [[pdf]](https://arxiv.org/pdf/2504.16083)|[[MInference]](https://github.com/microsoft/MInference/) ![](https://img.shields.io/github/stars/microsoft/MInference) | ⭐️⭐️ |
 |2025.04|🔥🔥[**Sparse Frontier**] The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs (@Cohere) | [[pdf]](https://arxiv.org/pdf/2504.17768)|[[SparseFrontier]](https://github.com/PiotrNawrot/sparse-frontier) ![](https://img.shields.io/github/stars/PiotrNawrot/sparse-frontier) | ⭐️⭐️ |
-
+|2024.12|🔥🔥[**Flex Attention**] FLEX ATTENTION: A PROGRAMMING MODEL FOR GENERATING OPTIMIZED ATTENTION KERNELS(@pytorch) | [[pdf]](https://arxiv.org/pdf/2412.05496)|[[attention-gym]](https://github.com/pytorch-labs/attention-gym) ![](https://img.shields.io/github/stars/pytorch-labs/attention-gym) | ⭐️⭐️ |
 
 
 ### 📖KV Cache Scheduling/Quantize/Dropping ([©️back👆🏻](#paperlist))
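For context on the entry this commit adds: FlexAttention lets users express attention variants as a small `score_mod` function over score indices, which `torch.compile` then lowers into a fused kernel instead of requiring a hand-written one. A minimal sketch of that programming model, assuming PyTorch >= 2.5 and illustrative tensor shapes:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def causal(score, b, h, q_idx, kv_idx):
    # A score_mod: given one raw attention score and its (batch, head,
    # query index, key index), return the modified score. Here we mask
    # future positions to get causal attention.
    return torch.where(q_idx >= kv_idx, score, -float("inf"))

# Illustrative shapes: (batch, num_heads, seq_len, head_dim).
q, k, v = (torch.randn(2, 8, 128, 64) for _ in range(3))

out = flex_attention(q, k, v, score_mod=causal)  # eager reference run
# In practice you would wrap it with torch.compile, e.g.
#   flex_attention = torch.compile(flex_attention)
# which is the step that generates the optimized fused kernel.
```

Swapping in a different `score_mod` covers variants like ALiBi, sliding-window, or document masking without writing a new kernel, which is the flexibility the paper's title refers to.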
