@@ -38,7 +38,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with

## 📖Contents
* 📖[Trending LLM/VLM Topics](#Trending-LLM-VLM-Topics) 🔥🔥🔥
- * 📖[Multi-head Latent Attention(MLA)](#mla) 🔥🔥🔥
+ * 📖[DeepSeek/Multi-head Latent Attention(MLA)](#mla) 🔥🔥🔥
* 📖[DP/MP/PP/TP/SP/CP Parallelism](#DP-MP-PP-TP-SP-CP) 🔥🔥🔥
* 📖[Disaggregating Prefill and Decoding](#P-D-Disaggregating) 🔥🔥🔥
* 📖[LLM Algorithmic/Eval Survey](#LLM-Algorithmic-Eval-Survey)
@@ -75,7 +75,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
| 2025.01| 🔥🔥🔥[**MiniMax-Text-01**] MiniMax-01: Scaling Foundation Models with Lightning Attention | [[report]](https://filecdn.minimax.chat/_Arxiv_MiniMax_01_Report.pdf) | [[MiniMax-01]](https://github.com/MiniMax-AI/MiniMax-01) ![](https://img.shields.io/github/stars/MiniMax-AI/MiniMax-01.svg?style=social) | ⭐️⭐️ |
| 2025.01| 🔥🔥🔥[**DeepSeek-R1**] DeepSeek-R1 Technical Report(@deepseek-ai) | [[pdf]](https://arxiv.org/pdf/2501.12948v1) | [[DeepSeek-R1]](https://github.com/deepseek-ai/DeepSeek-R1) ![](https://img.shields.io/github/stars/deepseek-ai/DeepSeek-R1.svg?style=social) | ⭐️⭐️ |

- ### 📖Multi-head Latent Attention(MLA) ([©️back👆🏻](#paperlist))
+ ### 📖DeepSeek/Multi-head Latent Attention(MLA) ([©️back👆🏻](#paperlist))
<div id="mla"></div>

| Date| Title| Paper| Code| Recom|
@@ -84,7 +84,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
| 2024.12| 🔥🔥🔥[**DeepSeek-V3**] DeepSeek-V3 Technical Report(@deepseek-ai) | [[pdf]](https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf) | [[DeepSeek-V3]](https://github.com/deepseek-ai/DeepSeek-V3) ![](https://img.shields.io/github/stars/deepseek-ai/DeepSeek-V3.svg?style=social) | ⭐️⭐️ |
| 2025.01| 🔥🔥🔥[**DeepSeek-R1**] DeepSeek-R1 Technical Report(@deepseek-ai) | [[pdf]](https://arxiv.org/pdf/2501.12948v1) | [[DeepSeek-R1]](https://github.com/deepseek-ai/DeepSeek-R1) ![](https://img.shields.io/github/stars/deepseek-ai/DeepSeek-R1.svg?style=social) | ⭐️⭐️ |
| 2025.02| 🔥🔥🔥[**TransMLA**] TransMLA: Multi-head Latent Attention Is All You Need(@PKU)| [[pdf]](https://arxiv.org/pdf/2502.07864) | [[TransMLA]](https://github.com/fxmeng/TransMLA) ![](https://img.shields.io/github/stars/fxmeng/TransMLA.svg?style=social) | ⭐️⭐️ |
-
+ | 2025.02| 🔥🔥🔥[**DeepSeek-NSA**] Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention(@deepseek-ai) | [[pdf]](https://arxiv.org/pdf/2502.11089) | ⚠️ | ⭐️⭐️ |

### 📖DP/MP/PP/TP/SP/CP Parallelism ([©️back👆🏻](#paperlist))
<div id="DP-MP-PP-TP-SP-CP"></div>