
Commit 301cc21

🔥🔥[DistServe] DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving (#114)
1 parent b117b3c commit 301cc21


1 file changed: +8 -0 lines changed

README.md

Lines changed: 8 additions & 0 deletions
@@ -39,6 +39,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 ## 📖Contents
 * 📖[Trending LLM/VLM Topics](#Trending-LLM-VLM-Topics)🔥🔥🔥
 * 📖[DP/MP/PP/TP/SP/CP Parallelism](#DP-MP-PP-TP-SP-CP)🔥🔥🔥
+* 📖[Disaggregating Prefill and Decoding](#P-D-Disaggregating)🔥🔥🔥
 * 📖[LLM Algorithmic/Eval Survey](#LLM-Algorithmic-Eval-Survey)
 * 📖[LLM Train/Inference Framework/Design](#LLM-Train-Inference-Framework)
 * 📖[Weight/Activation Quantize/Compress](#Weight-Activation-Quantize-Compress)🔥
@@ -92,6 +93,13 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 |2024.11|🔥🔥🔥[**SP: Star-Attention, 11x~ speedup**] Star Attention: Efficient LLM Inference over Long Sequences(@NVIDIA)|[[pdf]](https://arxiv.org/pdf/2411.17116)|[[Star-Attention]](https://github.com/NVIDIA/Star-Attention) ![](https://img.shields.io/github/stars/NVIDIA/Star-Attention.svg?style=social)|⭐️⭐️ |
 |2024.12|🔥🔥[**SP: TokenRing**] TokenRing: An Efficient Parallelism Framework for Infinite-Context LLMs via Bidirectional Communication(@SJTU) |[[pdf]](https://arxiv.org/pdf/2412.20501)|[[token-ring]](https://github.com/ACA-Lab-SJTU/token-ring) ![](https://img.shields.io/github/stars/ACA-Lab-SJTU/token-ring.svg?style=social)|⭐️⭐️ |

+### 📖Disaggregating Prefill and Decoding ([©️back👆🏻](#paperlist))
+<div id="P-D-Disaggregating"></div>
+
+|Date|Title|Paper|Code|Recom|
+|:---:|:---:|:---:|:---:|:---:|
+|2024.01|🔥🔥[DistServe] DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving(@PKU)|[[pdf]](https://arxiv.org/pdf/2401.09670)|[[DistServe]](https://github.com/LLMServe/DistServe) ![](https://img.shields.io/github/stars/LLMServe/DistServe.svg?style=social) |⭐️⭐️ |
+
 ### 📖LLM Algorithmic/Eval Survey ([©️back👆🏻](#paperlist))
 <div id="LLM-Algorithmic-Eval-Survey"></div>
