Update 2025-07-17-mtp.md

merrymercy · web-flow · commit 7df82333a0e0 · 2025-07-21T00:27:44.000-07:00
diff --git a/blog/2025-07-17-mtp.md b/blog/2025-07-17-mtp.md
@@ -7,7 +7,7 @@ previewImg: /images/blog/mtp/thumbnail_3.png
 
 ## TL;DR
 
-SGLang is the **first and only** open-source serving framework to support **Multiple Token Prediction (MTP)** in combination with **Large-Scale Expert Parallelism (EP)** and **Prefill-Decode disaggregation**. This integration delivers **up to 60% higher output throughput** through a new decoding paradigm, better parallelism, and more efficient resource utilization without sacrificing generation quality. If you are serving models, e.g., DeepSeek V3, SGLang now supports MTP as a plug-and-play feature, unlocking immediate performance gains. You can find instruction for reproduction [here](https://github.com/sgl-project/sglang/issues/7998).
+SGLang now supports the smooth combination of three advanced features: **Multiple Token Prediction (MTP)**, **Large-Scale Expert Parallelism (EP)**, and **Prefill-Decode disaggregation**. This integration delivers **up to 60% higher output throughput** through a new decoding paradigm, better parallelism, and more efficient resource utilization, without sacrificing generation quality. If you are serving models such as DeepSeek V3, SGLang now supports MTP as a plug-and-play feature, unlocking immediate performance gains. You can find instructions for reproduction [here](https://github.com/sgl-project/sglang/issues/7998).
 
 SGLang’s inference framework running on NVIDIA GPUs enables AI practitioners to easily deliver inference at scale, empowering end users to “think smart” and harness the reasoning capabilities of state-of-the-art language models at the highest performance.