Commit 33bebb2: update (#173)
1 parent 4148092

File tree

1 file changed: +3 −3 lines

blog/2025-07-25-spec-forge.md

Lines changed: 3 additions & 3 deletions
@@ -57,7 +57,7 @@ Using SpecForge, we trained the Llama 4 Scout and Maverick models on a 320K-samp
 We evaluated various draft token lengths for Scout and Maverick.

-In all the tests shown in the figure below, the x-axis represents steps, corresponding to `speculative-num-steps` in SGLang. Meanwhile, we fixed SGLang's `speculative-eagle-topk` to 8 and `speculative-num-draft-tokens` to 10 to ensure that `tree attention` can be enabled.
+In all the tests shown in the figure below, the x-axis represents steps, corresponding to `speculative-num-steps` in SGLang. Meanwhile, we fixed SGLang's `speculative-eagle-topk` to 8 and `speculative-num-draft-tokens` to 10 to ensure that `tree attention` can be enabled. To find the optimal speculative decoding parameters, we can use the [`bench_speculative`](https://github.com/sgl-project/sglang/blob/main/scripts/playground/bench_speculative.py) script in the SGLang repository. It runs throughput benchmarks across different configurations and helps us tune for the best performance on our hardware.
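The fixed configuration described above can be sketched as an SGLang server launch. This is a minimal illustration, not a command from the post: the model path, draft-model path, and tensor-parallel size are placeholders, and flag values follow the settings named in the paragraph (`speculative-eagle-topk` 8, `speculative-num-draft-tokens` 10, with `speculative-num-steps` as the swept variable).

```shell
# Sketch: serving a target model with an Eagle3 draft model in SGLang.
# Paths and --tp are placeholders; adjust for your deployment.
python -m sglang.launch_server \
  --model-path <path-to-target-model> \
  --speculative-algorithm EAGLE3 \
  --speculative-draft-model-path <path-to-trained-draft-model> \
  --speculative-num-steps 3 \
  --speculative-eagle-topk 8 \
  --speculative-num-draft-tokens 10 \
  --tp 8
```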
 ![scout.svg](/images/blog/spec_forge/Llama4_Scout_performance_final.svg)

@@ -75,15 +75,15 @@ Explore our source code on GitHub and try the pre-trained models on Hugging Face
 In the near future, we plan to extend SpecForge with the following support.

-- Support more model architectures, including the Kimi K2 and Qwen-3 MoE.
+- Support more model architectures, including the Kimi K2 and Qwen-3 MoE. We’re actively collaborating with the LinkedIn Infrastructure team, who are training additional Qwen-3 MoE draft models that will be supported by SpecForge.
 - Integrate Vision-Language Models (VLM) into SpecForge.
 - Support more efficient training with better parallelism strategies and kernel optimization.
 ## Acknowledgement

 We would like to express our heartfelt gratitude to the following teams and collaborators:

-**SGLang Team and Community** — Shenggui Li, Yikai Zhu, Fan Yin, Chao Wang, Shuai Shi, Yi Zhang, Yingyi Huang, Haoshuai Zheng, Yineng Zhang and many others.
+**SGLang Team and Community** — Shenggui Li, Yikai Zhu, Fan Yin, Chao Wang, Shuai Shi, Yi Zhang, Yingyi Huang, Haoshuai Zheng, Yubo Wang, Yineng Zhang and many others.

 **SafeAILab Team** — Yuhui Li, Hongyang Zhang and members — for their pioneering work on the Eagle3 algorithm.