blog/2025-07-25-spec-forge.md (3 additions, 3 deletions)
@@ -57,7 +57,7 @@ Using SpecForge, we trained the Llama 4 Scout and Maverick models on a 320K-samp
We evaluated various draft token lengths for Scout and Maverick.
-In all the tests shown in the figure below, the x-axis represents steps, corresponding to `speculative-num-steps` in SGLang. Meanwhile, we fixed SGLang's `speculative-eagle-topk` to 8 and `speculative-num-draft-tokens` to 10 to ensure that `tree attention` can be enabled.
+In all the tests shown in the figure below, the x-axis represents steps, corresponding to `speculative-num-steps` in SGLang. Meanwhile, we fixed SGLang's `speculative-eagle-topk` to 8 and `speculative-num-draft-tokens` to 10 to ensure that `tree attention` can be enabled. To find the optimal speculative decoding parameters, we can use the [`bench_speculative`](https://github.com/sgl-project/sglang/blob/main/scripts/playground/bench_speculative.py) script in the SGLang repository. It runs throughput benchmarks across different configurations and helps us tune for the best performance on the hardware.
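
As a rough illustration of the sweep described in the paragraph above, the sketch below builds one set of flags per `speculative-num-steps` value while holding `speculative-eagle-topk` at 8 and `speculative-num-draft-tokens` at 10. The `--` prefix on the option names, the helper name `speculative_flag_sets`, and the step range of 1 to 5 are assumptions made for illustration. The arguments accepted by `bench_speculative` itself are not shown here.

```python
# Values held fixed in the experiments described above.
FIXED_TOPK = 8
FIXED_NUM_DRAFT_TOKENS = 10


def speculative_flag_sets(step_values):
    """Return one list of CLI-style flags per `speculative-num-steps` value.

    Hypothetical helper: it only assembles strings and does not launch
    SGLang or run any benchmark.
    """
    flag_sets = []
    for steps in step_values:
        flag_sets.append([
            "--speculative-num-steps", str(steps),
            "--speculative-eagle-topk", str(FIXED_TOPK),
            "--speculative-num-draft-tokens", str(FIXED_NUM_DRAFT_TOKENS),
        ])
    return flag_sets


if __name__ == "__main__":
    # The step range is an assumed example, not the range used in the post.
    for flags in speculative_flag_sets(range(1, 6)):
        print(" ".join(flags))
```

Each printed line could be appended to a server launch command for one benchmark run, so only the step count varies between runs.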
@@ -75,15 +75,15 @@ Explore our source code on GitHub and try the pre-trained models on Hugging Face
In the near future, we plan to extend SpecForge with the following support.
-- Support more model architectures, including the Kimi K2 and Qwen-3 MoE.
+- Support more model architectures, including the Kimi K2 and Qwen-3 MoE. We’re actively collaborating with the LinkedIn Infrastructure team, who are training additional Qwen-3 MoE draft models that will be supported by SpecForge.
- Integrate Vision-Language Models (VLM) into SpecForge.
- Support more efficient training with better parallelism strategies and kernel optimization.
## Acknowledgement
We would like to express our heartfelt gratitude to the following teams and collaborators:
-**SGLang Team and Community** — Shenggui Li, Yikai Zhu, Fan Yin, Chao Wang, Shuai Shi, Yi Zhang, Yingyi Huang, Haoshuai Zheng, Yineng Zhang and many others.
+**SGLang Team and Community** — Shenggui Li, Yikai Zhu, Fan Yin, Chao Wang, Shuai Shi, Yi Zhang, Yingyi Huang, Haoshuai Zheng, Yubo Wang, Yineng Zhang and many others.
**SafeAILab Team** — Yuhui Li, Hongyang Zhang and members — for their pioneering work on the Eagle3 algorithm.