
Commit 0dbdd54

add gpu performance result
1 parent add945d commit 0dbdd54

File tree

3 files changed: +4 −4 lines changed


blog/2025-10-29-sglang-jax.md

Lines changed: 4 additions & 4 deletions
@@ -73,11 +73,11 @@ We benchmarked SGLang-Jax against vLLM-TPU. Full instructions are available [her
 We used `Qwen/Qwen3-32B`, TPU v6e-4, SGLang-jax (version: main-af32f095880ff676ed23eec19bc79584b5e20717), and vLLM-tpu (vllm-tpu==0.11.1).

 ### Results
-<img src="/images/blog/sglang_jax/performance_results.png" style="display:block; margin: auto; width: 85%;"></img>
+<img src="/images/blog/sglang_jax/tpu_performance.png" style="display:block; margin: auto; width: 85%;"></img>
 <p style="color:gray; text-align: center;">match vllm-tpu on prefill because of similar kernel optimizations. outperform vllm-tpu on decode thanks to overlap scheduler. </p>

-Todo:
-- (optional) show some TPUs vs. GPUs.
+<img src="/images/blog/sglang_jax/gpu_performance.png" style="display:block; margin: auto; width: 85%;"></img>
+<p style="color:gray; text-align: center;">the TPU setup achieves lower latency (TTFT and ITL) and higher input throughput across various batch sizes</p>

 ## Usage

@@ -156,7 +156,7 @@ The community is working with Google Cloud team and multiple partners on the fol
 - Multi-LoRA batching

 ## Acknowledgments
-**SGLang-jax team**: sii-xinglong, jimoosciuc, Prayer, aolemila, JamesBrianD, zkkython, neo, leos, pathfinder-pf, Ying Sheng, Hongzhen Chen, Jiacheng Yang, Ke Bao
+**SGLang-jax team**: sii-xinglong, jimoosciuc, Prayer, aolemila, JamesBrianD, zkkython, neo, leos, pathfinder-pf, Ying Sheng, Hongzhen Chen, Jiacheng Yang, Ke Bao, Qinghan Chen

 **Google**: Google Cloud Team

460 KB (binary file; no text diff shown)
File renamed without changes.

0 commit comments

Comments
 (0)