Commit bc09379
fix: use H200 default chunked-prefill-size of 8192
4096 was below SGLang's auto-detected default for H200 GPUs (<160GB),
which unnecessarily limited prefill throughput.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 2be0dac commit bc09379
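
The reasoning in the commit message can be sketched as a small check. This is an illustration only, not SGLang's actual detection code: the helper names are hypothetical, and the 160 GB threshold and 8192 default are taken solely from the message above. The sketch models only that single memory bucket and ignores any lower bound or other buckets SGLang may apply.

```python
from typing import Optional

# Assumption: both constants come from the commit message, not from SGLang's source.
MEMORY_LIMIT_GB = 160             # "H200 GPUs (<160GB)"
DEFAULT_CHUNKED_PREFILL = 8192    # auto-detected default named in the commit title


def auto_default_chunked_prefill(gpu_memory_gb: float) -> Optional[int]:
    """Hypothetical helper: return the default described in the commit message.

    Returns None for GPUs outside the one bucket the message mentions,
    because the message says nothing about them.
    """
    if gpu_memory_gb < MEMORY_LIMIT_GB:
        return DEFAULT_CHUNKED_PREFILL
    return None


def caps_prefill(configured: int, gpu_memory_gb: float) -> bool:
    """Hypothetical helper: True if an explicit setting is below the default."""
    default = auto_default_chunked_prefill(gpu_memory_gb)
    return default is not None and configured < default


if __name__ == "__main__":
    h200_memory_gb = 141  # an H200 has 141 GB of HBM3e
    print(caps_prefill(4096, h200_memory_gb))  # old value
    print(caps_prefill(8192, h200_memory_gb))  # value after this commit
```

Under these assumptions the old value of 4096 is flagged as capping prefill on an H200, while 8192 is not, which matches the motivation given for the change.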
1 file changed, +1 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 54 | 54 | |
| 55 | 55 | |
| 56 | 56 | |
| 57 | | - |
| | 57 | + |
| 58 | 58 | |
| 59 | 59 | |
| 60 | 60 | |