Commit 04a6eb6

fix: add prefill load balance method args for deepseek-r1 (#4051)
Co-authored-by: dagil-nvidia <[email protected]>
1 parent c837b5b commit 04a6eb6

2 files changed: +12 -6 lines changed

recipes/deepseek-r1/sglang/disagg-16gpu/deploy.yaml

Lines changed: 6 additions & 3 deletions
@@ -48,7 +48,7 @@ spec:
    timeoutSeconds: 10
    failureThreshold: 600
    image: my-registry/sglang-wideep-runtime:my-tag
-   workingDir: /workspace/examples/backends/sglang
+   workingDir: /sgl-workspace/dynamo
    command:
    - python3
    - -m
@@ -77,6 +77,7 @@ spec:
    - "0.75"
    - --host
    - 0.0.0.0
+   - --prefill-round-robin-balance
    prefill:
    dynamoNamespace: sgl-dsr1-16gpu
    componentType: worker
@@ -101,7 +102,7 @@ spec:
    timeoutSeconds: 10
    failureThreshold: 600
    image: my-registry/sglang-wideep-runtime:my-tag
-   workingDir: /workspace/examples/backends/sglang
+   workingDir: /sgl-workspace/dynamo
    command:
    - python3
    - -m
@@ -126,4 +127,6 @@ spec:
    - --mem-fraction-static
    - "0.75"
    - --host
-   - 0.0.0.0
+   - 0.0.0.0
+   - --load-balance-method
+   - round_robin
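
A quick way to sanity-check the args added in the hunks above is to look them up in the argv-style list that the `command`/`args` entries form. The sketch below is only an illustration: `flag_value` is a hypothetical helper, not part of Dynamo or SGLang, and the two lists copy just the tail of each worker's args as shown in this diff.

```python
def flag_value(args, flag):
    """Return the token that follows `flag` in an argv-style list, or None."""
    try:
        idx = args.index(flag)
    except ValueError:
        return None  # flag not present
    return args[idx + 1] if idx + 1 < len(args) else None


# Tail of the args for the worker defined before the `prefill:` block (from the diff).
args_before_prefill = ["--host", "0.0.0.0", "--prefill-round-robin-balance"]

# Tail of the prefill worker args (from the diff).
prefill_args = [
    "--mem-fraction-static", "0.75",
    "--host", "0.0.0.0",
    "--load-balance-method", "round_robin",
]

has_prefill_rr = "--prefill-round-robin-balance" in args_before_prefill
lb_method = flag_value(prefill_args, "--load-balance-method")
```

Here `--prefill-round-robin-balance` is a bare switch, so it is checked by membership, while `--load-balance-method` takes a value and is read with the helper.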

recipes/deepseek-r1/sglang/disagg-8gpu/deploy.yaml

Lines changed: 6 additions & 3 deletions
@@ -46,7 +46,7 @@ spec:
    timeoutSeconds: 10
    failureThreshold: 600
    image: my-registry/sglang-wideep-runtime:my-tag
-   workingDir: /workspace/examples/backends/sglang
+   workingDir: /sgl-workspace/dynamo
    command:
    - python3
    - -m
@@ -73,6 +73,7 @@ spec:
    - "30001"
    - --host
    - 0.0.0.0
+   - --prefill-round-robin-balance
    prefill:
    dynamoNamespace: sgl-dsr1-8gpu
    componentType: worker
@@ -95,7 +96,7 @@ spec:
    timeoutSeconds: 10
    failureThreshold: 600
    image: my-registry/sglang-wideep-runtime:my-tag
-   workingDir: /workspace/examples/backends/sglang
+   workingDir: /sgl-workspace/dynamo
    command:
    - python3
    - -m
@@ -118,4 +119,6 @@ spec:
    - --disaggregation-bootstrap-port
    - "30001"
    - --host
-   - 0.0.0.0
+   - 0.0.0.0
+   - --load-balance-method
+   - round_robin
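
For readers unfamiliar with the `round_robin` value set above, the general idea of a round-robin policy is to hand requests to workers in a fixed rotating order. The sketch below shows that idea only; it is not SGLang's or Dynamo's implementation, and the worker names are made up for illustration.

```python
from itertools import cycle


def round_robin(workers):
    """Return an iterator that yields workers in a fixed rotating order."""
    return cycle(workers)


# Hypothetical worker names, used only for this illustration.
picker = round_robin(["rank-0", "rank-1", "rank-2"])

# Assign five incoming requests; the order wraps around after the last worker.
assignments = [next(picker) for _ in range(5)]
```

Each worker receives every third request regardless of request size, which keeps the distribution even by count but does not account for per-request cost.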
