
Commit 1606cff

Merge pull request #265 from runpod-workers/fix/zero-max-model-num_batches
fix: check for zero param and set to None
2 parents b749aa5 + e705c94 commit 1606cff

File tree: 1 file changed (+7 −0 lines)

src/engine_args.py

Lines changed: 7 additions, 0 deletions

```diff
@@ -288,6 +288,13 @@ def get_engine_args():
 
     # Set max_num_batched_tokens to max_model_len for unlimited batching.
     # vLLM defaults max_num_batched_tokens to 2048 when None, which is too low.
+
+    if args.get("max_model_len") == 0:
+        args["max_model_len"] = None
+
+    if args.get("max_num_batched_tokens") == 0:
+        args["max_num_batched_tokens"] = None
+
     if args.get("max_num_batched_tokens") is None:
         max_model_len = args.get("max_model_len")
         if max_model_len is None:
```
