Commit c4ef389

Add comment explaining max_batch_size assumption in bagel.yaml

Stage-0 max_batch_size=2 assumes single-prompt inference (1 user + 1 CFG companion). For multi-prompt batches it should scale accordingly.

Co-authored-by: Cursor <cursoragent@cursor.com>

1 parent 5ad0b4e commit c4ef389

1 file changed: +2 −0 lines

vllm_omni/model_executor/stage_configs/bagel.yaml

Lines changed: 2 additions & 0 deletions

```diff
@@ -6,6 +6,8 @@ stage_args:
   prompt_expand_func: vllm_omni.model_executor.stage_input_processors.bagel.expand_cfg_prompts
 runtime:
   devices: "0"
+  # 2 = 1 user prompt + 1 CFG companion (text-unconditional).
+  # For multi-prompt batches this should scale as batch_size × 2.
   max_batch_size: 2
 engine_args:
   model_stage: thinker
```
