
Commit 81eea3d

Authored by xw285cornell and ywang96 (Roger Wang)

vllm fix check on max vocab size (#22471)

Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>

1 parent 9701352 commit 81eea3d

File tree

1 file changed: +13 −1 lines changed


vllm/v1/engine/processor.py

Lines changed: 13 additions & 1 deletion
@@ -470,7 +470,19 @@ def _validate_model_input(
         else:
             tokenizer = self.tokenizer.get_lora_tokenizer(lora_request)
         max_input_id = max(prompt_ids, default=0)
-        if max_input_id > tokenizer.max_token_id:
+
+        # NOTE: tokenizer.max_token_id is the tokenizer's vocab size while
+        # self.model_config.get_vocab_size() is the model's vocab size.
+        # For Qwen3 models, the language model has extra tokens that do
+        # not exist in the tokenizer, and vice versa for multimodal
+        # placeholder tokens in some multimodal models.
+        # See https://github.com/QwenLM/Qwen3/issues/29#issuecomment-1933720399 # noqa: E501
+        # and https://github.com/vllm-project/vllm/pull/22471#discussion_r2312251421 # noqa: E501
+
+        # Here we take the max of the two to determine if a token id is
+        # truly out-of-vocabulary.
+        if max_input_id > max(tokenizer.max_token_id,
+                              self.model_config.get_vocab_size() - 1):
             raise ValueError(
                 f"Token id {max_input_id} is out of vocabulary")
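The effect of the change can be sketched standalone. This is a minimal sketch, not vLLM's actual code: the function name `validate_prompt_ids` and the numeric vocab sizes below are illustrative stand-ins for `tokenizer.max_token_id` and `self.model_config.get_vocab_size()`, which the real `Processor._validate_model_input` reads from the tokenizer and model config.

```python
def validate_prompt_ids(prompt_ids, tokenizer_max_token_id, model_vocab_size):
    """Reject prompts containing token ids outside both vocabularies.

    A token id is only truly out-of-vocabulary if it exceeds BOTH the
    tokenizer's max id and the model's last embedding row; the two can
    differ (e.g. padded model embeddings, or multimodal placeholder ids).
    """
    max_input_id = max(prompt_ids, default=0)
    if max_input_id > max(tokenizer_max_token_id, model_vocab_size - 1):
        raise ValueError(f"Token id {max_input_id} is out of vocabulary")


# Hypothetical sizes: tokenizer knows ids up to 151_642, but the model's
# embedding table has 151_936 rows. Before this fix, id 151_800 would be
# rejected even though the model can embed it; now it passes.
validate_prompt_ids([1, 2, 151_800], 151_642, 151_936)  # no error

# An id beyond both vocabularies is still rejected.
try:
    validate_prompt_ids([1, 2, 152_000], 151_642, 151_936)
except ValueError as e:
    print(e)  # Token id 152000 is out of vocabulary
```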
476488
