Commit f2d8493

[BugFix] Fix ascend scheduler assert error (#3191)
### What this PR does / why we need it?

Running a multimodal model with the Ascend scheduler may trigger an assertion error: `assert (request.num_tokens - request.num_computed_tokens) == 1`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main: vllm-project/vllm@17b4c66

---------

Signed-off-by: fan2956 <[email protected]>
1 parent 68c5401 commit f2d8493

1 file changed: +10 -9 lines changed

vllm_ascend/core/scheduler.py

Lines changed: 10 additions & 9 deletions
```diff
@@ -208,15 +208,16 @@ def skip_cur_request():
 assert num_new_tokens > 0
 blocks = new_computed_blocks.blocks[0]
 
-# Schedule encoder inputs.
-if request.has_encoder_inputs:
-    (encoder_inputs_to_schedule, num_new_tokens,
-     new_encoder_budget) = self._try_schedule_encoder_inputs(
-         request, num_computed_tokens, num_new_tokens,
-         encoder_budget)
-    if num_new_tokens == 0:
-        # The request cannot be scheduled.
-        break
+# Schedule encoder inputs.
+if request.has_encoder_inputs:
+    (encoder_inputs_to_schedule, num_new_tokens,
+     new_encoder_budget) = self._try_schedule_encoder_inputs(
+         request, num_computed_tokens, num_new_tokens,
+         encoder_budget)
+    if num_new_tokens == 0 or len(
+            encoder_inputs_to_schedule) == 0:
+        # The request cannot be scheduled.
+        break
 
 watermark = getattr(self.scheduler_config, "watermark", 0.01)
 if not self._check_watermark_for_prefill(request, num_new_tokens,
```
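
The effect of the widened condition can be illustrated with a small standalone sketch. It does not use vLLM or vllm-ascend: `old_guard_blocks`, `new_guard_blocks`, and `decode_invariant_holds` are hypothetical helpers written for this illustration. The scenario is also an assumption about how the failure arises, namely that the encoder helper shortens `num_new_tokens` to a non-zero value while returning no encoder inputs, so the request is only partially prefilled before it reaches the asserted decode path.

```python
def old_guard_blocks(num_new_tokens, encoder_inputs_to_schedule):
    """Condition before the patch: stop only when no tokens can be scheduled."""
    return num_new_tokens == 0


def new_guard_blocks(num_new_tokens, encoder_inputs_to_schedule):
    """Condition after the patch: also stop when no encoder input was scheduled."""
    return num_new_tokens == 0 or len(encoder_inputs_to_schedule) == 0


def decode_invariant_holds(num_tokens, num_computed_tokens):
    """The invariant quoted in the commit message."""
    return (num_tokens - num_computed_tokens) == 1


# Assumed scenario: a 100-token multimodal prompt whose prefill is cut
# to 40 tokens, with no encoder inputs scheduled in this step.
prompt_len = 100
truncated_new_tokens = 40
encoder_inputs = []

# The old condition lets the request through with a partial prefill ...
scheduled_by_old = not old_guard_blocks(truncated_new_tokens, encoder_inputs)
# ... after which the request is not one token short of complete.
invariant_after_partial = decode_invariant_holds(prompt_len, truncated_new_tokens)

# The new condition stops scheduling this request instead.
blocked_by_new = new_guard_blocks(truncated_new_tokens, encoder_inputs)

print(scheduled_by_old, invariant_after_partial, blocked_by_new)
```

Under these assumptions the old condition admits a request that would later violate the asserted invariant, while the patched condition breaks out of the scheduling loop for it.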
