
Fix for ascend scheduler hang #332

Merged
arginugaTT merged 1 commit into dev from arg/fix-scheduler-hang
Feb 27, 2026

Conversation

@arginugaTT arginugaTT commented Feb 27, 2026

Purpose

Fixes a hang in the benchmark pipeline when multiple similar images are used.

Test Plan

I found an issue with this condition in vLLM 1.0:
https://github.com/tenstorrent/vllm/blob/dev/vllm/v1/core/sched/ascend_scheduler.py#L261

Extra condition len(encoder_inputs_to_schedule) == 0: When a VLM request's encoder output is already cached (from a previous request with the same image), _try_schedule_encoder_inputs correctly returns an empty list (nothing needs recomputation) while keeping num_new_tokens > 0. The base Scheduler correctly schedules the request using the cached encoder outputs. The AscendScheduler incorrectly treats "nothing to compute" as "can't schedule" and breaks out of the scheduling loop.

break instead of continue: The base Scheduler uses continue to skip to the next request. The AscendScheduler uses break to stop scheduling entirely, blocking ALL subsequent requests in the queue forever.

Result: when a VLM request at the head of the waiting queue has a cached encoder output (same image processed earlier), the scheduler permanently deadlocks -- it can never schedule that request or any request behind it.

Fix

Correct the condition to if num_new_tokens == 0:, so a request whose encoder output is already cached (empty encoder_inputs_to_schedule but num_new_tokens > 0) is still scheduled.
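To illustrate, here is a minimal, self-contained sketch of the waiting-queue loop described above. The function name, request shape, and cache representation are hypothetical simplifications, not the actual AscendScheduler code; the buggy pre-fix condition is shown in comments alongside the corrected one.

```python
def schedule_waiting(requests, encoder_cache):
    """Simplified stand-in for a scheduler's waiting-queue loop.

    Each request is a dict with 'id', 'image', and 'num_new_tokens'
    (hypothetical shape for illustration). `encoder_cache` is the set of
    image ids whose encoder outputs are already cached.
    Returns the list of scheduled request ids.
    """
    scheduled = []
    for req in requests:
        # If this image's encoder output is cached, nothing needs
        # recomputation: the list of encoder inputs to schedule is empty,
        # but the request still has decoder tokens to run.
        encoder_inputs_to_schedule = (
            [] if req["image"] in encoder_cache else [req["image"]]
        )
        num_new_tokens = req["num_new_tokens"]

        # Buggy pre-fix condition: treats "nothing to compute" as
        # "can't schedule", and `break` blocks every request behind it:
        #
        #   if num_new_tokens == 0 or len(encoder_inputs_to_schedule) == 0:
        #       break
        #
        # Fixed condition: only skip requests that truly have no work.
        if num_new_tokens == 0:
            continue

        scheduled.append(req["id"])
    return scheduled
```

With the fixed condition, a request at the head of the queue whose image was already processed (encoder output cached) is scheduled normally, and requests behind it are no longer blocked.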

@arginugaTT arginugaTT merged commit 38dee8c into dev Feb 27, 2026
4 checks passed
@arginugaTT arginugaTT deleted the arg/fix-scheduler-hang branch February 27, 2026 21:07
