Fix worker status for unusable terminal backend responses by ushaket · Pull Request #615 · vllm-project/guidellm

ushaket · 2026-02-26T09:37:58Z

Summary

This PR fixes a scheduler correctness bug where requests could be marked as completed even when the backend resolved without a usable terminal response. It adds explicit terminal-response validation in the worker so malformed/empty terminal results are surfaced as errored with a clear diagnostic instead of being counted as successful requests.

Details

Added terminal response validation in WorkerProcess to guard final status transitions.
Updated request finalization logic so unusable terminal responses set:
- status: errored
- error message: [UNUSABLE_BACKEND_RESPONSE] backend resolved without a usable terminal response payload
Implemented GenerationResponse-aware usability criteria:
- usable if non-empty text or output_metrics.total_tokens > 0
- None terminal response is always unusable
- non-GenerationResponse fallback remains bool(response) for generic/test compatibility
Added regression tests in tests/unit/scheduler/test_worker.py for:
- no terminal response object -> errored
- empty GenerationResponse -> errored
- token-bearing GenerationResponse with empty text -> completed
Preserved existing cancellation/error flow behavior outside terminal-response validation.
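
The usability criteria and finalization behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `GenerationResponse` and `OutputMetrics` are simplified stand-in dataclasses, and `is_usable_terminal_response` and `finalize_status` are hypothetical helper names. Only the status values and the error message come from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Tuple

# Error text as quoted in the PR description.
UNUSABLE_RESPONSE_ERROR = (
    "[UNUSABLE_BACKEND_RESPONSE] backend resolved without a usable "
    "terminal response payload"
)


@dataclass
class OutputMetrics:
    """Stand-in for the real output metrics type (assumption)."""

    total_tokens: Optional[int] = None


@dataclass
class GenerationResponse:
    """Simplified stand-in for guidellm's GenerationResponse (assumption)."""

    text: Optional[str] = None
    output_metrics: OutputMetrics = field(default_factory=OutputMetrics)


def is_usable_terminal_response(response: Any) -> bool:
    """Hypothetical helper applying the criteria from the Details list."""
    # A missing terminal response is always unusable.
    if response is None:
        return False
    # GenerationResponse: usable with non-empty text or a positive token count.
    if isinstance(response, GenerationResponse):
        has_text = bool(response.text)
        has_tokens = (response.output_metrics.total_tokens or 0) > 0
        return has_text or has_tokens
    # Any other response type falls back to plain truthiness.
    return bool(response)


def finalize_status(response: Any) -> Tuple[str, Optional[str]]:
    """Return (status, error_message) for a resolved request (hypothetical)."""
    if is_usable_terminal_response(response):
        return "completed", None
    return "errored", UNUSABLE_RESPONSE_ERROR
```

With these stand-ins, `finalize_status(None)` and `finalize_status(GenerationResponse())` both yield an `errored` status, while a response with empty text but `total_tokens=5` is accepted as `completed`.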

Test Plan

Run targeted worker regression tests:
- uv run pytest -q tests/unit/scheduler/test_worker.py -k "terminal_response or empty_generation_response or generation_response_with_tokens or invalid_initialization"
Verify expected outcomes:
- missing terminal response is not marked completed
- empty GenerationResponse is not marked completed
- token-bearing GenerationResponse (output_metrics.total_tokens > 0) is accepted as completed
Confirm no lint issues on touched files.

Related Issues

Resolves #613 (Worker marks completed/successful when final response has no usable data)

"I certify that all code in this PR is my own, except as noted below."

Use of AI

Includes AI-assisted code completion
Includes code generated by an AI application
Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: Uri Shaket <ushaket@redhat.com>

ushaket · 2026-02-26T09:43:33Z

This introduces a worker dependency on GenerationResponse; I'm not sure that's the right way to go.
I added it because the worker needs to know what it is looking for. In the future we might have a ResponseT that doesn't return text and total_tokens, so if we want to keep this check in the worker, it will need to be aware of the different possible ResponseT types.
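
One way the coupling concern above might be addressed is to let each response type report its own usability, so the worker checks a small interface rather than a concrete class. The sketch below is purely illustrative and is not part of this PR: `SupportsUsability`, `TextResponse`, `EmbeddingResponse`, and `worker_accepts` are all hypothetical names.

```python
from typing import Any, List, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsUsability(Protocol):
    """Hypothetical protocol: a response that can judge its own usability."""

    def is_usable(self) -> bool: ...


class TextResponse:
    """Hypothetical text-style response with text and a token count."""

    def __init__(self, text: str = "", total_tokens: int = 0) -> None:
        self.text = text
        self.total_tokens = total_tokens

    def is_usable(self) -> bool:
        return bool(self.text) or self.total_tokens > 0


class EmbeddingResponse:
    """Hypothetical response type that carries no text or token count."""

    def __init__(self, vector: Optional[List[float]] = None) -> None:
        self.vector = list(vector or [])

    def is_usable(self) -> bool:
        return len(self.vector) > 0


def worker_accepts(response: Any) -> bool:
    """Worker-side check that needs no knowledge of concrete response types."""
    if response is None:
        return False
    # Defer to the response when it implements the protocol.
    if isinstance(response, SupportsUsability):
        return response.is_usable()
    # Generic fallback for other objects.
    return bool(response)
```

The trade-off in this sketch is that every new ResponseT would have to implement `is_usable`, but the worker itself would no longer need to change when a new response type is added.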

ushaket added 2 commits February 26, 2026 11:36

initial commit

fca7977

Signed-off-by: Uri Shaket <ushaket@redhat.com>

formating

9ace885

Signed-off-by: Uri Shaket <ushaket@redhat.com>

ushaket marked this pull request as draft February 26, 2026 09:38

Merge branch 'main' into fix/worker-unusable-terminal-response

c74cfbd

ushaket wants to merge 3 commits into vllm-project:main from ushaket:fix/worker-unusable-terminal-response