
[Feature] Support Stage Based Deployment CLI#939

Merged
hsliuustc0106 merged 31 commits into vllm-project:main from wuhang2014:stagecli
Feb 24, 2026

Conversation


@wuhang2014 wuhang2014 commented Jan 25, 2026


Purpose

Background is described in #870.

For now, only single-node deployment with the multiprocessing backend is supported:

  • Multi-node deployment is not supported;
  • The Ray backend is not supported;
  • Data parallelism (DP) for diffusion models is not supported.

Test Plan

model: Qwen3-Omni

deployment CLI:

  • stage-0
CUDA_VISIBLE_DEVICES=4,5,6,7 vllm serve /data/models/Qwen3-Omni-30B-A3B-Instruct/ --omni --stage-id 0 --data-parallel-size 2 --omni-master-address 127.0.0.1 --omni-master-port 33567
  • stage-1
CUDA_VISIBLE_DEVICES=2 vllm serve /data/models/Qwen3-Omni-30B-A3B-Instruct/ --omni --stage-id 1 --headless --omni-master-address 127.0.0.1 --omni-master-port 33567
  • stage-2
CUDA_VISIBLE_DEVICES=3 vllm serve /data/models/Qwen3-Omni-30B-A3B-Instruct/ --omni --stage-id 2 --headless --omni-master-address 127.0.0.1 --omni-master-port 33567

test script:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "What’s in this image?" },
              {
                "type": "image_url",
                "image_url": {
                  "url": "file:///data/wuhang/dog-4988985_960_720.jpg"
                }
              }
            ]
          }
    ],
    "audio": { "voice": "alloy", "format": "wav" }
  }'
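
For reference, the same request can be built and sent from Python with only the standard library. The endpoint, image path, and audio options below are copied from the test plan above; nothing here is part of the PR itself.

```python
import json
import urllib.request

# Same payload as the curl command above: one user message carrying text
# plus an image_url part, and an "audio" block requesting a WAV response.
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "file:///data/wuhang/dog-4988985_960_720.jpg"},
                },
            ],
        }
    ],
    "audio": {"voice": "alloy", "format": "wav"},
}

request = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # stage-0 API server from the test plan
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Sending it requires all three stages to be up:
# response = urllib.request.urlopen(request)
```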

Test Result

(wuhang) (base) root@huawei:/data/wuhang/vllm-omni/examples/online_serving/qwen3_omni# python openai_chat_completion_client_for_multimodal_generation.py --query-type use_image --model /data/models/Qwen3-Omni-30B-A3B-Instruct/ --image-path /data/wuhang/dog-4988985_960_720.jpg 
Chat completion output from text: Based on the image provided, here is a detailed description of its content:

This is a professionally taken, close-up photograph of a happy dog lying in a field of green grass.

*   **Main Subject:** The central focus is a Pembroke Welsh Corgi. It has a classic tan and white coat, with tan fur covering its head, ears, and back, and white fur on its chest, neck, and muzzle.
*   **Expression and Pose:** The corgi is lying down but looking directly at the camera with an alert and joyful expression. Its mouth is open in what appears to be a smile, with its pink tongue slightly visible. Its large, erect ears are pointed forward, indicating it is attentive.
*   **Setting and Lighting:** The dog is in a lush, sunlit grassy area. The lighting suggests it's either early morning or late afternoon (golden hour), casting a warm, soft glow over the scene. The background is softly blurred (a shallow depth of field), showing out-of-focus trees and foliage, which helps to emphasize the dog as the main subject.
*   **Details:** The corgi is wearing a dark green collar around its neck.
Audio saved to audio_0.wav
(wuhang) (base) root@huawei:/data/wuhang/vllm-omni/examples/online_serving/qwen3_omni# ls -l
total 2920
-rw-r--r-- 1 root root 2918954 Jan 26 08:57 audio_0.wav
-rw-r--r-- 1 root root   19876 Jan 22 12:00 gradio_demo.py
-rw-r--r-- 1 root root   16995 Jan 25 11:14 openai_chat_completion_client_for_multimodal_generation.py
-rw-r--r-- 1 root root    1177 Jan 22 12:00 qwen3_omni_moe_thinking.yaml
-rw-r--r-- 1 root root    7166 Jan 22 12:00 README.md
-rw-r--r-- 1 root root    4359 Jan 22 12:00 run_curl_multimodal_generation.sh
-rwxr-xr-x 1 root root    6123 Jan 22 12:00 run_gradio_demo.sh
(wuhang) (base) root@huawei:/data/wuhang/vllm-omni/examples/online_serving/qwen3_omni# 


@hsliuustc0106 (Collaborator) left a comment:

  1. Silent error handling - multiple `except Exception: pass` blocks

    • Fix: add logging: `except Exception as e: logger.debug(f"Error: {e}")`
  2. Log spam - `logger.info()` in hot paths (line 1466)

    • Fix: change to `logger.debug()`
  3. PR description incomplete - "Test Result" section is empty

    • Fix: add actual test output, performance metrics
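
The first suggested fix can be sketched as follows. This is illustrative only: `close_queue_quietly` and the `queue.close()` call are hypothetical stand-ins, not code from this PR.

```python
import logging

logger = logging.getLogger(__name__)

def close_queue_quietly(queue) -> None:
    """Best-effort cleanup: never raise, but leave a trace for debugging."""
    try:
        queue.close()
    except Exception as e:
        # Instead of a silent `pass`, record what went wrong at debug level.
        logger.debug("Error while closing queue: %s", e)
```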

Copilot AI (Contributor) left a comment:

Pull request overview

This PR implements stage-based deployment CLI support for vLLM-Omni, enabling independent deployment of pipeline stages across processes using ZMQ-based IPC. This is part of the larger effort described in issue #870 to support data parallelism for pipeline stages.

Changes:

  • Added ZMQ-based queue utilities to replace multiprocessing queues for inter-stage communication
  • Implemented headless mode for deploying individual stages independently
  • Added dynamic port allocation and handshake protocol for stage coordination
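
Dynamic port allocation of the kind mentioned above is commonly done by binding to port 0 and reading back the port the OS assigned. This stdlib sketch illustrates the idea; it is not the PR's actual implementation.

```python
import socket

def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an ephemeral port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]  # the port the OS actually assigned
```

Note that the returned port can be taken by another process between this call and the real bind, which is why ZMQ also offers `Socket.bind_to_random_port` to bind and report the port in one step.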

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 47 comments.

| File | Description |
| --- | --- |
| vllm_omni/entrypoints/zmq_utils.py | New file providing the ZMQ queue wrapper and handshake utilities for stage communication |
| vllm_omni/entrypoints/omni_stage.py | Modified to support both ZMQ and multiprocessing queues; added cleanup handlers and queue spec support |
| vllm_omni/entrypoints/omni.py | Added ZMQ context management, a handshake server for stage coordination, and dynamic port allocation |
| vllm_omni/entrypoints/cli/serve.py | Added headless mode and stage-id CLI arguments for independent stage deployment |
| vllm_omni/entrypoints/async_omni.py | Updated cleanup handlers to support ZMQ queues |
| pyproject.toml | Added pyzmq>=25.0.0 dependency |
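
As a rough sketch of the ZMQ-queue idea behind `zmq_utils.py` (names and endpoints here are illustrative, assuming `pyzmq` is installed), a PUSH/PULL socket pair can stand in for a multiprocessing queue between stages:

```python
import zmq

# One shared context. "inproc" keeps this demo inside a single process;
# cross-process stage deployment would use TCP endpoints instead.
ctx = zmq.Context.instance()

sender = ctx.socket(zmq.PUSH)
sender.bind("inproc://stage-queue")   # upstream stage: put()

receiver = ctx.socket(zmq.PULL)
receiver.connect("inproc://stage-queue")  # downstream stage: get()

# put/get roughly mirroring a multiprocessing.Queue
sender.send_pyobj({"stage": 0, "payload": "hidden states"})
item = receiver.recv_pyobj()
```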


@wuhang2014 wuhang2014 force-pushed the stagecli branch 5 times, most recently from 4e7aff3 to 9e39c1f on February 5, 2026 10:25
@wuhang2014 wuhang2014 marked this pull request as ready for review February 5, 2026 10:27
@chatgpt-codex-connector (bot) left a comment:

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff2d5c10ba


@hsliuustc0106 hsliuustc0106 requested a review from Copilot February 5, 2026 15:05
wuhang2014 and others added 23 commits February 24, 2026 09:40
Signed-off-by: wuhang <wuhang6@huawei.com>
Signed-off-by: wuhang <whlbx@hotmail.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@princepride (Collaborator) commented:

@hsliuustc0106 ready for merge?

@hsliuustc0106 hsliuustc0106 merged commit 36b8f80 into vllm-project:main Feb 24, 2026
7 checks passed
lishunyang12 added a commit to lishunyang12/vllm-omni that referenced this pull request Feb 24, 2026

Labels: high priority (high priority issue, needs to be done asap), ready (label to trigger buildkite CI)


8 participants