[Feature] Support Stage Based Deployment CLI #939
hsliuustc0106 merged 31 commits into vllm-project:main
Conversation
hsliuustc0106 left a comment:
- Silent error handling - multiple `except Exception: pass` blocks. Fix: add logging, e.g. `except Exception as e: logger.debug(f"Error: {e}")` (see the sketch after this list).
- Log spam - `logger.info()` in hot paths (line 1466). Fix: change to `logger.debug()`.
- PR description incomplete - the "Test Result" section is empty. Fix: add actual test output and performance metrics.
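For concreteness, a minimal sketch of the suggested logging pattern (the helper name `close_quietly` is illustrative, not from the PR):

```python
import logging

logger = logging.getLogger(__name__)


def close_quietly(resource) -> None:
    """Best-effort cleanup that still leaves a trace in the logs."""
    try:
        resource.close()
    except Exception as e:
        # Previously `except Exception: pass` -- the failure vanished silently.
        logger.debug("Error while closing %r: %s", resource, e)
```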
Force-pushed from 95fffb4 to 0e1105f.
Pull request overview
This PR implements stage-based deployment CLI support for vLLM-Omni, enabling independent deployment of pipeline stages across processes using ZMQ-based IPC. This is part of the larger effort described in issue #870 to support data parallelism for pipeline stages.
Changes:
- Added ZMQ-based queue utilities to replace multiprocessing queues for inter-stage communication (see the sketch after this list)
- Implemented headless mode for deploying individual stages independently
- Added dynamic port allocation and handshake protocol for stage coordination
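The real implementation lives in vllm_omni/entrypoints/zmq_utils.py; the class below is only a rough sketch of the idea (the name, socket types, and put/get interface are my assumptions, not the PR's actual API): a ZMQ-backed queue that mimics the `multiprocessing.Queue` interface, with pyzmq's `bind_to_random_port` covering the dynamic-port-allocation piece.

```python
from typing import Optional

import zmq


class ZmqQueue:
    """Sketch of a put/get queue over ZMQ PUSH/PULL sockets.

    The producer binds to a random free TCP port (dynamic port
    allocation); the resulting address is what a handshake step
    would advertise to the consumer stage in another process.
    """

    def __init__(self, ctx: zmq.Context, address: Optional[str] = None):
        if address is None:
            # Producer side: bind and let ZMQ pick a free port.
            self.socket = ctx.socket(zmq.PUSH)
            port = self.socket.bind_to_random_port("tcp://*")
            self.address = f"tcp://127.0.0.1:{port}"
        else:
            # Consumer side: connect to the producer's advertised address.
            self.socket = ctx.socket(zmq.PULL)
            self.socket.connect(address)
            self.address = address

    def put(self, obj) -> None:  # producer side only
        self.socket.send_pyobj(obj)

    def get(self):  # consumer side only
        return self.socket.recv_pyobj()

    def close(self) -> None:
        self.socket.close(linger=0)
```

In this sketch, a producer stage would create `ZmqQueue(ctx)` and pass `queue.address` to the downstream stage via the handshake; the consumer then constructs `ZmqQueue(ctx, address)` and calls `get()`.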
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 47 comments.
Summary per file:
| File | Description |
|---|---|
| vllm_omni/entrypoints/zmq_utils.py | New file providing ZMQ queue wrapper and handshake utilities for stage communication |
| vllm_omni/entrypoints/omni_stage.py | Modified to support both ZMQ and multiprocessing queues, added cleanup handlers and queue spec support |
| vllm_omni/entrypoints/omni.py | Added ZMQ context management, handshake server for stage coordination, and dynamic port allocation |
| vllm_omni/entrypoints/cli/serve.py | Added headless mode and stage-id CLI arguments for independent stage deployment (flag wiring sketched below) |
| vllm_omni/entrypoints/async_omni.py | Updated cleanup handlers to support ZMQ queues |
| pyproject.toml | Added pyzmq>=25.0.0 dependency |
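For illustration only (the exact flag names, types, and validation should be checked against serve.py; this is not the PR's code), the two new arguments might be wired roughly like this:

```python
from argparse import ArgumentParser

parser = ArgumentParser(description="vLLM-Omni serve (sketch)")
parser.add_argument(
    "--headless",
    action="store_true",
    help="Run a single pipeline stage without the API frontend, "
    "coordinating with the driver process over ZMQ.",
)
parser.add_argument(
    "--stage-id",
    type=int,
    default=None,
    help="Index of the pipeline stage this process should run.",
)
args = parser.parse_args()
if args.headless and args.stage_id is None:
    parser.error("--headless requires --stage-id")
```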
Force-pushed from 4e7aff3 to 9e39c1f.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ff2d5c10ba
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: wuhang <wuhang6@huawei.com>
Signed-off-by: wuhang <whlbx@hotmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@hsliuustc0106 ready for merge?
Purpose
Background is described in #870.
For now, only single-node, multiprocessing deployment is supported.
Test Plan
model: Qwen3-Omni
deployment CLI:
test script:
Test Result
Essential Elements of an Effective PR Description Checklist
- Update supported_models.md and examples for a new model.