
feat(platforms): add MetaX MACA support and vllm-upstream extra #2596

Open
Dayuxiaoshui wants to merge 6 commits into vllm-project:main from Dayuxiaoshui:main

Conversation

@Dayuxiaoshui

  • Add MacaOmniPlatform, built on vllm_metax, with dedicated maca workers (CUDA-compatible path).

  • Add OmniPlatformEnum.MACA and forward_maca / is_maca branches for the diffusion and int8 paths.
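A minimal sketch of what the enum and dispatch branches named above could look like. The names OmniPlatformEnum.MACA, is_maca, and forward_maca come from this PR; the class layout and function bodies here are illustrative placeholders, not the merged code.

```python
# Illustrative sketch only: OmniPlatformEnum.MACA, is_maca and forward_maca
# are names from this PR, but the bodies below are placeholders.
from enum import Enum


class OmniPlatformEnum(Enum):
    CUDA = "cuda"
    MUSA = "musa"
    MACA = "maca"  # new in this PR


def is_maca(platform: OmniPlatformEnum) -> bool:
    """Dedicated check instead of folding MACA into is_cuda()."""
    return platform is OmniPlatformEnum.MACA


def forward_maca(x):
    """Placeholder MACA forward path (the real path runs CUDA-compatible kernels)."""
    return x


def dispatch_forward(platform: OmniPlatformEnum, x):
    # Diffusion / int8 layers branch per platform, mirroring how MUSA is handled.
    if is_maca(platform):
        return forward_maca(x)
    raise NotImplementedError(f"no forward path sketched for {platform}")
```

The point of the dedicated branch is that MACA stops riding on the CUDA check, so CUDA-only assumptions cannot silently leak into the MetaX path.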

Purpose

  • Add first-class MetaX (MACA) support to vLLM-Omni, mirroring the existing MUSA-style platform layout: a MacaOmniPlatform built on vllm_metax.platform.MacaPlatform, dedicated worker entrypoints, and platform auto-detection that avoids colliding with NVIDIA CUDA when NVML reports discrete NVIDIA GPUs.
  • Extend install routing for maca: requirements/maca.txt, VLLM_OMNI_TARGET_DEVICE=maca in setup.py, and a maca-before-cuda heuristic when vllm_metax is importable.
  • Wire the diffusion / int8 dispatch to is_maca() and forward_maca(), aligning with the other platforms instead of folding maca into is_cuda().
  • Add an optional vllm-upstream extra (vllm==0.19.0) for environments that install upstream vLLM from PyPI, and document that it must not be mixed with vllm-metax in the same environment.
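The auto-detection bullets above could be sketched roughly as follows. The function name and exact ordering are assumptions based on this PR's description (an importable vllm_metax resolves to maca before generic CUDA, with VLLM_OMNI_TARGET_DEVICE as the explicit override); the real setup.py logic is more involved.

```python
# Simplified illustration of the maca-before-cuda detection heuristic
# described in this PR; the function name and structure are assumptions.
import importlib.util
import os


def resolve_target_device() -> str:
    # An explicit VLLM_OMNI_TARGET_DEVICE always wins (also the tie-breaker
    # when multiple platform plugins match).
    override = os.environ.get("VLLM_OMNI_TARGET_DEVICE")
    if override:
        return override
    # Treat an importable vllm_metax as MACA *before* generic CUDA, so a
    # MetaX stack is not misrouted down the NVIDIA path.
    if importlib.util.find_spec("vllm_metax") is not None:
        return "maca"
    if importlib.util.find_spec("torch") is not None:
        return "cuda"  # simplified: the real check also consults NVML for NVIDIA GPUs
    return "cpu"
```

Checking importability rather than hardware keeps the heuristic usable at build time, where no GPU driver may be loaded yet.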

Test Plan

  • Syntax / packaging: python -m compileall -q vllm_omni (plus python -m build with VLLM_OMNI_TARGET_DEVICE=cpu when validating wheel metadata).
  • Unit (CPU, no GPU): pytest tests/distributed/omni_connectors/test_kv_flow.py -q — exercises the omni connector logic without requiring MACA hardware.
  • MACA (vendor stack, post-merge validation on MetaX machines): install aligned mcPyTorch + vLLM + vLLM-metax per the MetaX docs, then pip install -e . with VLLM_OMNI_TARGET_DEVICE=maca (or rely on auto-detection), and smoke-run vllm serve <model> --omni or a minimal omni offline script from the docs.
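The hardware-free portion of the plan above can be run as one script. The commands, paths, and environment variable are taken from the PR text and assume a vllm-omni checkout with dev dependencies installed; this is a convenience transcript, not part of CI.

```shell
# Hardware-free checks from the test plan; assumes a vllm-omni checkout
# with build/test dependencies installed.
set -euo pipefail

# Syntax check of the package tree.
python -m compileall -q vllm_omni

# Optional: validate wheel metadata without a GPU toolchain.
VLLM_OMNI_TARGET_DEVICE=cpu python -m build

# CPU-only unit test exercising the omni connector logic.
pytest tests/distributed/omni_connectors/test_kv_flow.py -q
```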

We do not add MACA-specific CI in this PR because full correctness depends on vendor images and vLLM-metax version alignment, which cannot be reproduced in generic upstream CI without dedicated runners.

Test Result

  • Local / headless: compileall OK; pytest tests/distributed/omni_connectors/test_kv_flow.py — 18 passed (CPU, no libcuda).
  • MACA hardware / vLLM-metax: not executed in this PR's default CI environment; to be confirmed on MetaX CI or a MetaX-provided container with matching vllm_metax + vLLM versions.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs. (Not done in this PR; MACA install path is vendor-documented. Follow-up: add a short GPU install tab for MACA if maintainers want parity with MUSA.)
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft. (Optional follow-up if release process requires a user-facing line for MetaX.)


Dayuxiaoshui and others added 5 commits April 8, 2026 22:05
Remove vllm-upstream dependency and update notes.

Signed-off-by: Dayuxiaoshui <158081477+Dayuxiaoshui@users.noreply.github.com>
…detection (drop undefined _nvidia_nvml_gpu_count_for_setup; maca = vllm_metax + torch.cuda only; use VLLM_OMNI_TARGET_DEVICE if multiple plugins match)
