
feat(platforms): add MetaX MACA support and vllm-upstream extra #2596

Open
Dayuxiaoshui wants to merge 6 commits into vllm-project:main from Dayuxiaoshui:main

Conversation

@Dayuxiaoshui

  • Add MacaOmniPlatform, built on vllm_metax, with dedicated maca workers (CUDA-compatible path).

  • Add OmniPlatformEnum.MACA and forward_maca / is_maca branches for the diffusion and int8 paths.
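A minimal sketch of what the enum and dispatch branches named above could look like. The names OmniPlatformEnum.MACA, is_maca, and forward_maca come from this PR; the class layout and function bodies here are illustrative placeholders, not the merged code.

```python
# Illustrative sketch only: OmniPlatformEnum.MACA, is_maca and forward_maca
# are names from this PR, but the bodies below are placeholders.
from enum import Enum


class OmniPlatformEnum(Enum):
    CUDA = "cuda"
    MUSA = "musa"
    MACA = "maca"  # new in this PR


def is_maca(platform: OmniPlatformEnum) -> bool:
    """Dedicated check instead of folding MACA into is_cuda()."""
    return platform is OmniPlatformEnum.MACA


def forward_maca(x):
    """Placeholder MACA forward path (the real path runs CUDA-compatible kernels)."""
    return x


def dispatch_forward(platform: OmniPlatformEnum, x):
    # Diffusion / int8 layers branch per platform, mirroring how MUSA is handled.
    if is_maca(platform):
        return forward_maca(x)
    raise NotImplementedError(f"no forward path sketched for {platform}")
```

The point of the dedicated branch is that MACA stops riding on the CUDA check, so CUDA-only assumptions cannot silently leak into the MetaX path.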

Purpose

  • Add first-class MetaX (MACA) support to vLLM-Omni, mirroring the existing MUSA-style platform layout: a MacaOmniPlatform built on vllm_metax.platform.MacaPlatform, dedicated worker entrypoints, and platform auto-detection that avoids colliding with NVIDIA CUDA when NVML reports discrete NVIDIA GPUs.
  • Extend install routing for maca: requirements/maca.txt, VLLM_OMNI_TARGET_DEVICE=maca in setup.py, and a maca-before-cuda heuristic when vllm_metax is importable.
  • Wire the diffusion / int8 dispatch to is_maca() and forward_maca(), aligning with the other platforms instead of folding maca into is_cuda().
  • Add an optional vllm-upstream extra (vllm==0.19.0) for environments that install upstream vLLM from PyPI, and document that it must not be mixed with vllm-metax in the same environment.
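The auto-detection bullets above could be sketched roughly as follows. The function name and exact ordering are assumptions based on this PR's description (an importable vllm_metax resolves to maca before generic CUDA, with VLLM_OMNI_TARGET_DEVICE as the explicit override); the real setup.py logic is more involved.

```python
# Simplified illustration of the maca-before-cuda detection heuristic
# described in this PR; the function name and structure are assumptions.
import importlib.util
import os


def resolve_target_device() -> str:
    # An explicit VLLM_OMNI_TARGET_DEVICE always wins (also the tie-breaker
    # when multiple platform plugins match).
    override = os.environ.get("VLLM_OMNI_TARGET_DEVICE")
    if override:
        return override
    # Treat an importable vllm_metax as MACA *before* generic CUDA, so a
    # MetaX stack is not misrouted down the NVIDIA path.
    if importlib.util.find_spec("vllm_metax") is not None:
        return "maca"
    if importlib.util.find_spec("torch") is not None:
        return "cuda"  # simplified: the real check also consults NVML for NVIDIA GPUs
    return "cpu"
```

Checking importability rather than hardware keeps the heuristic usable at build time, where no GPU driver may be loaded yet.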

Test Plan

  • Syntax / packaging: python -m compileall -q vllm_omni (plus python -m build with VLLM_OMNI_TARGET_DEVICE=cpu when validating wheel metadata).
  • Unit (CPU, no GPU): pytest tests/distributed/omni_connectors/test_kv_flow.py -q — exercises the omni connector logic without requiring MACA hardware.
  • MACA (vendor stack, post-merge validation on MetaX machines): install aligned mcPyTorch + vLLM + vLLM-metax per the MetaX docs, then pip install -e . with VLLM_OMNI_TARGET_DEVICE=maca (or rely on auto-detection), and smoke-run vllm serve <model> --omni or a minimal omni offline script from the docs.
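The hardware-free portion of the plan above can be run as one script. The commands, paths, and environment variable are taken from the PR text and assume a vllm-omni checkout with dev dependencies installed; this is a convenience transcript, not part of CI.

```shell
# Hardware-free checks from the test plan; assumes a vllm-omni checkout
# with build/test dependencies installed.
set -euo pipefail

# Syntax check of the package tree.
python -m compileall -q vllm_omni

# Optional: validate wheel metadata without a GPU toolchain.
VLLM_OMNI_TARGET_DEVICE=cpu python -m build

# CPU-only unit test exercising the omni connector logic.
pytest tests/distributed/omni_connectors/test_kv_flow.py -q
```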

We do not add MACA-specific CI in this PR because full correctness depends on vendor images and vLLM-metax version alignment, which cannot be reproduced in generic upstream CI without dedicated runners.

Test Result

  • Local / headless: compileall OK; pytest tests/distributed/omni_connectors/test_kv_flow.py — 18 passed (CPU, no libcuda).
  • MACA hardware / vLLM-metax: not executed in this PR's default CI environment; to be confirmed on MetaX CI or a MetaX-provided container with matching vllm_metax + vLLM versions.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs. (Not done in this PR; MACA install path is vendor-documented. Follow-up: add a short GPU install tab for MACA if maintainers want parity with MUSA.)
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft. (Optional follow-up if release process requires a user-facing line for MetaX.)


Dayuxiaoshui and others added 5 commits April 8, 2026 22:05
Remove vllm-upstream dependency and update notes.

Signed-off-by: Dayuxiaoshui <158081477+Dayuxiaoshui@users.noreply.github.com>
…detection (drop undefined _nvidia_nvml_gpu_count_for_setup; maca = vllm_metax + torch.cuda only; use VLLM_OMNI_TARGET_DEVICE if multiple plugins match)
