[Bugfix]Support Qwen3-MOE on aclgraph mode in sizes capture and add new ut #2352

lilinsiman · 2025-08-13T07:28:18Z

What this PR does / why we need it?

This PR fixes the sizes-capture and stream errors that occur when running the Qwen3-30B MoE model with ACL graph (aclgraph mode).
It also adds a new unit test (ut).

Does this PR introduce any user-facing change?

no

How was this patch tested?

Unit tests (ut).

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@504d914

github-actions · 2025-08-13T07:28:30Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request addresses an issue with sizes capture for the Qwen3-MOE model in aclgraph mode and adds new unit tests to cover this scenario. The changes in vllm_ascend/utils.py adjust the calculation for maximum batch sizes based on the HCCL_OP_EXPANSION_MODE environment variable. My review focuses on improving the new tests for better isolation and enhancing the readability and maintainability of the calculation logic by addressing magic numbers and style issues. The proposed changes will make the tests more robust and the code easier to understand.

tests/e2e/multicard/test_qwen3_moe.py

vllm_ascend/utils.py
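As an illustration of the kind of calculation the review summary describes, here is a minimal sketch, not the code from this PR. It assumes the maximum number of capture sizes is a budget divided by `(num_hidden_layers + 1)` and by `parallel_factor` (the only part of the expression visible in the diff), and that the `AIV` expansion mode selects a different budget. The function name, the two budget parameters, and the way the branches differ are all hypothetical.

```python
import math


def max_num_batch_sizes(num_hidden_layers: int,
                        parallel_factor: int,
                        expansion_mode: str,
                        default_budget: int,
                        aiv_budget: int) -> int:
    """Hypothetical sketch of an ACL graph capture-size limit.

    default_budget and aiv_budget are placeholder parameters; the real
    values and the real branch logic are not shown in this thread.
    """
    # Assumption: the AIV expansion mode uses a different resource budget.
    budget = aiv_budget if expansion_mode == "AIV" else default_budget
    # Same shape as the expression in the diff: divide by the layer count
    # (plus one) and by the parallel factor, then round down.
    return math.floor(budget / (num_hidden_layers + 1) / parallel_factor)
```

Passing the budgets as named parameters rather than literals is one way to address the review's point about magic numbers.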

wangxiyuan · 2025-08-22T07:31:39Z

vllm_ascend/utils.py

-                                     (num_hidden_layers + 1) / parallel_factor)
-    logger.info("Calculated maximum supported batch sizes for ACL graph: %s",
-                max_num_batch_sizes)
+    if envs.HCCL_OP_EXPANSION_MODE == 'AIV':


Please rebase onto main; the module is named envs_ascend there: https://github.com/vllm-project/vllm-ascend/blob/main/vllm_ascend/utils.py#L34

MengqingCao · 2025-08-22T07:47:34Z

I think this also fixes #2229

Signed-off-by: lilinsiman <[email protected]>

wangxiyuan · 2025-08-22T08:57:47Z

vllm_ascend/envs.py

@@ -55,6 +55,9 @@
    # Please make sure that the version is correct.
    "SOC_VERSION":
    lambda: os.getenv("SOC_VERSION", "ASCEND910B1"),
+    # location for orchestrated deployment of communication algorithms.


This is an environment variable from HCCL, so we should not add it to vllm-ascend. We can set it in the Dockerfile and mention it in the docs.
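One way to follow this suggestion is to read the HCCL variable straight from the process environment instead of registering it in vllm_ascend/envs.py. The sketch below is an assumption about how that could look, not the change that was eventually made; the helper names are hypothetical, and only `HCCL_OP_EXPANSION_MODE` and the value `AIV` come from the diff above.

```python
import os


def hccl_op_expansion_mode(default: str = "") -> str:
    """Return the HCCL expansion mode set by the deployment (hypothetical helper)."""
    # The variable belongs to HCCL, so it is read directly from the
    # environment rather than through a vllm-ascend env registry.
    return os.getenv("HCCL_OP_EXPANSION_MODE", default)


def is_aiv_mode() -> bool:
    """True when the environment selects the AIV expansion mode."""
    return hccl_op_expansion_mode() == "AIV"
```

With this approach the value would be set outside the code, for example in the Dockerfile as the comment suggests.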

codecov · 2025-08-22T10:34:45Z

Codecov Report

❌ Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.32%. Comparing base (950c4b2) to head (6706000).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
vllm_ascend/utils.py	62.50%	3 Missing ⚠️

❌ Your patch status has failed because the patch coverage (62.50%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2352      +/-   ##
==========================================
- Coverage   78.33%   78.32%   -0.02%     
==========================================
  Files         132      132              
  Lines       17778    17783       +5     
==========================================
+ Hits        13926    13928       +2     
- Misses       3852     3855       +3

Flag	Coverage Δ
unittests	`78.32% <62.50%> (-0.02%)`	⬇️
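The percentages in the report can be reproduced from the hit and line counts in the coverage diff. The short sketch below does that arithmetic. The patch size of 8 lines is an inference from "62.50% with 3 lines missing" (3 missing lines are 37.5% of 8) and is not stated in the report; the function name is illustrative.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100.0 * hits / lines, 2)


# Counts taken from the coverage diff (hits + misses = lines).
base = coverage_pct(13926, 17778)   # main
head = coverage_pct(13928, 17783)   # this PR

# Inferred, not reported: 3 missing lines at 62.5% coverage imply 8 patch lines.
patch_lines = 8
patch = coverage_pct(patch_lines - 3, patch_lines)

print(base, head, patch)
```

The patch value is what the check compares against the 80.00% target, which is why the status is reported as failed even though project coverage barely changes.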


github-actions bot added module:tests module:core labels Aug 13, 2025

gemini-code-assist bot reviewed Aug 13, 2025

tests/e2e/multicard/test_qwen3_moe.py

tests/e2e/multicard/test_qwen3_moe.py

vllm_ascend/utils.py (outdated)

vllm_ascend/utils.py (outdated)

lilinsiman force-pushed the main branch from d41a26d to 1eee250 Compare August 13, 2025 07:56

yiz-liu suggested changes Aug 13, 2025

vllm_ascend/utils.py (outdated)

vllm_ascend/utils.py (outdated)

vllm_ascend/utils.py (outdated)

lilinsiman force-pushed the main branch 4 times, most recently from 4414d64 to a84e9a6 Compare August 21, 2025 06:58

wangxiyuan reviewed Aug 22, 2025

fix_A3_ACLgraph_sizes_capture_bug_and_add_new_ut

cdbe054

Signed-off-by: lilinsiman <[email protected]>

lilinsiman force-pushed the main branch from a84e9a6 to cdbe054 Compare August 22, 2025 08:26

wangxiyuan reviewed Aug 22, 2025

Merge branch 'vllm-project:main' into main

6706000

lilinsiman closed this Aug 25, 2025
