[Qwen-moe] Remove the minor operation arange #2373

s-jiayang · 2025-08-14T07:56:50Z

What this PR does / why we need it?

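Consolidate the repeated `torch.arange` calls that build `row_idx` into a single call inside `select_experts`, which cuts the time spent on this minor operator and improves performance.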

Does this PR introduce any user-facing change?

No

How was this patch tested?

vLLM version: v0.10.1.1
vLLM main: vllm-project/vllm@56dcf4e

gemini-code-assist

Code Review

This pull request refactors the creation of row_idx for MoE operations by centralizing the torch.arange call within the select_experts function. This improves code structure by removing redundant code from several expert fusion functions. The changes affect both unquantized and quantized MoE paths, and the necessary function signatures have been updated accordingly. I've found a critical syntax error in the w4a8 dynamic quantization path that needs to be fixed.
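The refactor described above can be illustrated with a minimal, dependency-free sketch. It uses plain Python lists as a stand-in for tensors so it runs without PyTorch. The helper names (`build_row_idx`, `select_experts_sketch`, `fused_experts_sketch`), the function signatures, and the assumed `row_idx` layout are hypothetical and are not taken from the PR diff; the real code uses `torch.arange` on the NPU.

```python
from typing import List, Tuple


def build_row_idx(num_tokens: int, top_k: int) -> List[List[int]]:
    """Hypothetical stand-in for a torch.arange-based row index.

    Assumed layout: entry [t][k] equals k * num_tokens + t, i.e. an arange
    over num_tokens * top_k values viewed as (top_k, num_tokens) and
    then transposed. This layout is an assumption for illustration.
    """
    return [[k * num_tokens + t for k in range(top_k)]
            for t in range(num_tokens)]


def select_experts_sketch(
    router_logits: List[List[float]], top_k: int
) -> Tuple[List[List[float]], List[List[int]], List[List[int]]]:
    """Pick the top_k experts per token and build row_idx once, here."""
    topk_weights: List[List[float]] = []
    topk_ids: List[List[int]] = []
    for logits in router_logits:
        # Expert indices ordered by score, highest first, truncated to top_k.
        ranked = sorted(range(len(logits)),
                        key=lambda e: logits[e], reverse=True)[:top_k]
        topk_ids.append(ranked)
        # Raw scores only; normalization is omitted in this sketch.
        topk_weights.append([logits[e] for e in ranked])
    # Single place where the index is created.
    row_idx = build_row_idx(len(router_logits), top_k)
    return topk_weights, topk_ids, row_idx


def fused_experts_sketch(topk_ids: List[List[int]],
                         row_idx: List[List[int]]) -> List[Tuple[int, int]]:
    """Consumer that receives row_idx as an argument instead of rebuilding it."""
    # Pair each flat row index with the expert that row is routed to.
    return sorted((r, e)
                  for rows, ids in zip(row_idx, topk_ids)
                  for r, e in zip(rows, ids))


logits = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.9]]
weights, ids, row_idx = select_experts_sketch(logits, top_k=2)
pairs = fused_experts_sketch(ids, row_idx)
```

The design point is that every downstream function takes `row_idx` as a parameter, so the index is created once per call to the selector rather than once in each expert fusion function.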

vllm_ascend/quantization/w4a8_dynamic.py

github-actions · 2025-08-14T08:21:50Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

s-jiayang · 2025-08-18T09:45:52Z

I opened this PR quite early, and it has been verified locally with no errors reported. I re-submitted the PR to resolve the conflicts from the previous integration.

tests/e2e/singlecard/ops/test_fused_moe.py

vllm_ascend/ops/layers/experts_selector.py

Signed-off-by: s30076806 <[email protected]>

codecov · 2025-08-25T06:40:57Z

Codecov Report

❌ Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.96%. Comparing base (b3fdd78) to head (8c475f9).
⚠️ Report is 7 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| vllm_ascend/ops/common_fused_moe.py | 0.00% | 1 Missing ⚠️ |
| vllm_ascend/quantization/w4a8_dynamic.py | 0.00% | 1 Missing ⚠️ |
| vllm_ascend/quantization/w8a8_dynamic.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2373      +/-   ##
==========================================
+ Coverage   77.93%   77.96%   +0.02%     
==========================================
  Files         134      134              
  Lines       18504    18499       -5     
==========================================
+ Hits        14422    14423       +1     
+ Misses       4082     4076       -6

| Flag | Coverage Δ |
| --- | --- |
| unittests | `77.96% <89.28%> (+0.02%)` ⬆️ |


Signed-off-by: s30076806 <[email protected]>

gemini-code-assist bot reviewed Aug 14, 2025

vllm_ascend/quantization/w4a8_dynamic.py (outdated)

s-jiayang force-pushed the remove_arange branch from eb1bd29 to 0c6ac41 Compare August 14, 2025 08:02

github-actions bot added module:tests module:ops module:quantization labels Aug 14, 2025

ApsarasX mentioned this pull request Aug 18, 2025

replace arange and permute with the third output of npu_moe_gating_top_k_softmax #2418

Closed

ApsarasX reviewed Aug 18, 2025

tests/e2e/singlecard/ops/test_fused_moe.py (outdated)

vllm_ascend/ops/layers/experts_selector.py

s-jiayang force-pushed the remove_arange branch 2 times, most recently from a23e811 to c466592 Compare August 19, 2025 08:37

wangxiyuan approved these changes Aug 22, 2025

[Qwen-moe] Remove the minor operation arange

5f4adf5

Signed-off-by: s30076806 <[email protected]>

s-jiayang force-pushed the remove_arange branch from c466592 to 5f4adf5 Compare August 22, 2025 09:06

ApsarasX approved these changes Aug 22, 2025

s-jiayang force-pushed the remove_arange branch 5 times, most recently from 0cd1812 to 7da13fc Compare August 25, 2025 06:11

[Qwen-moe] Remove the minor operation arange

6088f52

Signed-off-by: s30076806 <[email protected]>

s-jiayang force-pushed the remove_arange branch from 7da13fc to 6088f52 Compare August 25, 2025 07:15

s-jiayang added 2 commits August 25, 2025 15:26

[Qwen-moe] Remove the minor operation arange

b8197c2

Signed-off-by: s30076806 <[email protected]>

Merge branch 'vllm-project:main' into remove_arange

8c475f9

wangxiyuan merged commit 6a4ec18 into vllm-project:main Aug 27, 2025
36 of 39 checks passed
