
Conversation

@mayuyuace
Collaborator

Add ops moe_align_block_size & batched_moe_align_block_size.
From: https://github.com/vllm-project/vllm/blob/main/csrc/moe/moe_align_sum_kernels.cu

Copilot AI review requested due to automatic review settings October 27, 2025 08:31

Copilot AI left a comment

Pull Request Overview

This PR adds two new MOE (Mixture of Experts) operations: moe_align_block_size and batched_moe_align_block_size, which align token distribution across experts to be compatible with block sizes for matrix multiplication. The implementation is adapted from vLLM's MOE alignment kernels.

Key changes:

  • Implements SYCL/XPU kernels for MOE token alignment with block size constraints
  • Adds comprehensive test coverage for both regular and batched alignment scenarios
  • Includes a utility function for rounding up to block-size multiples (sketched below)
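
As a rough illustration of the round_up helper and the padding arithmetic behind the alignment (a minimal C++ sketch of the same idea; the names and signatures below are illustrative, not the PR's actual code, whose helper lives in tests/utils.py as Python):

#include <cstdint>
#include <vector>

// Round x up to the next multiple of block_size (block_size > 0).
static int64_t round_up(int64_t x, int64_t block_size) {
  return ((x + block_size - 1) / block_size) * block_size;
}

// The aligned token count is the sum of each expert's count rounded up to a
// block boundary; experts with no tokens contribute nothing.
static int64_t padded_token_count(const std::vector<int64_t>& tokens_per_expert,
                                  int64_t block_size) {
  int64_t total = 0;
  for (int64_t count : tokens_per_expert) {
    total += round_up(count, block_size);
  }
  return total;
}

For example, with block_size = 4 and per-expert counts {5, 2, 0}, the padded total is 8 + 4 + 0 = 12, i.e. three blocks.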

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Summary per file:

  • tests/utils.py: Adds a round_up utility function for block-size calculations
  • tests/test_moe_align_block_size.py: Comprehensive test suite covering determinism, expert mapping, and edge cases
  • tests/register_ops.py: Registers the two new MOE alignment operations with PyTorch
  • tests/ops/moe_align_block_size_ops.py: Python wrappers for the MOE alignment operations, with detailed documentation
  • csrc/moe/torch_bindings.cpp: Binds the C++ implementations to PyTorch operators
  • csrc/moe/moe_ops.h: Declares the function signatures for the MOE alignment operations
  • csrc/moe/moe_align_sum_kernels.cpp: Implements the SYCL kernels for MOE token alignment
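
For context, the expected outputs can be shown with a small hand-worked example. This follows the documented behavior of the upstream vLLM kernel the PR is ported from; the port should match, but treat the details (especially the padding sentinel) as assumptions to verify against csrc/moe/moe_align_sum_kernels.cpp:

// Illustrative example (top-1 routing for simplicity):
//   topk_ids   = [[0], [2], [0]]   -> 3 tokens, num_experts = 4
//   block_size = 4
// Per-expert counts: expert 0 has 2 tokens, expert 2 has 1 token.
// Each non-empty expert is padded up to a multiple of block_size, so the
// sorted list holds 4 + 4 = 8 slots; in upstream vLLM the unused slots are
// filled with the sentinel value topk_ids.numel() (shown as PAD).
//   sorted_token_ids    = [0, 2, PAD, PAD, 1, PAD, PAD, PAD]
//   expert_ids          = [0, 2]   -> one expert id per block (valid prefix)
//   num_tokens_post_pad = [8]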


):
"""
Verify that actual_sorted_ids follows the correct expert-level sorting.
The kerne limplementation may or may not preserve original token order

Copilot AI Oct 27, 2025

Corrected spelling of 'kerne limplementation' to 'kernel implementation'.

Suggested change
The kerne limplementation may or may not preserve original token order
The kernel implementation may or may not preserve original token order

Collaborator

@jikunshang jikunshang left a comment

LGTM!

int32_t* temp_storage = static_cast<int32_t*>(
slm.template get_multi_ptr<sycl::access::decorated::no>().get());

int32_t* shared_counts = temp_storage + 1024;
Collaborator

Why 1024 here? Please avoid using magic numbers.

Collaborator Author

temp_storage needs space for 1024 int32 values.
CUDA does not require an explicit declaration for this, but SYCL requires this portion of SLM (shared local memory) to be declared explicitly.
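
One way to address the magic number while keeping the explicit SLM layout (a sketch only; the constant name and its placement are assumptions based on the explanation above, not code from this PR):

// Name the scan-buffer size instead of hard-coding 1024 at the use site. The
// value is assumed to match the 1024-int32 scratch region described above;
// verify against the kernel's actual SLM allocation before adopting.
constexpr int32_t kScanStorageInts = 1024;

int32_t* temp_storage = static_cast<int32_t*>(
    slm.template get_multi_ptr<sycl::access::decorated::no>().get());
int32_t* shared_counts = temp_storage + kScanStorageInts;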
