
Commit 950c4b2

[main] refactor alltoallv in fused_moe (#2487)
### What this PR does / why we need it?
Refactor all2all-related fused_experts (both quantized/unquantized) into TokenDispatcherWithAll2AllV, including dispatch & combine calculation.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
E2E & UT

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@65197a5

Signed-off-by: Pr0Wh1teGivee <[email protected]>
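For readers unfamiliar with the dispatch and combine steps named in the commit message, the following is a minimal single-process sketch of the general idea, assuming top-1 routing. It is not the code from this commit and not the API of `TokenDispatcherWithAll2AllV`: the function names `dispatch`, `run_experts`, and `combine` are hypothetical, and the actual all-to-all-v exchange between devices is replaced by a local loop over experts.

```python
def dispatch(tokens, expert_ids, num_experts):
    """Group tokens by destination expert (hypothetical helper).

    Returns the permuted tokens, the per-expert token counts (the sizes an
    all-to-all-v exchange would use as send splits), and the permutation
    needed to restore the original order later.
    """
    # Python's sort is stable, so tokens keep their relative order per expert.
    order = sorted(range(len(tokens)), key=lambda i: expert_ids[i])
    counts = [0] * num_experts
    for e in expert_ids:
        counts[e] += 1
    permuted = [tokens[i] for i in order]
    return permuted, counts, order


def run_experts(permuted, counts, expert_fns):
    """Apply each expert to its contiguous slice of the permuted tokens."""
    out = []
    start = 0
    for e, n in enumerate(counts):
        out.extend(expert_fns[e](x) for x in permuted[start:start + n])
        start += n
    return out


def combine(expert_outputs, order, weights):
    """Scatter expert outputs back to token order, scaled by router weight."""
    restored = [None] * len(order)
    for pos, src in enumerate(order):
        restored[src] = expert_outputs[pos] * weights[src]
    return restored


# Toy data: four scalar "tokens", two experts, top-1 routing.
tokens = [1.0, 2.0, 3.0, 4.0]
expert_ids = [1, 0, 1, 0]
weights = [1.0, 0.5, 1.0, 2.0]
expert_fns = [lambda x: x * 10, lambda x: x + 1]

permuted, counts, order = dispatch(tokens, expert_ids, num_experts=2)
outputs = run_experts(permuted, counts, expert_fns)
result = combine(outputs, order, weights)
```

In a distributed setting the per-expert counts would determine the variable split sizes of the all-to-all-v exchange, and the saved permutation is what lets the combine step undo the grouping.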
1 parent 4af5b80 commit 950c4b2


2 files changed: +560 -107 lines changed

