feat: support DeepSeek V3.2 W4A8 MoE for mlu and add smoke test. #969
phantomlei3 wants to merge 3 commits into jd-opensource:main
Conversation
Code Review
This pull request adds support for DeepSeek V3.2 W4A8 MoE quantization for MLU, including necessary changes in model loading, quantization argument handling, and the FusedMoE layer implementation. It also includes new smoke tests to verify the functionality. The changes are well-structured and the new feature is accompanied by tests. I have one suggestion to improve the robustness of quantization argument loading to prevent a subtle bug.
```cpp
if (auto v = reader.value<int64_t>("quantization_config.bits")) {
  quant_args_.bits() = v.value();
}
quant_args_.moe_weight_bits() = quant_args_.bits();
```
The current logic for setting moe_weight_bits can lead to unintended behavior. If quantization_config exists but does not contain bits, quant_args_.bits() keeps its default value of 0, so quant_args_.moe_weight_bits() is incorrectly set to 0, overriding its own default of 8. This can cause failures later in a non-obvious way.
To make the logic more robust and explicit, moe_weight_bits should only be updated from bits when bits is explicitly specified in the configuration. This ensures moe_weight_bits retains its sensible default when bits is absent.
```diff
 if (auto v = reader.value<int64_t>("quantization_config.bits")) {
   quant_args_.bits() = v.value();
+  quant_args_.moe_weight_bits() = v.value();
 }
-quant_args_.moe_weight_bits() = quant_args_.bits();
```
Only set moe_weight_bits when quantization_config.bits is present.
6e55719 to c175a2d
No description provided.