[Optimize]Add norm_weights feature for topk_gating_softmax #3372

Merged
merged 3 commits into from
Aug 14, 2025

Contributor

@Sunny-bot1 Sunny-bot1 commented Aug 13, 2025

Feature support

  • Add a fused norm_weights operation to topk_gating_softmax, and select this kernel preferentially
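
For readers unfamiliar with the operation, here is a minimal CPU reference sketch of what a top-k gating softmax with weight normalization computes for a single token. It is not the kernel from this PR. The function name `TopkGatingSoftmaxRef` is hypothetical, and the sketch assumes that norm_weights means rescaling the selected top-k weights so they sum to 1, as the name moe_top_k_normed suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical CPU reference for one token; not the fused GPU kernel.
// Returns (expert index, routing weight) pairs for the top_k experts,
// ordered from highest to lowest probability.
std::vector<std::pair<int, float>> TopkGatingSoftmaxRef(
    const std::vector<float>& logits, int top_k, bool norm_weights) {
  const int num_experts = static_cast<int>(logits.size());
  assert(top_k > 0 && top_k <= num_experts);

  // Numerically stable softmax over all experts.
  const float max_logit = *std::max_element(logits.begin(), logits.end());
  std::vector<float> probs(num_experts);
  float denom = 0.0f;
  for (int e = 0; e < num_experts; ++e) {
    probs[e] = std::exp(logits[e] - max_logit);
    denom += probs[e];
  }
  for (float& p : probs) p /= denom;

  // Pick the top_k experts by probability.
  std::vector<int> order(num_experts);
  std::iota(order.begin(), order.end(), 0);
  std::partial_sort(order.begin(), order.begin() + top_k, order.end(),
                    [&](int a, int b) { return probs[a] > probs[b]; });

  // Assumption: norm_weights rescales the selected weights to sum to 1.
  float selected_sum = 0.0f;
  for (int k = 0; k < top_k; ++k) selected_sum += probs[order[k]];
  const float scale = norm_weights ? 1.0f / selected_sum : 1.0f;

  std::vector<std::pair<int, float>> result;
  result.reserve(top_k);
  for (int k = 0; k < top_k; ++k) {
    result.emplace_back(order[k], probs[order[k]] * scale);
  }
  return result;
}
```

A fused kernel would do the softmax, the selection, and the rescaling in one pass rather than in separate launches, which is presumably where the reported savings over moe_softmax + moe_top_k_normed come from.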

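Performance verification

  • Kernel: compared with moe_softmax + moe_top_k_normed, latency is reduced by 70% at large batch sizes and by 45% at small batch sizes
  • Model: for Qwen3-30B-A3B FP8 at 50 concurrent requests, single-machine TPS and QPS improve by 12%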

paddle-bot bot commented Aug 13, 2025

Thanks for your contribution!

  }
  int64_t tem_num_experts = num_experts;
- if(bias != nullptr || apply_norm_weight) tem_num_experts = 0;
+ if(bias != nullptr) tem_num_experts = 0;
Collaborator

Please add a comment here to explain this; otherwise it looks odd. The intent is that when bias is non-null, only the default logic is taken.

Contributor Author

done
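
To illustrate the dispatch behavior discussed in this thread, here is a small sketch of how setting `tem_num_experts` to 0 can force a switch statement into its default branch. The `GatingPath` enum, the `SelectGatingPath` function, and the list of expert counts handled by the fused kernel are all hypothetical. Only the `tem_num_experts` assignment and the bias check come from the diff above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical labels for the two code paths.
enum class GatingPath { kFusedTopkGatingSoftmax, kDefault };

// Hypothetical helper mirroring the pattern in the diff: a non-null bias
// zeroes tem_num_experts so that no specialized case can match.
GatingPath SelectGatingPath(int64_t num_experts, const float* bias) {
  assert(num_experts > 0);
  int64_t tem_num_experts = num_experts;
  // With a bias present, fall through to the default logic only.
  if (bias != nullptr) tem_num_experts = 0;

  switch (tem_num_experts) {
    // Assumption: the fused kernel is specialized for these expert counts.
    case 2:
    case 4:
    case 8:
    case 16:
    case 32:
    case 64:
    case 128:
    case 256:
      return GatingPath::kFusedTopkGatingSoftmax;
    default:
      return GatingPath::kDefault;
  }
}
```

Because 0 never matches a specialized case, the bias check needs no separate branch, which is why the line reads oddly without an explanatory comment.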

Collaborator

@qingqing01 qingqing01 left a comment

If the operator has unit tests, please add unit tests for this change as well.

@Sunny-bot1 Sunny-bot1 closed this Aug 13, 2025
@Sunny-bot1 Sunny-bot1 reopened this Aug 13, 2025
Contributor Author

Sunny-bot1 commented Aug 13, 2025

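If the operator has unit tests, please add unit tests for this change as well.

The unit tests were already added in the previous PR, #3345.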

@zhoutianzi666 zhoutianzi666 merged commit 2e78311 into PaddlePaddle:develop Aug 14, 2025
16 of 26 checks passed
4 participants