Normalize gates on expert dim before calculating seq_aux_loss #11160

Open
lshpku wants to merge 1 commit into PaddlePaddle:dsv3_dev from lshpku:fix-aux-loss

Commits

Commits on Nov 3, 2025