Commit 38751cb

Authored by phlrain
fix mlp bw split bug (#10932)
Co-authored-by: phlrain <[email protected]>
1 parent e661a3e commit 38751cb

1 file changed, with 2 additions and 1 deletion

paddlenlp/transformers/deepseek_v2/modeling_pp.py

Lines changed: 2 additions & 1 deletion
@@ -870,7 +870,8 @@ def forward_backward(self, inputs, output_grad, combine_bw_event_to_wait=None, p
         dispatch_backward_event = deep_ep.get_event_from_comm_stream(self.backward_node.moe_group.id)
 
         paddle.base.core.nvprof_nvtx_push("dispatch_backward_dw")
-        self.backward_node.mlp_backward_dw()
+        WeightGradStore.pop()
+        assert WeightGradStore.funcs_queue.empty()
         paddle.base.core.nvprof_nvtx_pop()
 
         dispatch_forward_event.calc_stream_wait(self.forward_node.moe_group.id)
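
The diff shows only the call site, so the behavior of WeightGradStore is not visible here. As a rough illustration of why the change pops the store and then asserts that its queue is empty, the following is a minimal sketch that assumes the store is a FIFO queue of deferred weight-gradient callables. The class name WeightGradStoreSketch and the put and flush helpers are hypothetical and may not match the real PaddleNLP implementation; only pop and funcs_queue appear in the commit.

```python
import queue


class WeightGradStoreSketch:
    """Hypothetical stand-in for a deferred weight-gradient store.

    Assumption: weight-gradient (dW) computations are recorded as callables
    during the backward pass and executed later, in batches.
    """

    cache = []                   # callables collected for the current batch
    funcs_queue = queue.Queue()  # batches waiting to be executed

    @classmethod
    def put(cls, func):
        # Record one deferred dW computation (hypothetical helper).
        cls.cache.append(func)

    @classmethod
    def flush(cls):
        # Move the collected callables into the queue as one batch
        # (hypothetical helper).
        cls.funcs_queue.put(cls.cache)
        cls.cache = []

    @classmethod
    def pop(cls):
        # Execute the oldest pending batch of deferred dW computations.
        assert not cls.funcs_queue.empty(), "no deferred dW work is queued"
        for func in cls.funcs_queue.get():
            func()


# Usage: defer two dW computations, then run them at the chosen point.
log = []
WeightGradStoreSketch.put(lambda: log.append("dw_fc1"))
WeightGradStoreSketch.put(lambda: log.append("dw_fc2"))
WeightGradStoreSketch.flush()

WeightGradStoreSketch.pop()
# Mirrors the assertion added in the commit: nothing should be left pending.
assert WeightGradStoreSketch.funcs_queue.empty()
```

Under this assumption, the assertion after pop() acts as a guard: if more than one batch had been queued, the leftover weight-gradient work would be detected immediately rather than silently skipped.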
