replace moe checkpoint dp_world_size with seq_dp_world_size #7732
stas00 merged 2 commits into deepspeedai:master
Why are you proposing to do that? If you need ...
Because of DeepSpeed/deepspeed/runtime/engine.py, line 3908 in 37ad0c0. There is no need to redefine seq_dp_world_size here.
Thank you for explaining; then yes, it checks out. @sfc-gh-truwase, I know you weren't involved in DS-MoE, but shouldn't DeepSpeed/deepspeed/runtime/engine.py (line 3891 in 37ad0c0), DeepSpeed/deepspeed/runtime/engine.py (line 3822 in 37ad0c0) only ...
Replace the MoE checkpoint dp_world_size with seq_dp_world_size to support MoE modules with sequence parallelism.
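For readers unfamiliar with the two sizes, here is a rough sketch of why they can differ. It is not the engine code. It assumes that the sequence-data-parallel group spans both the data-parallel and the sequence-parallel ranks, and it ignores tensor and pipeline parallelism. The helper name `world_sizes` and its parameters are hypothetical and are not part of DeepSpeed.

```python
def world_sizes(total_ranks: int, sp_size: int) -> dict:
    """Hypothetical helper, not a DeepSpeed API.

    Assumption: ranks are split only into data-parallel (dp) and
    sequence-parallel (sp) dimensions, and the combined seq+dp group
    covers dp * sp ranks.
    """
    if sp_size < 1 or total_ranks % sp_size != 0:
        raise ValueError("total_ranks must be a positive multiple of sp_size")
    dp = total_ranks // sp_size
    return {
        "dp_world_size": dp,                # ranks that see different samples
        "seq_dp_world_size": dp * sp_size,  # ranks in the combined seq+dp group
    }


# Without sequence parallelism the two sizes coincide.
no_sp = world_sizes(total_ranks=8, sp_size=1)

# With sequence parallelism they diverge, so code that sizes checkpoint
# state by one value would disagree with code that uses the other.
with_sp = world_sizes(total_ranks=8, sp_size=2)
```

Under these assumptions the two values are equal when `sp_size` is 1, which may be why the mismatch only shows up once sequence parallelism is enabled.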