RPC: Megatron-LM support #8686
Unanswered · leezu asked this question in DDP / multi-GPU / multi-node · 0 replies
PyTorch Lightning has great support for ZeRO via the DeepSpeed and FairScale plugins. Unfortunately, ZeRO-3 runs slower than ZeRO-2, so it is hard to keep good throughput when scaling to very large models. Megatron-LM, on the other hand, can sustain throughput comparable to ZeRO-2, so a Megatron-LM integration would be useful in PyTorch Lightning.
I see #8101 removed the existing RPC plugin. I'd like to start a discussion about adding it back, perhaps supporting both the torch 1.9 Pipe API and Megatron-LM?
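For context on the ZeRO-2 vs ZeRO-3 tradeoff above: with the DeepSpeed plugin, the ZeRO stage is selected by a single field in the DeepSpeed config. A minimal sketch (field names follow the DeepSpeed config schema; the values here are illustrative, not tuned):

```json
{
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```

Setting `"stage": 3` additionally partitions the model parameters (not just optimizer states and gradients), which is what enables the larger model sizes but also introduces the extra parameter gather/scatter traffic that costs throughput relative to stage 2.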