How to implement parallel SGD with multiple GPUs? #6844
Unanswered · shenzebang asked this question in Q&A · Replies: 0 comments
Hi guys,
I am wondering how we can implement parallel SGD efficiently over multiple GPUs. If there is only a single GPU, I know that we could simply use vmap to parallelize every SGD step (let us say we are running $v$ SGDs on a single GPU). Now suppose that I want to run $v * p$ SGDs over $p$ GPUs. How can I do that efficiently? Do I compose pmap with vmap? This strategy seems to be a bit complicated, especially if we need to be able to adapt to different $p$.
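One way the composition could look, as a minimal sketch: `vmap` vectorizes the $v$ instances on each device, and `pmap` shards the result over the $p$ devices. The quadratic `loss`, `sgd_step`, and the shapes below are hypothetical placeholders, not from the original question; `jax.local_device_count()` lets the same code adapt to different $p$.

```python
import jax
import jax.numpy as jnp

# Hypothetical quadratic loss standing in for any per-instance objective.
def loss(w, x):
    return jnp.sum((w - x) ** 2)

# One SGD step for a single problem instance.
def sgd_step(w, x, lr=0.1):
    return w - lr * jax.grad(loss)(w, x)

p = jax.local_device_count()   # adapts to however many devices are present
v = 4                          # SGD instances per device (assumed value)
d = 3                          # parameter dimension (assumed value)

w = jnp.zeros((p, v, d))       # leading axis is the pmap (device) axis
x = jnp.ones((p, v, d))

# vmap handles the v instances on each device; pmap shards over p devices.
step = jax.pmap(jax.vmap(sgd_step))
w = step(w, x)                 # shape (p, v, d)
```

The key convention is that the arrays carry a leading axis of size `p` for `pmap` and a second axis of size `v` for `vmap`, so the same composed function works for any device count.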
I was also thinking about using pmap to simulate a Map-Reduce scheme. However, since pmap automatically jits the input function, a Python loop over SGD steps gets unrolled during tracing, which leads to extremely long compile times.
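One way around the unrolling, sketched under the same assumed quadratic loss as above: keep the step loop inside the pmapped function as a `jax.lax.fori_loop`, which compiles once regardless of the number of steps instead of being unrolled trace-time step by step.

```python
import jax
import jax.numpy as jnp

# Placeholder objective; any per-instance loss would work the same way.
def loss(w, x):
    return jnp.sum((w - x) ** 2)

def sgd_step(w, x, lr=0.1):
    return w - lr * jax.grad(loss)(w, x)

# The whole training loop lives inside the pmapped function as a
# lax.fori_loop, so pmap's implicit jit compiles one rolled loop
# rather than num_steps unrolled copies of the step.
def run_sgd(w, x):
    num_steps = 100
    body = lambda i, w: jax.vmap(sgd_step)(w, x)
    return jax.lax.fori_loop(0, num_steps, body, w)

p = jax.local_device_count()
w0 = jnp.zeros((p, 4, 3))            # (devices, instances, params), assumed sizes
x = jnp.ones((p, 4, 3))

w_final = jax.pmap(run_sgd)(w0, x)   # converges toward x on this toy loss
```

`jax.lax.scan` would serve equally well if per-step outputs (e.g. a loss trace) need to be collected.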
Best,
Zebang