using dp to speed up training #6751
Unanswered
ironv asked this question in DDP / multi-GPU / multi-node
Replies: 1 comment 2 replies
I am training the model shown here using PyTorch Lightning. My train dataloader looks like this. I am trying to use `dp` and see how that helps (see the sketches after the questions below).

Questions:
1. Should `batch_size` be a multiple of `num_workers`?
2. How do I know whether `num_workers` should be increased, and to what number, without running several experiments? I am not seeing much of a difference when I go up to 8.
3. With `ngpu=1` I get a `RuntimeError: CUDA out of memory` error. While it is good that I can now train larger models, I was hoping to see a reduction in run time. Do I need to use a different setting for that?
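For context, a minimal sketch of the kind of train dataloader being described; since the original snippet is not reproduced above, the dataset, batch size, and worker count here are all placeholder assumptions:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset standing in for the real one (not shown in the question).
dataset = TensorDataset(torch.randn(10_000, 128), torch.randint(0, 10, (10_000,)))

# batch_size and num_workers are the two knobs questions 1 and 2 ask about.
train_loader = DataLoader(
    dataset,
    batch_size=256,   # assumed value; question 1 asks whether this must be a multiple of num_workers
    num_workers=8,    # question 2: raising this beyond 8 showed little difference
    shuffle=True,
    pin_memory=True,  # usually recommended when the training loop runs on GPU
)
```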
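And a sketch of what enabling `dp` looked like in the Lightning 1.x Trainer API current at the time of this discussion; `MyModel` is a placeholder, not the actual model from the question, and it reuses `train_loader` from the sketch above:

```python
import pytorch_lightning as pl
import torch
from torch import nn

# Minimal placeholder LightningModule; the real model is not shown in the question.
class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(128, 10)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.cross_entropy(self.net(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# In Lightning 1.x, accelerator="dp" selected torch.nn.DataParallel: a single
# process that splits each batch across the visible GPUs.
trainer = pl.Trainer(gpus=2, accelerator="dp", max_epochs=10)
trainer.fit(MyModel(), train_loader)
```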
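On question 3: `dp` (torch.nn.DataParallel) keeps a single process that scatters each batch across GPUs and gathers the outputs back on one device, so its speedups are often modest; `ddp` is the commonly recommended alternative when run time matters. A sketch of that switch, under the same placeholder assumptions as above:

```python
# Same placeholder model and dataloader as above; ddp runs one process per GPU,
# which usually scales better than dp's single-process, multi-threaded approach.
trainer = pl.Trainer(gpus=2, accelerator="ddp", max_epochs=10)
trainer.fit(MyModel(), train_loader)
```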