How would one accomplish distributed training for SimCLR? (i.e. the whole batch needs to be aggregated before calculating the loss) #18312
Unanswered · jeffwillette asked this question in DDP / multi-GPU / multi-node · 0 replies
I looked through the SimCLR example here (https://lightning.ai/docs/pytorch/stable/notebooks/course_UvA-DL/13-contrastive-learning.html) and noticed that it trains SimCLR on a single node. PL seems straightforward to extend to multiple GPUs, but that example would compute the loss on each GPU independently, whereas SimCLR should ideally gather all of the output features first and compute the loss over the entire distributed batch.
Is there a straightforward way to accomplish this with PL, e.g. something along the lines of the sketch below?
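To make the question concrete, here is a minimal sketch of what I have in mind, using `LightningModule.all_gather` with `sync_grads=True` so the gathered features stay in the autograd graph. The `DistributedSimCLR` class, the `encoder` attribute, the `nt_xent` helper, and the `(x1, x2)` batch format are all placeholders of my own for illustration, not anything from the linked example:

```python
import torch
import torch.nn.functional as F
import lightning.pytorch as pl


class DistributedSimCLR(pl.LightningModule):
    def __init__(self, encoder, temperature: float = 0.1):
        super().__init__()
        self.encoder = encoder          # backbone + projection head (placeholder)
        self.temperature = temperature

    def training_step(self, batch, batch_idx):
        # Assumes the dataloader yields two augmented views of each image.
        (x1, x2), _ = batch
        z = self.encoder(torch.cat([x1, x2], dim=0))    # (2B, D) on this GPU

        # Gather projections from every process; sync_grads=True keeps the
        # gathered tensors differentiable so the loss sees the full
        # distributed batch instead of just the local shard.
        z_all = self.all_gather(z, sync_grads=True)     # (world_size, 2B, D)
        z_all = z_all.flatten(0, 1)                     # (world_size * 2B, D)

        loss = self.nt_xent(z_all)
        self.log("train_loss", loss, sync_dist=True)
        return loss

    def nt_xent(self, z):
        # NT-Xent over the gathered batch. Within each per-process block of
        # 2B rows, row i and row i + B are the two views of the same image.
        z = F.normalize(z, dim=1)
        sim = z @ z.T / self.temperature
        n = z.size(0)
        mask = torch.eye(n, dtype=torch.bool, device=z.device)
        sim = sim.masked_fill(mask, float("-inf"))      # exclude self-pairs

        b = n // (2 * self.trainer.world_size)          # local batch size B
        idx = torch.arange(n, device=z.device)
        block_start = idx // (2 * b) * (2 * b)
        within = idx % (2 * b)
        pos = block_start + (within + b) % (2 * b)      # index of partner view
        return F.cross_entropy(sim, pos)
```

I'm not sure whether this is the intended pattern, whether `all_gather(sync_grads=True)` scales the gradients the way the single-GPU loss would, or whether Lightning offers a more idiomatic way to do this.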