Compute Loss After Sharing Tensor Across GPUs #7602
Zasder3 asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
I’m currently attempting to write a multi-GPU CLIP training script, but I've hit a wall. Before I can compute the loss, I need two matrices built from whole-batch statistics: the image and text embeddings of the entire batch. Only then can I compute the sub-batch losses. How can I first compute these whole-batch matrices and share them across GPUs before computing the losses?
Answered by Zasder3 on May 20, 2021
The LightningModule method all_gather(Tensor) solved it all!
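For anyone landing here later, this is a rough sketch of how all_gather can be used for this kind of loss; it is not the exact code from my script. The helper name clip_loss_with_all_gather, the temperature value, and the symmetric cross-entropy form are illustrative assumptions. It also assumes every rank has the same local batch size and that the gathered tensor is ordered by rank. Passing sync_grads=True asks Lightning to keep the gather differentiable.

```python
import torch
import torch.nn.functional as F


def clip_loss_with_all_gather(module, image_emb, text_emb, temperature=0.07):
    """Sub-batch contrastive loss computed against the whole global batch.

    `module` is meant to be the LightningModule itself (anything exposing
    `all_gather` and `global_rank` works). Hypothetical helper, not a
    Lightning API. Assumes equal local batch sizes on every rank.
    """
    local_bs, dim = image_emb.shape
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Under DDP, all_gather returns a tensor of shape (world_size, local_bs, dim).
    # sync_grads=True keeps the gathered tensors attached to the autograd graph.
    all_image = module.all_gather(image_emb, sync_grads=True)
    all_text = module.all_gather(text_emb, sync_grads=True)

    # Flatten to (global_bs, dim). The reshape also covers a single-device
    # run where no leading world_size dimension is added.
    all_image = all_image.reshape(-1, dim)
    all_text = all_text.reshape(-1, dim)

    # Position of this rank's samples inside the global batch
    # (assumes the gathered tensor is ordered by rank).
    offset = module.global_rank * local_bs
    targets = torch.arange(local_bs, device=image_emb.device) + offset

    # Local rows scored against every column of the global batch.
    logits_i2t = image_emb @ all_text.t() / temperature
    logits_t2i = text_emb @ all_image.t() / temperature
    loss_i2t = F.cross_entropy(logits_i2t, targets)
    loss_t2i = F.cross_entropy(logits_t2i, targets)
    return 0.5 * (loss_i2t + loss_t2i)
```

Inside training_step this would be called as clip_loss_with_all_gather(self, image_emb, text_emb), where the two embeddings come from encoding the local sub-batch. Each GPU then returns the loss for its own rows only.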