Trying to access global rank or world size freezes my script #17295
Unanswered
EvanZ asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
I've been trying to figure out how to get `IterableDataset` to work on a multi-GPU instance, but I'm very confused by the discussions and code examples I've seen, so that has been difficult. But even something very basic, like just trying to print out the GPU rank during training, seems to give me problems:

I have `training_step` defined in my LightningModel class. Just adding this one print statement freezes the script as soon as it looks like it's going to initialize the processes. You can see where I hit CMD+C here:

Am I calling it wrong, or calling it from an incorrect location that would cause this behavior?
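For context on the `IterableDataset` part of the question, here is a minimal sketch of the usual per-rank sharding idea, written in plain Python so it runs without any GPUs. The helper name `shard_for_rank` is hypothetical, and `rank` and `world_size` are passed in as plain integers. In a Lightning run they would typically be read from `self.global_rank` and `self.trainer.world_size`, but that mapping is an assumption about the setup here, not something shown in the question. The sketch does not address the freeze itself, and it ignores DataLoader worker processes, which would need a second level of sharding.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard_for_rank(items: Iterable[T], rank: int, world_size: int) -> Iterator[T]:
    """Return the items at positions rank, rank + world_size, rank + 2 * world_size, ...

    Hypothetical helper: each process keeps a disjoint, interleaved slice
    of the same underlying stream.
    """
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError(f"invalid rank {rank} for world_size {world_size}")
    return islice(items, rank, None, world_size)


# Simulate two processes reading the same stream of ten items.
for simulated_rank in range(2):
    print(simulated_rank, list(shard_for_rank(range(10), simulated_rank, 2)))
# prints:
# 0 [0, 2, 4, 6, 8]
# 1 [1, 3, 5, 7, 9]
```

Inside an `IterableDataset.__iter__`, the same slicing could wrap whatever iterator produces the samples, so that no two ranks see the same item.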