The main process hangs every time #12562
Unanswered
ShaneTian asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
If I remove
but eight log
So, I do not ensure that Same as predict loop.
This is the main code about the issue:
1.6.0
DeepSpeed
ZeRO stage 2
But every time, the process hangs here:
I found that the GPU memory usage of the 0/7th card is about 300 MB less than that of the other cards.
What happened?
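One way to narrow down where each process is stuck is Python's standard `faulthandler` module, which can write every thread's stack to a file if the process is still running after a timeout. The following is only a minimal sketch and is not taken from the code above: the helper names `open_rank_log`, `arm_hang_watchdog`, and `disarm_hang_watchdog` are hypothetical, and reading the rank from a `LOCAL_RANK` environment variable is an assumption about how the processes were launched.

```python
import faulthandler
import os


def open_rank_log(directory):
    """Open one log file per process (hypothetical helper).

    Assumption: the launcher exports LOCAL_RANK; fall back to "0" if it is unset.
    """
    rank = os.environ.get("LOCAL_RANK", "0")
    path = os.path.join(directory, f"hang_rank{rank}.log")
    # faulthandler needs a real file with a file descriptor, kept open while armed.
    return open(path, "w")


def arm_hang_watchdog(timeout_s, out_file):
    """Dump the stacks of all threads to out_file if still running after timeout_s seconds."""
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=out_file, exit=False)


def disarm_hang_watchdog():
    """Cancel the pending dump once the suspicious section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Arming the watchdog just before the call that hangs and disarming it right after would leave one stack dump per rank only for the processes that never returned, which might show whether a single rank is waiting in a different place from the others.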