How to balance the GPU load #13128
Unanswered
YanhaoWu asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
-
@WYHbxer We have a similar issue reported in #12651, where the memory usage on the 0th GPU was significantly higher than on the other GPUs. Eventually, it turned out that their script had some manual CUDA calls that allocated memory on the 0th GPU. Do you have any such calls in your script?

Also, in general, I'd strongly recommend using one of the more recent releases of PyTorch Lightning, since there have been heaps of bugfixes/improvements since the version (1.1.8) you are using.

Just realised that you cross-posted this question in an issue, #13135. (When you do this, please paste its link here as well to avoid scattering information.)
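As an illustration (a minimal sketch, not the code from either report), the difference between an explicit CUDA call that always lands on GPU 0 and the device-agnostic tensor creation that Lightning expects looks roughly like this:

```python
import torch
import pytorch_lightning as pl


class ExampleModule(pl.LightningModule):
    """Hypothetical module, only to illustrate the device-placement pitfall."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch

        # Problematic: .cuda() / torch.device("cuda") resolves to GPU 0 in
        # every DDP process, so all ranks also allocate memory on the first card.
        # mask = torch.ones(len(x)).cuda()

        # Device-agnostic: new tensors follow the module's own device,
        # so each rank only touches the GPU it was assigned.
        mask = torch.ones(len(x), device=self.device)

        loss = ((self.layer(x).squeeze(-1) - y) * mask).pow(2).mean()
        return loss
```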
-
When I tried to use 8 RTX 3090 cards for training, I found that the first card used more memory than the others (memory-usage screenshot omitted here).
Because of the limit on the first card, I can't increase my batch_size any further. Is there any way to balance the load across the cards?
I used the following code to start the training:
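The original snippet was not captured in this page. As a stand-in, here is a minimal sketch of a typical 8-GPU DDP launch for pytorch-lightning 1.1.8; `MyLightningModule`, `train_dataloader`, and the specific Trainer arguments are assumptions, not the poster's actual code:

```python
import pytorch_lightning as pl

model = MyLightningModule()  # placeholder for the poster's own LightningModule

trainer = pl.Trainer(
    gpus=8,             # use all eight 3090s
    accelerator="ddp",  # 1.1.x API; newer releases use devices=/strategy=
    max_epochs=100,
)
trainer.fit(model, train_dataloader)  # train_dataloader: the poster's DataLoader
```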
I am using pytorch-lightning==1.1.8