Using val_dataloader after training on multiple GPUs seems to return the batches gpu-count times
#15357
Unanswered
daMichaelB asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment 1 reply
-
Hi @daMichaelB! Do you have a full script that reproduces the behaviour?
-
Hello everyone,
I trained a model on multiple GPUs with the following approach:
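The original snippet was not preserved in this page, but a minimal sketch of such a 4-GPU DDP setup might look like the following (assuming `pytorch_lightning` is installed; `model` and `dm` are the user's own objects):

```python
# Hypothetical reconstruction of the multi-GPU training setup.
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,        # one DDP process per GPU
    strategy="ddp",
)
# trainer.fit(model, datamodule=dm)  # model / dm as defined by the user
```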
The training on all 4 GPUs works perfectly and uses almost 100% of each GPU.
After training, I want to compute the loss on the validation set per sample, and I did it like this:
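The actual snippet was also lost in extraction; the per-sample computation might follow a pattern like this stdlib-only sketch, where `model`, `loss_fn`, and the data are toy stand-ins for the user's real objects:

```python
def collect_per_sample_losses(model, loss_fn, val_batches):
    """Return one loss value per validation sample, in order."""
    losses = []
    for inputs, targets in val_batches:
        for x, y in zip(inputs, targets):
            losses.append(loss_fn(model(x), y))
    return losses

# Toy stand-ins: a "model" that doubles its input, squared-error loss.
model = lambda x: 2 * x
loss_fn = lambda pred, target: (pred - target) ** 2
val_batches = [([1, 2], [2, 4]), ([3], [7])]  # 3 samples in 2 batches

losses = collect_per_sample_losses(model, loss_fn, val_batches)
# One loss per sample: [0, 0, 1]
```

The invariant the question relies on is exactly one loss per validation sample, which is what breaks once multiple GPU processes each run this loop.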
Problem
As long as I train on one GPU, this works fine. However, since switching to 4 GPUs, I get:
It seems the val_dataloader now holds the dataset 4 times. I think I am doing something completely wrong, but I cannot really find a solution. I am thankful for any kind of advice. Thank you!
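A likely explanation (an assumption, since the full script is not shown): under DDP, Lightning launches one process per GPU, and if the post-training loop runs in every process over the full val_dataloader, each per-sample result is produced once per process. A stdlib-only sketch of the arithmetic (all names are illustrative):

```python
dataset = list(range(100))  # stand-in for the validation set
world_size = 4              # one process per GPU under DDP

# If each DDP process iterates the full loader, per-sample results
# are collected world_size times in total.
results_per_process = [list(dataset) for _ in range(world_size)]
total = sum(len(r) for r in results_per_process)
assert total == world_size * len(dataset)  # 400 results, not 100
```

A common workaround is to run the post-training evaluation in a single process, e.g. with a fresh single-device `Trainer(devices=1)` or by guarding the loop with `trainer.is_global_zero`.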
Dependencies