trying to overfit on a single image, but train/val losses and metrics are not the same #12178
Unanswered
pini-kop
asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment 5 replies
-
Depends. Do you have any batch norm or dropout layers in your model? You can also debug this by checking the batch and the outputs of your model in each of those step hooks. If the outputs are different, then something must behave differently between train and eval mode.
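For instance (a minimal standalone sketch, not taken from the thread), BatchNorm and Dropout alone are enough to make identical weights produce different outputs for the same batch in train vs. eval mode:

```python
import torch
import torch.nn as nn

# Toy model containing the two usual suspects: BatchNorm and Dropout.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),   # batch statistics in train mode, running statistics in eval mode
    nn.Dropout(0.5),     # active in train mode, disabled in eval mode
)
x = torch.randn(1, 3, 32, 32)

model.train()
out_train = model(x)

model.eval()
out_eval = model(x)

# The outputs (and therefore any losses/metrics computed from them) differ,
# even though the weights are exactly the same.
print(torch.allclose(out_train, out_eval))  # typically False
```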
-
Hi, perhaps I'm missing something.
I have a segmentation task and I'm trying to overfit on a single image, but the train, validation, and test losses (computed on the same image) are not equal to one another, and neither are the metrics I'm calculating. The model is able to overfit; the numbers are just slightly different.
I built a dummy sampler that returns a constant index and passed it to the dataloader (a sketch of the sampler follows the snippet). I'm not applying any transforms (except albumentations.ToTensorV2):
train_sampler = DummySampler(main_indices=[5])
train_dl = DataLoader(train_ds, 1, sampler=train_sampler, drop_last=True, num_workers=os.cpu_count(), pin_memory=True)
The same dataloader is used in the Trainer for both training and validation:
trainer.fit(model, train_dl, train_dl)
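For reference, the sampler itself is roughly the following (a hypothetical sketch; the exact DummySampler implementation isn't shown here):

```python
from torch.utils.data import Sampler

class DummySampler(Sampler):
    """Always yield the same fixed indices, so every batch contains only the chosen sample(s)."""

    def __init__(self, main_indices):
        self.main_indices = main_indices

    def __iter__(self):
        return iter(self.main_indices)

    def __len__(self):
        return len(self.main_indices)
```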
The train_step, validation_step, and test_step are identical inside the model.
I also tried the overfit_batches flag but got the same thing: different numbers for train and validation.
Am I missing something? In this setup, shouldn't the losses (and metric scores) be equal?
Thanks