Resampling a Dataset every epoch through the LightningModule on_train_epoch_start hook #17190
Unanswered
Kominaru asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Hello, I'm trying to confirm my understanding of how the Trainer and DataModules work together, for a somewhat complex PU (Positive-Unlabeled) Learning task I'm working on.
I have defined my DataModule with what I'm guessing is a standard structure:
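Something along these lines (a minimal sketch: `PUDataModule`, the toy tensors, and the hyperparameter values are illustrative placeholders rather than the exact code):

```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader


class PUDataModule(pl.LightningDataModule):
    def __init__(self, batch_size: int = 256, num_workers: int = 4):
        super().__init__()
        self.batch_size = batch_size
        self.num_workers = num_workers

    def setup(self, stage=None):
        # Toy tensors stand in for the real data loading.
        positives = torch.randn(1_000, 16)
        unlabeled_pool = torch.randn(50_000, 16)
        # ResamplingPUDataset is the Dataset subclass sketched just below.
        self.train_dataset = ResamplingPUDataset(positives, unlabeled_pool)

    def train_dataloader(self):
        return DataLoader(
            self.train_dataset,
            batch_size=self.batch_size,
            shuffle=True,
            num_workers=self.num_workers,
        )
```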
where self.train_dataset is a Dataset subclass with a "resampling" method, as the learning strategy requires resampling the data every epoch:
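Roughly like this (again a sketch; `ResamplingPUDataset` and the toy positive/unlabeled split are assumptions standing in for the real data):

```python
import torch
from torch.utils.data import Dataset


class ResamplingPUDataset(Dataset):
    """Toy PU dataset: labeled positives plus a fresh draw from the unlabeled pool."""

    def __init__(self, positives: torch.Tensor, unlabeled_pool: torch.Tensor):
        self.positives = positives
        self.unlabeled_pool = unlabeled_pool
        self.resample()

    def resample(self):
        # Redraw which unlabeled examples are treated as negatives for the coming epoch.
        idx = torch.randperm(len(self.unlabeled_pool))[: len(self.positives)]
        sampled_unlabeled = self.unlabeled_pool[idx]
        self.features = torch.cat([self.positives, sampled_unlabeled])
        self.labels = torch.cat(
            [torch.ones(len(self.positives)), torch.zeros(len(sampled_unlabeled))]
        )

    def __len__(self):
        return len(self.features)

    def __getitem__(self, i):
        return self.features[i], self.labels[i]
```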
I run my training loop with
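a standard `Trainer.fit` call on the DataModule (the arguments shown are illustrative placeholders):

```python
import pytorch_lightning as pl

datamodule = PUDataModule(batch_size=256, num_workers=4)
model = PUModel()  # the LightningModule, defined elsewhere

trainer = pl.Trainer(max_epochs=100, accelerator="auto")
trainer.fit(model, datamodule=datamodule)
```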
and try to perform said resampling using the `LightningModule.on_train_epoch_start` hook this way:
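(A sketch of the idea; `PUModel` is a placeholder name, and the hook assumes the attached DataModule is reachable as `self.trainer.datamodule`.)

```python
import pytorch_lightning as pl


class PUModel(pl.LightningModule):
    # ... model definition, training_step, configure_optimizers, etc. ...

    def on_train_epoch_start(self):
        # Reach the attached DataModule through the trainer reference and
        # redraw the unlabeled samples before this epoch's batches are produced.
        self.trainer.datamodule.train_dataset.resample()
```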
I assume there is quite a bit of object referencing going on here, so can I be sure the datamodule's train Dataset is actually resampled and updated for all DataLoaders, especially if I'm training with `num_workers > 0`? Due to the characteristics of the data, it is not really feasible to check this by hand, so I thought someone here would have the proper knowledge.

Thanks for your help!