Don't reset optimizer progress at the end of an epoch #13199
Unanswered
philipbecker asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Basically, I have my optimizers configured with per-optimizer frequencies, like this:
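A minimal sketch of that kind of setup (the submodules, learning rates, and the concrete 1/3 frequency split are placeholders I picked for illustration; one full cycle spans 1 + 3 = 4 training steps):

```python
import torch
from pytorch_lightning import LightningModule


class MyModule(LightningModule):
    def __init__(self):
        super().__init__()
        self.model_a = torch.nn.Linear(32, 32)  # placeholder submodules
        self.model_b = torch.nn.Linear(32, 32)

    def configure_optimizers(self):
        opt_a = torch.optim.Adam(self.model_a.parameters(), lr=1e-3)
        opt_b = torch.optim.Adam(self.model_b.parameters(), lr=1e-3)
        # With the dict format, "frequency" tells Lightning how many consecutive
        # training steps each optimizer should receive before switching to the
        # next one; one full cycle here spans 1 + 3 = 4 batches.
        return [
            {"optimizer": opt_a, "frequency": 1},
            {"optimizer": opt_b, "frequency": 3},
        ]
```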
With this configuration, PyTorch Lightning gives each training step a new batch from the dataset. This behaviour is different from returning just a tuple of optimizers, where every optimizer operates on the same batch. So far, so good.
However, it comes with a behaviour that is (for me) undesirable: Lightning resets the internal optimizer loop at the end of every epoch. In the example above, we need at least 4 batches in the dataset, otherwise the last optimizer(s) will never be executed. More generally, if the number of batches in the dataset is not divisible by 4, the actual frequency of the optimizers does not match the configured frequency, because the last cycle over the optimizers within each epoch is left incomplete.
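To make this concrete, here is a small standalone simulation of that cycling (pure Python, no Lightning internals; it just assumes the frequencies above and that the cycle position is reset at every epoch boundary, which is what I observe):

```python
from collections import Counter

frequencies = [1, 3]   # one full cycle spans sum(frequencies) = 4 batches
num_batches = 6        # batches per epoch, not divisible by 4
num_epochs = 2

counts = Counter()
for _ in range(num_epochs):
    position = 0       # the cycle position is reset at every epoch boundary
    for _ in range(num_batches):
        # Map the position within the cycle to an optimizer index.
        cursor = position % sum(frequencies)
        for opt_idx, freq in enumerate(frequencies):
            if cursor < freq:
                break
            cursor -= freq
        counts[opt_idx] += 1
        position += 1

print(counts)  # Counter({1: 8, 0: 4})
```

With 6 batches per epoch over 2 epochs, the first optimizer gets 4 steps and the second gets 8, a 1:2 ratio instead of the configured 1:3. Without the per-epoch reset, the same 12 batches would give exactly 3 and 9 steps.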
Maybe this is a desirable default behaviour, but I don't care at all about epochs during training; I just want to cycle over these optimizers evenly. Since I do active learning research, my datasets are often small-ish (at least at the beginning of training), so this is an important issue for me. I thought that by writing my own TrainingEpochLoop/OptimizerLoop/OptimizerProgress class I could easily fix this, but so far I have failed to truly understand the workflow.
Here is a colab that demonstrates the problem: https://colab.research.google.com/drive/1LUXjm8r4MODLkkTkPlnjGaxyPtkkXaBj?usp=sharing
I tried removing/modifying certain reset calls in my own Loops, and overwriting how the OptimizerProgress state is set, but to no avail. Could someone give me some pointers on how I can achieve my desired behaviour?
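For reference, here is a rough sketch of the behaviour I am after, expressed with manual optimization (names like compute_loss are placeholders; I would much prefer to keep automatic optimization and the frequency configuration rather than switching to something like this):

```python
import torch
from pytorch_lightning import LightningModule


class ManualCyclingModule(LightningModule):
    def __init__(self):
        super().__init__()
        self.automatic_optimization = False      # take over the cycling myself
        self.model_a = torch.nn.Linear(32, 32)   # placeholder submodules
        self.model_b = torch.nn.Linear(32, 32)
        self.frequencies = [1, 3]
        self._cycle_step = 0                     # never reset at epoch boundaries

    def configure_optimizers(self):
        return [
            torch.optim.Adam(self.model_a.parameters(), lr=1e-3),
            torch.optim.Adam(self.model_b.parameters(), lr=1e-3),
        ]

    def training_step(self, batch, batch_idx):
        # Choose the optimizer from a global step counter, ignoring epochs.
        cursor = self._cycle_step % sum(self.frequencies)
        for opt_idx, freq in enumerate(self.frequencies):
            if cursor < freq:
                break
            cursor -= freq
        self._cycle_step += 1

        opt = self.optimizers()[opt_idx]
        loss = self.compute_loss(batch, opt_idx)  # placeholder loss computation
        opt.zero_grad()
        self.manual_backward(loss)
        opt.step()
```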