Learning Rate finder too strong loss smoothing #13404
hcgasser asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
The learning rate finder slowly increases the learning rate during its search and records how the loss reacts. My understanding is that, in theory, the loss is supposed to stay roughly constant at the beginning and then decrease, before a too-high learning rate leads to divergence.
However, in the callback method _LRCallback.on_batch_end, a smoothed loss is calculated (link below). The problem, in my opinion, is that the smoothing starts with an initial self.avg_loss of zero. This leads to the counterintuitive behavior that the smoothed loss increases with the learning rate at first. If the number of tested learning rates is low, this artifact can cover a wide range of learning rate values, in particular because the default beta is set very high (giving a lot of weight to the past).
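To illustrate what I mean (a minimal sketch of a zero-initialized exponential moving average, not the actual Lightning implementation): even a perfectly flat loss appears to rise over the first steps of the sweep.

```python
# Minimal sketch (not the actual Lightning code): an exponential moving
# average of the loss that starts from zero, as described above.
def smooth_from_zero(losses, beta=0.98):
    avg = 0.0
    smoothed = []
    for loss in losses:
        avg = beta * avg + (1 - beta) * loss
        smoothed.append(avg)
    return smoothed

# A loss that is actually constant at 2.3 ...
print(smooth_from_zero([2.3] * 5))
# ... still "rises" toward 2.3 at first: approx [0.046, 0.091, 0.135, 0.179, 0.221]
```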
I think the self.avg_loss value should be initialized to the first un-smoothed loss instead of zero. What do you think?
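As a sketch of the proposed change (again only illustrative, variable names are my own and not the exact internals):

```python
# Sketch of the proposed fix: seed the moving average with the first
# observed (un-smoothed) loss instead of zero.
def smooth_from_first(losses, beta=0.98):
    avg = losses[0]
    smoothed = []
    for loss in losses:
        avg = beta * avg + (1 - beta) * loss
        smoothed.append(avg)
    return smoothed

print(smooth_from_first([2.3] * 5))  # stays at 2.3, as one would expect
```

An alternative with the same effect would be the usual bias correction (dividing the running average by 1 - beta**(step + 1)), the same way Adam corrects its moment estimates for zero initialization.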
Thank you for looking into this
https://github.com/Lightning-AI/lightning/blob/b84b02400a312240a6429c186cc63514eeb45a82/pytorch_lightning/trainer/lr_finder.py#L374