Fix LR not being correctly set after using LearningRateFinder callback #21068
…en used as callback

Previously, LearningRateFinder applied the suggested LR before restoring the checkpoint, so the optimizer LR was reverted by the restore step. This caused the callback to print “Learning rate set to …” without persisting the change.

Change:
- Move LR application to after checkpoint restore and update both the LM attr and active optimizer param groups so the LR persists for training.

Tests:
- Add unit test [test_lr_finder_callback_applies_lr_after_restore] to assert the optimizer LR matches the LR Finder suggestion after the search completes.

Fixes Lightning-AI#21030
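The ordering problem described above can be illustrated with a toy model. This is only a sketch and not Lightning's actual code: `ToyOptimizer`, `ToyModule`, `apply_lr`, and `run_finder` are hypothetical stand-ins, and the "checkpoint" is just a deep copy of the param groups. It assumes the restore step overwrites optimizer state but leaves the module attribute alone.

```python
import copy


class ToyOptimizer:
    """Hypothetical stand-in for an optimizer; only param_groups matters here."""

    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]

    def state_dict(self):
        return copy.deepcopy(self.param_groups)

    def load_state_dict(self, state):
        self.param_groups = copy.deepcopy(state)


class ToyModule:
    """Hypothetical stand-in for a module that exposes an `lr` attribute."""

    def __init__(self, lr):
        self.lr = lr


def apply_lr(module, optimizer, lr):
    # Update both the module attribute and every active param group.
    module.lr = lr
    for group in optimizer.param_groups:
        group["lr"] = lr


def run_finder(module, optimizer, suggestion, apply_after_restore):
    # Save state before the search, as a checkpoint would.
    checkpoint = optimizer.state_dict()

    # The search sweeps the optimizer LR; the exact value is irrelevant here.
    for group in optimizer.param_groups:
        group["lr"] = 1e-6

    if apply_after_restore:
        # Fixed order: restore first, then apply the suggestion.
        optimizer.load_state_dict(checkpoint)
        apply_lr(module, optimizer, suggestion)
    else:
        # Old order: the restore overwrites the LR that was just applied.
        apply_lr(module, optimizer, suggestion)
        optimizer.load_state_dict(checkpoint)


old_module, old_opt = ToyModule(0.1), ToyOptimizer(0.1)
run_finder(old_module, old_opt, suggestion=0.01, apply_after_restore=False)

new_module, new_opt = ToyModule(0.1), ToyOptimizer(0.1)
run_finder(new_module, new_opt, suggestion=0.01, apply_after_restore=True)

print("old order:", old_module.lr, old_opt.param_groups[0]["lr"])
print("new order:", new_module.lr, new_opt.param_groups[0]["lr"])
```

In the old order the module attribute and the optimizer disagree after the search, which matches the symptom of a "Learning rate set to …" message that has no effect on training. In the new order both hold the suggested value.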
Co-authored-by: Nicki Skafte Detlefsen <[email protected]>
for more information, see https://pre-commit.ci
Added #16787 to issue description as it will be fixed by this PR.
Error: The timeout of 60 minutes has triggered but not all required jobs were passing. This job will need to be re-run to merge your PR. If you do not have write access to the repository you can ask Lightning-AI/lai-frameworks to re-run it for you. If you have any other questions, you can reach out to carmocca for help.

The failed "Probot/required-jobs" check seems to be a timeout and has nothing to do with the fix. Can you retrigger the check?
Yes, this is an aggregation check that expects all individual checks to finish within an hour, so if something takes longer, it needs to be restarted.
…ack (#21068)

* fix(tuner/lr_finder): apply LR suggestion after checkpoint restore when used as callback

  Previously, LearningRateFinder applied the suggested LR before restoring the checkpoint, so the optimizer LR was reverted by the restore step. This caused the callback to print “Learning rate set to …” without persisting the change.

  Change:
  - Move LR application to after checkpoint restore and update both the LM attr and active optimizer param groups so the LR persists for training.

  Tests:
  - Add unit test [test_lr_finder_callback_applies_lr_after_restore] to assert the optimizer LR matches the LR Finder suggestion after the search completes.

* changelog

* Apply suggestions from code review

---------

Co-authored-by: Nicki Skafte Detlefsen <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

(cherry picked from commit 3ed9d4e)
Title
fix(tuner/lr_finder): apply LR suggestion after checkpoint restore when used as callback [Fixes #21030]
Fixes #16787
Summary
Tests
Files changed
Breaking changes
Docs/Changelog
Fixes
📚 Documentation preview 📚: https://pytorch-lightning--21068.org.readthedocs.build/en/21068/