How to extend training from a checkpoint of a completed run #12297
Replies: 2 comments
I found another discussion that suggests loading the weights instead of using the parameter in the Trainer. But I still feel that this should be possible with the Trainer as well. I don't see how `max_epochs` gets overridden to 1.
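For reference, the weights-only route mentioned above can be sketched roughly as follows. This is a minimal sketch, assuming the checkpoint is a dictionary that keeps the model weights under a `state_dict` key (as Lightning checkpoints do) next to trainer state such as `epoch`. The helper name `extract_weights` and the toy checkpoint are made up for illustration.

```python
from typing import Any, Dict


def extract_weights(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the model weights from a checkpoint dictionary.

    Hypothetical helper: assumes the weights live under the "state_dict" key.
    Trainer state (epoch counter, optimizer state, ...) is deliberately dropped,
    so a fresh Trainer starts counting epochs from zero.
    """
    if "state_dict" not in checkpoint:
        raise KeyError("checkpoint has no 'state_dict' entry")
    # Shallow copy so the caller can modify the result without touching the checkpoint.
    return dict(checkpoint["state_dict"])


# Toy stand-in for a loaded checkpoint; real values would be tensors.
toy_checkpoint = {
    "epoch": 20,
    "global_step": 2000,
    "state_dict": {"layer.weight": [0.1, 0.2], "layer.bias": [0.0]},
    "optimizer_states": [{}],
}

weights = extract_weights(toy_checkpoint)
print(sorted(weights))  # ['layer.bias', 'layer.weight']
```

With a real checkpoint the dictionary would come from `torch.load(...)` and the result would go into `model.load_state_dict(...)`. The trade-off is that optimizer and scheduler state are not restored, so this is closer to fine-tuning from the saved weights than to a true resume.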
hey @bsridatta!
I have trained a model with `max_epochs` set to 20. Now I want to continue training it for more epochs. When I pass the checkpoint to `resume_from_checkpoint`/`ckpt_path` and set `max_epochs` to 30, it complains that `max_epochs` was set to 1 and that it is less than 20. Is this expected behaviour because the training is complete? Or is there anything else I need to do to make it work? I have seen a couple of closed PRs regarding this but didn't get a clear answer. Thanks!

pytorch_lightning.utilities.exceptions.MisconfigurationException: You restored a checkpoint with current_epoch=20, but you have set Trainer(max_epochs=1).
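A quick way to narrow this down is to compare the epoch stored in the checkpoint with the `max_epochs` value that actually reaches the Trainer. The following is only a sketch of that comparison, not the library's implementation: it assumes the checkpoint dictionary stores the epoch counter under an `epoch` key, and `epochs_remaining` is a hypothetical helper name.

```python
from typing import Any, Dict


def epochs_remaining(checkpoint: Dict[str, Any], max_epochs: int) -> int:
    """Return how many epochs are left when resuming from `checkpoint`.

    Hypothetical helper: assumes the epoch counter is stored under "epoch".
    Raises ValueError when `max_epochs` is below the restored epoch, which is
    the situation described by the error message in the question.
    """
    completed = checkpoint["epoch"]
    if max_epochs < completed:
        raise ValueError(
            f"checkpoint is at epoch {completed}, but max_epochs={max_epochs}"
        )
    return max_epochs - completed


# Toy stand-in for a checkpoint saved after 20 epochs.
ckpt = {"epoch": 20}

print(epochs_remaining(ckpt, 30))  # 10

try:
    epochs_remaining(ckpt, 1)
except ValueError as err:
    print(err)  # checkpoint is at epoch 20, but max_epochs=1
```

If the second case fires even though 30 was intended, the value is probably being overridden somewhere between the config or command line and the `Trainer(...)` call, so printing `trainer.max_epochs` right after constructing the Trainer may help locate where the 1 comes from.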