Description & Motivation
Hello,
In case we want to resume training from a checkpoint but change, say, the optimizer class or the lr_scheduler class, it seems the global_step becomes 0. Is there a way to keep the global step information and still allow the change? This would be very useful when using a warmup + decay lr_scheduler: on resuming training, we may want to change the number of decay steps. However, if the global step becomes 0 on resume, the model will run the warmup again, which is not intended.
Pitch
No response
Alternatives
No response
Additional context
No response
cc @lantiga @Borda