Change optimizer or lr_scheduler when resuming training without losing the global_step information #20552

@arijit-hub

Description & Motivation

Hello,

If we want to resume training from a checkpoint but change, say, the optimizer class or the lr_scheduler class, it seems the global_step is reset to 0. Is there a way to keep the global_step information and still allow the change? This would be very helpful when using a warmup + decay lr_scheduler: on resuming training we may want to change the number of decay steps, but if the global_step becomes 0, the model redoes the warmup at resume, which is not intended.
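
A minimal sketch of the scenario, for concreteness (the module, dummy data, and checkpoint path are placeholders I made up, not taken from an actual run):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from lightning.pytorch import LightningModule, Trainer


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        # The earlier run used torch.optim.Adam; switched to SGD
        # for the resumed run.
        return torch.optim.SGD(self.parameters(), lr=0.01)


# Dummy data so the snippet is self-contained.
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))),
    batch_size=8,
)

model = LitModel()
trainer = Trainer(max_steps=200)

# Resume from a checkpoint written by the earlier run (placeholder path).
# With the optimizer class changed, the reported behavior is that
# trainer.global_step restarts at 0 instead of continuing from the
# checkpoint, so a warmup + decay scheduler redoes its warmup.
trainer.fit(model, train_loader, ckpt_path="last.ckpt")
```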

Pitch

No response

Alternatives

No response

Additional context

No response

cc @lantiga @Borda
