using EMA with model checkpoints #11276
-
I'm trying to incorporate the pytorch_ema library into the PL training loop. I found one topic relating to using pytorch_ema in Lightning in this discussion thread, but how would this work if I want to save a model checkpoint based on the EMA weights? For example, if I want to save the model weights using just PyTorch, I could do something like
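(a rough sketch of what I mean, assuming torch_ema's `ExponentialMovingAverage` and its `average_parameters()` context manager; the model, decay value, and filename are just placeholders:)

```python
import torch
from torch_ema import ExponentialMovingAverage

model = torch.nn.Linear(10, 2)  # placeholder model
ema = ExponentialMovingAverage(model.parameters(), decay=0.995)

# ... training loop: call ema.update() after each optimizer step ...

# average_parameters() temporarily copies the EMA weights into the model
# and puts the original weights back when the block exits.
with ema.average_parameters():
    torch.save(model.state_dict(), "ema_weights.pt")  # placeholder filename
```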
so that I save the smoothed weights but restore the original weights to the model, so it doesn't affect training. One workaround I can think of is to create my own model saving logic in the …
Replies: 1 comment
-
You can replace the model `state_dict` inside the checkpoint:

```python
from pytorch_lightning import LightningModule

class LitModel(LightningModule):
    ...

    def on_save_checkpoint(self, checkpoint):
        # `ema` is a torch_ema ExponentialMovingAverage tracking the model's parameters;
        # the context manager temporarily loads the EMA weights into the model.
        with ema.average_parameters():
            checkpoint["state_dict"] = self.state_dict()
```
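For completeness, a minimal sketch of where such an `ema` object could live, here stored as an attribute (`self.ema`) and assuming torch_ema; the placeholder network, the decay value, and the choice of `on_before_zero_grad` as the update hook are my own assumptions rather than anything Lightning prescribes:

```python
import torch
from pytorch_lightning import LightningModule
from torch_ema import ExponentialMovingAverage


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Linear(10, 2)  # placeholder network
        # Track EMA copies of the parameters; 0.995 is an arbitrary example decay.
        self.ema = ExponentialMovingAverage(self.model.parameters(), decay=0.995)

    def on_before_zero_grad(self, optimizer):
        # Refresh the shadow weights after every optimizer step.
        self.ema.update()

    def on_save_checkpoint(self, checkpoint):
        # As above: write the smoothed weights into the checkpoint;
        # the original weights are restored when the block exits.
        with self.ema.average_parameters():
            checkpoint["state_dict"] = self.state_dict()
```

Note that the EMA shadow weights themselves aren't captured by `self.state_dict()` this way; if resuming with the EMA state matters, `self.ema.state_dict()` could also be stashed in the checkpoint from the same hook.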