SWA and checkpoint #7922
Replies: 1 comment
-
But don't you want to apply the SWA weights once training is over?
You could implement this yourself. This would be a nice addition to the implementation, so feel free to open a feature request or even start working on a PR yourself!
-
Hello,
I am trying to avoid overwriting the standard weights at the end of training with the SWA callback, and instead save both the standard and SWA models in the checkpoint callback.
For now I have simply created a custom callback `MySwa(StochasticWeightAveraging)` and re-implemented the method `transfer_weights` so that it writes the `_average_model` weights to a second network, `swa_network` (which is a `deepcopy` of the standard network). Yet when I load a checkpoint and restore both `network` and `swa_network`, the weights in `swa_network` seem to be random (while I know the `_average_model` weights are good), as if the copy of the weights to `swa_network` had never made it into the checkpoint.
Any idea how to implement this combination of SWA and checkpointing? I have looked at `on_save_checkpoint` for callbacks, but I could not understand how it should work.
Thanks in advance