SWA and checkpoint #7922
Replies: 1 comment
-
But don't you want to apply the SWA weights once training is over?
You could implement this yourself. This would be a nice addition to the implementation, so feel free to open a feature request or even start working on a PR yourself!
-
Hello,
I am trying to avoid overwriting the standard weights at the end of training with the SWA callback, and instead save both the standard and SWA models in the checkpoint callback.
For now I have simply created a custom callback `MySwa(StochasticWeightAveraging)` and re-implemented the method `transfer_weights` so that it writes the `_average_model` weights to a second network, `swa_network` (which is a `deepcopy` of the standard network). Yet when I load a checkpoint and restore both `network` and `swa_network`, the weights in `swa_network` seem to be random (while I know the `_average_model` weights are good), as if the copy of the weights to `swa_network` had never made it into the checkpoint.
Any idea how to implement this combination of SWA and checkpointing? I have looked at `on_save_checkpoint` for callbacks, but I could not understand how it should work.
Thanks in advance