Not able to save a new part of the model with save_checkpoint #10356
Unanswered · alessiabertugli asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Hi,
I have to load a pre-trained model and add a new part to it to complete the training process. The model is an autoencoder composed of an encoder, a latent code, and a decoder. I have to add a new decoder that is initialised as a copy of the original decoder. I can train the whole model with both decoders, but when I try to save the checkpoint using save_checkpoint, the resulting state_dict does not contain the new decoder's parameters. I added the new decoder as follows:
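The original snippet is not shown here. A minimal sketch of how such a registration typically looks, where the class and attribute names are assumptions rather than the poster's actual code:

```python
import copy
import torch.nn as nn

# Hypothetical autoencoder: a second decoder is added as a deep copy of
# the original one. Assigning it as a plain attribute of an nn.Module
# registers it, so its parameters should appear in state_dict().
class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 2)
        self.decoder = nn.Linear(2, 8)

    def add_second_decoder(self):
        # copy.deepcopy duplicates both the structure and the weights
        self.new_decoder = copy.deepcopy(self.decoder)

model = AutoEncoder()
model.add_second_decoder()
# After this, keys like 'new_decoder.weight' should be in model.state_dict()
```

If the decoder is registered this way, its parameters are serialized like any other submodule's.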
I debugged checkpoint_connector.py in pytorch-lightning and found that the model does contain the new decoder, but its parameters are missing when I call state_dict, as follows:
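A self-contained diagnostic sketch of that kind of check (module names are assumptions): print the registered children next to the state_dict keys; any child whose parameters are missing from state_dict() points at the module that is not being serialized.

```python
import copy
import torch.nn as nn

# Tiny stand-in model with the same shape of problem being debugged.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 2)
        self.decoder = nn.Linear(2, 4)
        self.new_decoder = copy.deepcopy(self.decoder)

net = Net()
print(sorted(net._modules))      # registered child modules
print(sorted(net.state_dict()))  # keys that reach the checkpoint
```

Comparing the two printed lists shows whether a submodule is registered but still absent from the serialized dictionary.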
Looking at the state_dict function (in torch/nn/modules/module.py), I see that self._modules contains the new decoder's parameters, but I can't understand why they are not copied into the destination dictionary. Can anyone help me with this issue, please?
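One common cause of parameters going missing from state_dict (an assumption about this issue, not confirmed by the thread): storing submodules in a plain Python list, which PyTorch does not register; nn.ModuleList fixes that. A minimal illustration:

```python
import torch.nn as nn

# A plain Python list hides submodules from registration, so their
# parameters never reach state_dict(); nn.ModuleList registers them.
class Broken(nn.Module):
    def __init__(self):
        super().__init__()
        self.decoders = [nn.Linear(2, 4)]  # NOT registered

class Fixed(nn.Module):
    def __init__(self):
        super().__init__()
        self.decoders = nn.ModuleList([nn.Linear(2, 4)])  # registered

print(list(Broken().state_dict()))  # []
print(list(Fixed().state_dict()))   # ['decoders.0.weight', 'decoders.0.bias']
```

The same applies to plain dicts versus nn.ModuleDict; anything holding modules must itself be a registered container for checkpointing to see it.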