Loading fine-tuned model built from pretrained subnetworks #10152
Replies: 9 comments 5 replies
-
https://pytorch-lightning.readthedocs.io/en/latest/common/weights_loading.html
-
https://forums.pytorchlightning.ai/t/save-load-model-for-inference/542
-
https://forums.pytorchlightning.ai/t/how-to-load-and-use-model-checkpoint-ckpt/677
-
https://forums.pytorchlightning.ai/t/saving-loading-lightningmodule-with-injected-network/394
-
https://forums.pytorchlightning.ai/t/saving-loading-the-model-for-inference-later/589
-
Hi @Programmer-RD-AI @awaelchli, thanks for the answers. I had already read the Lightning documentation on the basic usage of the loading functions for pre-trained model checkpoints. It doesn't answer my current problem, and among the links listed, the closest discussion to mine is https://forums.pytorchlightning.ai/t/saving-loading-lightningmodule-with-injected-network/394. However, that discussion was never answered, and I am still looking for best practices on this matter, i.e. handling model injection in the LightningModule class and restoring such models. I would highly appreciate it if you could comment on the situation I posted, or ask me for any additional details if I didn't explain the problem clearly enough. Thanks!
-
@adrienchaton If you build the submodule with Model1.load_from_checkpoint(ckpt_1), the hyper-parameters are restored from the checkpoint itself, so you don't need the yaml file as well. One less path to worry about :)
I don't have a good answer for how to get the path of the right yaml file, but you could use the checkpoint path from the trainer: trainer.checkpoint_callback.best_model_path or trainer.checkpoint_callback.last_model_path
So together:

```python
# train model 1
...
ckpt1 = trainer.checkpoint_callback.best_model_path

# train model 2
...
ckpt2 = trainer.checkpoint_callback.best_model_path

# train the combined model built from the two pretrained checkpoints
model = CombinedModel(ckpt1, ckpt2)
...
ckpt3 = trainer.checkpoint_callback.best_model_path
```

Let me know if that's useful for you.
-
Hello everyone,
I would like to ask for confirmation that I am getting the expected behaviour, and whether there are best practices for handling the following situation.
I have two LightningModules, e.g. model_1 and model_2, which I pretrain separately. After saving them I get the pairs (ckpt_1, yaml_1) and (ckpt_2, yaml_2), which hold their trained parameters and hyper-parameters.
Now I put them together into a model, e.g. combined_model, and fine-tune them on the combined_model task.
At the beginning of the fine-tuning I build the model as:
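Schematically it looks like this (a simplified sketch: Model1 / Model2 stand for my two pretrained LightningModule classes, and the constructor arguments are just how I pass the paths around):

```python
import pytorch_lightning as pl


class CombinedModel(pl.LightningModule):
    def __init__(self, ckpt_1, yaml_1, ckpt_2, yaml_2):
        super().__init__()
        # store the paths so they end up in the hyper-parameters of the
        # combined model (and therefore in ckpt_3 / yaml_3)
        self.save_hyperparameters()
        # build the subnetworks from their pretrained checkpoints;
        # hparams_file is only needed if the hyper-parameters are not
        # already stored inside the checkpoints themselves
        self.model_1 = Model1.load_from_checkpoint(ckpt_1, hparams_file=yaml_1)
        self.model_2 = Model2.load_from_checkpoint(ckpt_2, hparams_file=yaml_2)

    # forward / training_step / configure_optimizers for the combined task
    ...


combined_model = CombinedModel(ckpt_1, yaml_1, ckpt_2, yaml_2)
```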
→ combined_model optimizes the trainable parameters of model_1 and model_2, starting from the pretrained checkpoints, right?
After the fine-tuning is done, I have ckpt_3 and yaml_3, which give the fine-tuned parameters and the paths of the pretrained checkpoints used to build combined_model.
Usually I could just restore the fine-tuned model as:
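i.e. something along these lines (sketch):

```python
# restore the fine-tuned combined model; the hyper-parameters saved in
# ckpt_3 / yaml_3 still contain the original ckpt_1 / yaml_1 and
# ckpt_2 / yaml_2 paths, which are used to rebuild the subnetworks
combined_model = CombinedModel.load_from_checkpoint(ckpt_3, hparams_file=yaml_3)
combined_model.eval()
```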
The problem is that I work with remote servers and the paths change between the fine-tuning run and a later test run, so in the end yaml_3 points to the wrong paths for (ckpt_1, yaml_1) and (ckpt_2, yaml_2) when I want to restore the fine-tuned combined_model.
What I do then is manually specify the new paths (ckpt_1bis, yaml_1bis) and (ckpt_2bis, yaml_2bis) when loading:
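i.e. something like this (sketch; it relies on the fact that keyword arguments passed to load_from_checkpoint override the hyper-parameter values saved with the checkpoint):

```python
# rebuild the subnetworks from the paths valid on the new server, then load
# the fine-tuned weights: load_from_checkpoint first instantiates
# CombinedModel with the overridden paths and afterwards loads the
# state_dict of ckpt_3, which overwrites the pretrained weights
combined_model = CombinedModel.load_from_checkpoint(
    ckpt_3,
    hparams_file=yaml_3,
    ckpt_1=ckpt_1bis,
    yaml_1=yaml_1bis,
    ckpt_2=ckpt_2bis,
    yaml_2=yaml_2bis,
)
```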
→ in this case, am I sure to be loading the fine-tuned weights of ckpt_3, and not the pretrained weights of ckpt_1bis and ckpt_2bis?
I think so, but I would like to be sure. Also, are there recommended ways to better handle this situation?
Thanks!