Fine tune a pre-trained model twice #12560
Unanswered
zorikg
asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0 comments
Hello,
I am using a pre-trained model from Hugging Face and would like to fine-tune it on dataset A, save the checkpoint, load it, and continue fine-tuning on dataset B. I can easily do step A (dataset A) but cannot figure out the best way to do step B.
My code follows the pattern sketched below. In step A I pass model_name_or_path='allenai/longformer-base-4096' and it works great. Then in step B I pass my checkpoint path as model_name_or_path
and I get a message (screenshot not reproduced here). It also seems that load_from_checkpoint calls the constructor of QaLongformer with model_name_or_path='allenai/longformer-base-4096' and loads the pre-trained model. When I observe the training, it seems to be training from scratch.
I could use your help in figuring out what I am doing wrong and what the best practice is for two-step fine-tuning of a pre-trained model.
Thanks!
Zorik.