KeyError: 'Trying to restore training state but checkpoint contains only the model. This is probably due to ModelCheckpoint.save_weights_only being set to True.'
#9745
Answered by rohitgr7
morestart asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
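This is my code:

from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(
    monitor='hmean',                       # metric used to rank checkpoints
    mode='max',                            # higher hmean is better
    dirpath='../weights',
    filename='DB-{epoch:02d}-{hmean:.2f}',
    save_last=True,
    save_weights_only=True,                # only the model weights are written
)

When I try to resume training from the checkpoint, I get this error: KeyError: 'Trying to restore training state but checkpoint contains only the model. This is probably due to ModelCheckpoint.save_weights_only being set to True.'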
Answered by
rohitgr7
Sep 29, 2021
Answer selected by
morestart
If you set save_weights_only=True in ModelCheckpoint, it won't save the optimizer/scheduler states. Passing this checkpoint to resume training therefore won't work, because resuming needs to restore the optimizer/scheduler states as well.
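To check up front whether a checkpoint can be used for resuming, something like the sketch below might help. It only inspects an already-loaded checkpoint dictionary. The key names "optimizer_states" and "lr_schedulers" are an assumption about the checkpoint layout and may differ between Lightning versions; the helper names and the sample dictionaries are made up for illustration and are not part of the Lightning API.

```python
from typing import Any, Dict, List

# Keys a full checkpoint is ASSUMED to carry besides the model weights.
# The exact layout can differ between Lightning versions.
TRAINING_STATE_KEYS = ("optimizer_states", "lr_schedulers")


def missing_training_state(checkpoint: Dict[str, Any]) -> List[str]:
    """Return the assumed training-state keys absent from a checkpoint dict."""
    return [key for key in TRAINING_STATE_KEYS if key not in checkpoint]


def can_resume_training(checkpoint: Dict[str, Any]) -> bool:
    """True if none of the assumed training-state keys are missing."""
    return not missing_training_state(checkpoint)


# Stand-in dictionaries (hypothetical, not real checkpoints) mimicking a
# weights-only checkpoint and a full one.
weights_only_ckpt = {"epoch": 3, "global_step": 1200, "state_dict": {}}
full_ckpt = {**weights_only_ckpt, "optimizer_states": [{}], "lr_schedulers": []}

print("weights-only resumable:", can_resume_training(weights_only_ckpt))
print("full resumable:", can_resume_training(full_ckpt))
print("missing keys:", missing_training_state(weights_only_ckpt))
```

With a real file, the dictionary would typically come from torch.load(path, map_location="cpu"). If the check fails, the fix on the training side is to leave save_weights_only at its default of False so that the optimizer/scheduler states are written too; a weights-only checkpoint can still be used to load the model for inference or fine-tuning from scratch.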