How to checkpoint on validation metrics in PyTorch Lightning? #7123
Unanswered
noamzilo
asked this question in
Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
Hi! I am not able to reproduce the issue. This is what I tried:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint
# BoringModel is Lightning's minimal test model; in recent releases it is
# importable from pytorch_lightning.demos.boring_classes
from pytorch_lightning.demos.boring_classes import BoringModel


def test_bug(tmpdir):
    class TestModel(BoringModel):
        def training_step(self, batch, batch_idx):
            self.log("train_batch_idx", -batch_idx, on_step=False, on_epoch=True)
            return super().training_step(batch, batch_idx)

        def validation_step(self, batch, batch_idx):
            self.log("val_batch_idx", -batch_idx, on_step=False, on_epoch=True)
            return super().validation_step(batch, batch_idx)

    model = TestModel()
    mc = ModelCheckpoint(dirpath=tmpdir, monitor="val_batch_idx", save_top_k=3)
    trainer = Trainer(default_root_dir=tmpdir, progress_bar_refresh_rate=0, max_epochs=3, callbacks=[mc])
    trainer.fit(model)
    print(mc.best_k_models)
```

What Lightning version are you using? Can you provide a minimal reproducible example?
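For reference, two things make the snippet above work: the key passed to `monitor` must match the key given to `self.log` exactly, and the metric must be logged with `on_epoch=True` so a value exists when checkpoints are evaluated at the end of the validation epoch. Below is a minimal self-contained sketch of the same pattern without BoringModel; the model, metric names, and hyperparameters are illustrative, not from this discussion:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.callbacks import ModelCheckpoint


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        # on_epoch=True aggregates the metric across the epoch and makes
        # "val_loss" visible to ModelCheckpoint
        self.log("val_loss", loss, on_step=False, on_epoch=True)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


def make_loader():
    x = torch.randn(64, 32)
    y = torch.randint(0, 2, (64,))
    return DataLoader(TensorDataset(x, y), batch_size=16)


# monitor must name the logged key exactly
mc = ModelCheckpoint(monitor="val_loss", mode="min", save_top_k=3)
trainer = Trainer(max_epochs=3, callbacks=[mc])
trainer.fit(LitModel(), make_loader(), make_loader())
```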
Stack Overflow mirror:

I log metrics for both training and validation through a shared helper, __common_epoch_end_report. I verified that __common_epoch_end_report is indeed entered both with mode='train' and with mode='validation'. However, only the metrics logged from train are available for checkpointing, and ModelCheckpoint raises an error when asked to monitor a validation metric. How can I enable checkpointing on validation metrics in PyTorch Lightning?
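In Lightning 1.x (the API current at the time of this discussion; the `*_epoch_end` hooks were removed in 2.0), a shared epoch-end reporter works as long as `self.log` is called from a hook on the LightningModule and the monitored key matches the logged key exactly. Here is a hedged reconstruction of such a layout, since the original fragments are not shown; `LitModel`, `_step`, and the metric names are illustrative:

```python
import torch
import torch.nn.functional as F
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def _step(self, batch):
        x, y = batch
        return {"loss": F.cross_entropy(self.layer(x), y)}

    def training_step(self, batch, batch_idx):
        return self._step(batch)

    def validation_step(self, batch, batch_idx):
        return self._step(batch)

    def __common_epoch_end_report(self, outputs, mode):
        # self.log must run inside a LightningModule hook for the metric to
        # reach ModelCheckpoint; the mode prefix keeps train and validation
        # keys distinct.
        avg_loss = torch.stack([o["loss"] for o in outputs]).mean()
        self.log(f"{mode}_loss", avg_loss)

    def training_epoch_end(self, outputs):
        self.__common_epoch_end_report(outputs, mode="train")

    def validation_epoch_end(self, outputs):
        self.__common_epoch_end_report(outputs, mode="validation")

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)
```

With this layout, ModelCheckpoint(monitor="validation_loss", mode="min") should find the metric; a monitored key that does not match any logged key is a typical cause of the error described above.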