Closed
Labels
checkpointing (Related to checkpointing), docs (Documentation related)
Description
📚 Documentation
There's a lot of documentation out there that says to resume training by passing the resume_from_checkpoint keyword to the PyTorch Lightning Trainer, but this is wrong. In the latest version, you need to pass the path to the checkpoint (the .ckpt file itself) to the trainer's fit function, via its ckpt_path argument, to get it going. Here are some popular incorrect references:
- https://stackoverflow.com/questions/71961436/pytorch-lightning-resuming-from-checkpoint-with-new-data
- https://lightning.ai/forums/t/how-to-resume-training/432
- Resume training from checkpoint with new data #12845
- https://www.youtube.com/watch?v=V5KGEzIwAxQ
ChatGPT and Claude also got this wrong.
I wanted this to get visibility because knowing how to resume training from checkpoints is imperative and there's a lot of wrong information out there!
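For anyone landing here, this is a minimal sketch of what the working call looks like, assuming Lightning 2.x, where the resume_from_checkpoint Trainer argument no longer exists. The helper names latest_checkpoint and resume_fit, and the lightning_logs default directory, are my own illustrative choices and not part of the Lightning API. The only Lightning calls used are Trainer(...) and trainer.fit(model, ckpt_path=...).

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: str) -> Optional[str]:
    """Return the most recently modified .ckpt file under ckpt_dir, or None.

    Hypothetical helper, not part of Lightning.
    """
    root = Path(ckpt_dir)
    if not root.is_dir():
        return None
    # Search recursively, since checkpoints usually sit in nested version folders.
    candidates = sorted(root.rglob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    return str(candidates[-1]) if candidates else None


def resume_fit(model, ckpt_dir: str = "lightning_logs", **trainer_kwargs):
    """Fit `model`, resuming from the newest checkpoint in ckpt_dir if one exists.

    Hypothetical wrapper; `model` is assumed to be a LightningModule that
    defines its own dataloaders.
    """
    # Imported here so the helper above can be used without Lightning installed.
    # On older installs the package is imported as `pytorch_lightning` instead.
    import lightning.pytorch as pl

    # Outdated pattern: pl.Trainer(resume_from_checkpoint="path/to/file.ckpt")
    trainer = pl.Trainer(**trainer_kwargs)

    # Current pattern: pass the .ckpt file path to fit().
    # ckpt_path=None starts training from scratch.
    trainer.fit(model, ckpt_path=latest_checkpoint(ckpt_dir))
    return trainer
```

With this, something like resume_fit(MyModel(), max_epochs=20) would pick up from the newest checkpoint, where MyModel stands in for your own module. If you already know which file you want, you can skip the helper and pass that path straight to ckpt_path.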
cc @Borda @awaelchli