Is there an easy way to test checkpoints every epoch and draw a line chart using a wandb or tensorboard logger? #16762
I have trained my model and saved a checkpoint every epoch. After training, I realized that the validation curve (a line chart) is not reliable because the validation set is too small. I therefore want to validate all the checkpoints on a larger validation set and draw a new line chart. My current solution is to validate each checkpoint in a separate run and then draw the chart with matplotlib, but this is not elegant. Is there an easier way, such as a parameter of Trainer or overriding methods of LightningModule?
Replies: 1 comment
Instantiating a single Trainer and validating all the checkpoints with it solves my problem. The epoch and global step from training time are also restored this way.
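For anyone landing here later, a minimal sketch of that approach might look like the following. It assumes the checkpoints use Lightning's default `epoch=N-step=M.ckpt` naming and that the trainer object exposes `validate(model, dataloaders=..., ckpt_path=...)` as `Trainer` does. The helper names `epoch_of` and `validate_checkpoints` are made up for illustration and are not part of Lightning.

```python
import re
from pathlib import Path

# Assumption: checkpoint files follow the default "epoch=N-step=M.ckpt" naming.
EPOCH_PATTERN = re.compile(r"epoch=(\d+)")


def epoch_of(path):
    """Return the epoch number parsed from a checkpoint filename, or None."""
    match = EPOCH_PATTERN.search(Path(path).name)
    return int(match.group(1)) if match else None


def validate_checkpoints(trainer, model, dataloader, ckpt_dir):
    """Validate every checkpoint in ckpt_dir with one trainer, in epoch order.

    Hypothetical helper: `trainer` is expected to behave like a Lightning
    Trainer, i.e. validate() returns a list with one metrics dict per
    dataloader. Files without an epoch in their name (e.g. last.ckpt)
    are skipped.
    """
    ckpts = [(epoch_of(p), p) for p in Path(ckpt_dir).glob("*.ckpt")]
    ckpts = sorted((epoch, p) for epoch, p in ckpts if epoch is not None)

    results = {}
    for epoch, path in ckpts:
        # The same trainer (and therefore the same logger) is reused each time.
        metrics = trainer.validate(model, dataloaders=dataloader, ckpt_path=str(path))
        results[epoch] = metrics[0] if metrics else {}
    return results
```

Calling this with a real `Trainer` that has a wandb or tensorboard logger attached, the model, and a dataloader over the larger validation set should put all validation results in one run. The returned dict (epoch to metrics) can also be plotted or logged manually if the logger's step axis does not come out as expected.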