Description & Motivation
Hi,
I've been trying to run full validation at step 0, but everything I've tried has failed in some way. I am aware of this stale issue but could not re-open it, so I created this one. Running full validation at step 0 is very useful when finetuning an already well-performing model.
These are the things I've tried:
- Using `trainer.validate()` (added in "Add Trainer.validate(…) method to run one validation epoch" #4948) fails with DDP: if you manually invoke the `validate` method, strange things happen with the dataloaders and the DDP checks fail afterwards.
- Setting `trainer.num_sanity_val_steps` to `-1` so the sanity check runs on the full validation dataset also fails, because the loggers are not properly set up during sanity checking. I tried various versions where I manually set the loggers before the sanity check and even forced them to log, but those also failed and became unnecessarily hacky.
- Temporarily setting `trainer.val_check_interval` to `1` to force validation to happen at step 1 at least, but setting it back to its original value afterwards did not take effect and the trainer kept validating at every step.
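To make the desired behavior concrete, here is a minimal, framework-agnostic sketch (plain Python, not Lightning API): a training loop that runs one full validation pass at step 0, before any optimizer update, then validates on the usual interval. All names here (`ToyModel`, `fit`, `run_validation`, `val_check_interval`) are illustrative, not part of PyTorch Lightning.

```python
class ToyModel:
    """Stand-in for a pretrained model we want to finetune."""

    def __init__(self):
        self.scale = 1.0

    def __call__(self, x):
        return self.scale * x


def train_step(model, batch):
    # Dummy parameter update standing in for an optimizer step.
    model.scale *= 0.9


def run_validation(model, val_data):
    # Aggregate one metric over the FULL validation set.
    return sum(model(x) for x in val_data) / len(val_data)


def fit(model, train_data, val_data, val_check_interval=2):
    # Desired behavior: validate at step 0, before the first update,
    # so the starting quality of the finetuned checkpoint is logged.
    history = [(0, run_validation(model, val_data))]
    for step, batch in enumerate(train_data, start=1):
        train_step(model, batch)
        if step % val_check_interval == 0:
            history.append((step, run_validation(model, val_data)))
    return history
```

With this shape, the step-0 metric reflects the pretrained model untouched, and subsequent entries track finetuning progress at the normal interval.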
I feel like this should be easier to do and maybe I'm missing something.
Thanks in advance.
Pitch
Running validation at step 0 is important for many finetuning pipelines, and it should be easier to do robustly under DDP without hacking around the trainer internals.
Alternatives
No response
Additional context
No response
mnovosad1095