It looks like validation_step runs simultaneously with training at the end of each epoch.

Validation is considered part of the fitting procedure, but it never runs concurrently with training.

As soon as validation_step starts, the percentage of allocated GPU memory shoots up and a RuntimeError: CUDA out of memory occurs. How can I fix it?

It's definitely caused by a bug, either on your end or ours. Can you try to reproduce it?

You can adapt https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report_model.py to do it
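For context, a frequent cause of memory shooting up exactly when validation begins is collecting per-batch tensors (losses, predictions) that still carry their autograd graphs, so the graphs of every batch stay alive on the GPU. A minimal sketch in plain PyTorch of the safe pattern, assuming a hypothetical model and loop (TinyModel, validation_epoch, and the shapes are illustrative, not from this thread):

```python
# Sketch of why validation can OOM and how detaching / no_grad avoids it.
# All names here (TinyModel, validation_epoch) are illustrative assumptions.
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

def validation_epoch(model, batches):
    model.eval()
    losses = []
    # Under torch.no_grad() no autograd graph is built, so appending the
    # loss of every batch does not accumulate graph memory on the GPU.
    # (Equivalently, one could append loss.detach() in a grad-enabled loop.)
    with torch.no_grad():
        for x, y in batches:
            loss = nn.functional.cross_entropy(model(x), y)
            losses.append(loss)  # safe: already graph-free
    return losses

batches = [(torch.randn(8, 32), torch.randint(0, 2, (8,))) for _ in range(4)]
model = TinyModel()
losses = validation_epoch(model, batches)
# None of the collected losses retains a computation graph.
assert all(not l.requires_grad for l in losses)
```

If validation_step returns or logs tensors, the same idea applies: detach them (or log scalars) so Lightning's bookkeeping does not keep the whole graph of each batch alive.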

Answer selected by JongbinWoo