CUDA OOM during validation of first epoch #10959

Dear @mishooax,

You are returning the batch from validation_step, so it gets stored until the end of the validation epoch. Because the batch is still on the GPU, the stored copies accumulate, and after X batches you get an OOM.

Unless you need the batch at epoch end, I would recommend not returning anything from validation_step.

Answer selected by mishooax