I think this is to be expected, right? Note that the memory held by the computation graph is not freed until you either detach the stored tensors from the graph (e.g., via predictions.append(prediction.detach())) or call loss.backward(). Depending on T, keeping the graph for every step may result in OOMs.
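To illustrate the point, here is a minimal sketch, not the original poster's code. The model, the input shapes, and the names `inputs`, `targets`, and `detached_predictions` are hypothetical stand-ins; only `predictions`, `detach()`, `loss.backward()`, and `T` come from the discussion. It assumes each step's prediction is computed independently of the others.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for the real model and data.
model = torch.nn.Linear(4, 2)
T = 5  # number of steps; graph memory grows with T
inputs = torch.randn(T, 1, 4)
targets = torch.randn(T, 1, 2)

# Case 1: store predictions with their graph attached.
# Each tensor keeps a grad_fn, so autograd's saved buffers for all
# T steps stay alive for as long as the list does.
predictions = [model(inputs[t]) for t in range(T)]
kept_graph = all(p.grad_fn is not None for p in predictions)

# Calling backward() (without retain_graph=True) computes gradients
# and releases the buffers autograd saved for the backward pass.
loss = torch.stack(
    [torch.nn.functional.mse_loss(p, targets[t]) for t, p in enumerate(predictions)]
).mean()
loss.backward()
has_grad = model.weight.grad is not None

# Case 2: detach before storing.
# The stored tensors carry no graph, so each step's buffers can be
# freed as soon as that step's temporaries go out of scope.
detached_predictions = [model(inputs[t]).detach() for t in range(T)]
no_graph = all(
    p.grad_fn is None and not p.requires_grad for p in detached_predictions
)
```

If the predictions are only needed for logging or evaluation, detaching them is usually enough. If they feed into the loss, they have to stay attached until `loss.backward()` runs, so the practical options are a smaller T or calling backward more often.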

Answer selected by APJansen