Replies: 1 comment
-
I don't see any obvious memory leak inside your model. I suspect the issue is in your training loop: the PyTorch computation graph may not be freed correctly.
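To illustrate what I mean, here is a minimal sketch of one common way a graph is kept alive. It is not your code: the `nn.Linear` model, the random batches, and all variable names are made up for the example. If outputs that still carry a `grad_fn` are appended to a list (and later passed to `torch.cat`), every stored tensor keeps its graph reachable. Collecting them under `torch.no_grad()`, or storing `loss.item()` instead of the loss tensor, avoids that.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real model and DataLoader.
model = nn.Linear(4, 2)
batches = [torch.randn(8, 4) for _ in range(3)]

# Leaky pattern: each stored output still references its autograd graph,
# so the graph cannot be freed while the list (or the cat result) is alive.
leaky = [model(x) for x in batches]
leaky_cat = torch.cat(leaky)

# Safer pattern for evaluation or logging: record no graph at all
# and move the results to host memory explicitly.
with torch.no_grad():
    safe = [model(x).cpu() for x in batches]
safe_cat = torch.cat(safe)

# Same idea for a running loss: accumulate a Python float, not a tensor.
running_loss = 0.0
target = torch.zeros(8, 2)
for x in batches:
    loss = nn.functional.mse_loss(model(x), target)
    running_loss += loss.item()

print(leaky_cat.grad_fn is not None)  # graph is still attached
print(safe_cat.grad_fn is None)       # nothing attached
```

If your loop keeps per-batch outputs, losses, or embeddings across epochs, checking whether they have a `grad_fn` is a quick first test. This is only one possible cause, so treat it as something to rule out rather than a diagnosis.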
-
I am training the model using `cuda`, but with each epoch the used main memory and swap keep increasing. I tried `gc.collect()`, but it did not resolve the issue. I don't know whether the leak is happening inside the model class or somewhere else. The first epoch starts with 5 GB used, and by around the fifth epoch 32 GB of memory is used, then the program is killed. I am using Python 3.10, torch 2.0.1+cu118, and torch-geometric 2.3.1. I've read elsewhere that tensors held inside lists are sometimes not released; perhaps that happens inside the `torch.cat` function? I used the edge convolution tutorial in the docs for reference, BTW. Here is the code: