Replies: 2 comments 2 replies
-
If you are hitting CUDA OOM with 48GB, there is definitely something suspicious going on. Are you sure you are not holding the whole dataset in GPU memory? Are you using …?
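As a rough illustration of that point, here is a minimal sketch (not a drop-in solution) of mini-batch neighbor sampling where the full graph stays in CPU memory and only the sampled subgraphs are moved to the GPU. The toy graph, the `GraphSAGE` model, the fan-out, the batch size, and the `train_mask` are all placeholder assumptions:

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

# Toy stand-in for the real graph; in practice `data` is your own Data object,
# kept in CPU memory (do NOT call data.to('cuda') on the full graph).
num_nodes = 10_000
data = Data(
    x=torch.randn(num_nodes, 32),
    edge_index=torch.randint(0, num_nodes, (2, 200_000)),
    y=torch.randn(num_nodes),
    train_mask=torch.rand(num_nodes) < 0.8,   # assumed training mask
)

device = torch.device('cuda')
model = GraphSAGE(data.num_features, hidden_channels=64,
                  num_layers=2, out_channels=1).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

loader = NeighborLoader(
    data,
    num_neighbors=[15, 10],        # fan-out per layer (placeholder values)
    batch_size=1024,               # seed nodes per mini-batch
    input_nodes=data.train_mask,
    shuffle=True,
)

model.train()
for batch in loader:
    batch = batch.to(device)       # only the sampled subgraph goes to the GPU
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # Only the first `batch_size` nodes of each sampled batch are seed nodes.
    loss = F.mse_loss(out[:batch.batch_size].squeeze(-1),
                      batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```

With this pattern, GPU memory scales with the batch size and fan-out rather than with the size of the full graph.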
-
Thank you so much for your reply. Our dataset has over 10K large graphs, and each graph has over 200K nodes. I believe I do hold the whole dataset in GPU memory, and I did not use …. I am not familiar with …. Is it correct? Is …?
Thank you.
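For a dataset of many separate graphs like this, one option is to keep the whole dataset on the CPU (or on disk) and let a `DataLoader` move one graph at a time to the GPU. A hedged sketch, with a toy dataset and a placeholder two-layer GCN standing in for the real model:

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv

# Toy stand-in for a dataset of many graphs; in practice this would be your
# own Dataset that keeps graphs on disk or in CPU memory.
dataset = [
    Data(x=torch.randn(500, 16),
         edge_index=torch.randint(0, 500, (2, 2_000)),
         y=torch.randn(500))
    for _ in range(100)
]

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(16, 64)
        self.conv2 = GCNConv(64, 1)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

device = torch.device('cuda')
model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loader = DataLoader(dataset, batch_size=1, shuffle=True)  # one graph per step

model.train()
for batch in loader:
    batch = batch.to(device)   # only this graph lives on the GPU
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index).squeeze(-1)
    loss = F.mse_loss(out, batch.y)
    loss.backward()
    optimizer.step()
    # `batch` goes out of scope here, freeing its GPU memory before the next graph.
```

If a single 200K-node graph still does not fit, neighbor sampling within each graph (as in the sketch above) can reduce memory further.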
-
Hi,
Thank you so much for this helpful package. May I ask how to use multiple GPUs to train one GNN model with PyG? My task is node regression on a large homogeneous undirected graph.
The batch size is 1 and my GPU is an A6000 (48 GB). Training shows CUDA Out of Memory. So I followed the multi-GPU training docs and examples, but it still shows CUDA Out of Memory. The code works well on an A100 (80 GB) but fails on two A6000s. May I ask how to use multiple GPUs to train one GNN model for node regression on a large homogeneous undirected graph?
As I use a Slurm cluster to train models, how should I set up the sbatch file? The example has only two lines; do I need to modify it for my cluster?
Also, do I need to modify the example code?
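For the multi-GPU part, here is a hedged sketch that roughly follows the structure of PyG's multi-GPU examples: one process per GPU with DistributedDataParallel, and the training seed nodes split across ranks. The toy graph, the `GraphSAGE` model, the port, and all hyperparameters are placeholder assumptions; under Slurm you would request the GPUs in the sbatch file and launch this script once with srun.

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel
from torch_geometric.data import Data
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE


def run(rank, world_size, data):
    os.environ.setdefault('MASTER_ADDR', 'localhost')
    os.environ.setdefault('MASTER_PORT', '12355')   # placeholder port
    dist.init_process_group('nccl', rank=rank, world_size=world_size)

    # Split the training seed nodes across the GPUs.
    train_idx = data.train_mask.nonzero(as_tuple=False).view(-1)
    train_idx = train_idx.split(train_idx.size(0) // world_size)[rank]

    loader = NeighborLoader(data, num_neighbors=[15, 10], batch_size=1024,
                            input_nodes=train_idx, shuffle=True)

    torch.cuda.set_device(rank)
    model = GraphSAGE(data.num_features, hidden_channels=64,
                      num_layers=2, out_channels=1).to(rank)
    model = DistributedDataParallel(model, device_ids=[rank])
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

    for epoch in range(10):
        model.train()
        for batch in loader:
            batch = batch.to(rank)          # only the sampled subgraph on GPU
            optimizer.zero_grad()
            out = model(batch.x, batch.edge_index)[:batch.batch_size]
            loss = F.mse_loss(out.squeeze(-1), batch.y[:batch.batch_size])
            loss.backward()
            optimizer.step()

    dist.destroy_process_group()


if __name__ == '__main__':
    # Toy graph; replace with your real Data object (kept in CPU memory).
    num_nodes = 10_000
    data = Data(x=torch.randn(num_nodes, 32),
                edge_index=torch.randint(0, num_nodes, (2, 200_000)),
                y=torch.randn(num_nodes),
                train_mask=torch.rand(num_nodes) < 0.8)

    world_size = torch.cuda.device_count()
    mp.spawn(run, args=(world_size, data), nprocs=world_size, join=True)
```

Note that DDP replicates the model on each GPU and splits the training nodes across them; it does not pool the memory of the two cards, so each mini-batch still has to fit on a single 48 GB GPU. That is why full-graph training that fits on an 80 GB A100 can still OOM on two A6000s.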
Thank you so much!