Testing on a huge graph. #4538
JiaruiWang asked this question in Q&A (unanswered)
Replies: 1 comment
-
You are right that one cannot do full-batch inference on the CPU on such a giant graph. The alternative is to also use neighbor sampling during inference. To reduce the variance due to sampling/dropout of edges, you can either try to sample all neighbors in the inference loader (num_neighbors=[-1] for every layer), or run inference layer-wise, computing the representations of all nodes one GNN layer at a time; see the sketches below.
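A minimal sketch of the first option, assuming a PyG Data object named `data` with a `test_mask` and an already trained two-layer GNN; the names `model`, `num_layers`, and the batch size are illustrative, not from the thread:

```python
import torch
from torch_geometric.loader import NeighborLoader

num_layers = 2  # must match the depth of the trained GNN

# num_neighbors=-1 keeps *every* neighbor at each hop, so there is no
# sampling variance; only the 2M test nodes act as seeds.
test_loader = NeighborLoader(
    data,                             # the full 60M-node Data object (assumed)
    num_neighbors=[-1] * num_layers,
    input_nodes=data.test_mask,
    batch_size=1024,
    shuffle=False,
)

@torch.no_grad()
def evaluate(model, loader, device):
    model.eval()
    correct = total = 0
    for batch in loader:
        batch = batch.to(device)
        out = model(batch.x, batch.edge_index)
        # Seed (target) nodes always come first in a NeighborLoader batch:
        pred = out[:batch.batch_size].argmax(dim=-1)
        correct += int((pred == batch.y[:batch.batch_size]).sum())
        total += batch.batch_size
    return correct / total
```

Note that with very high-degree nodes, keeping every neighbor across several hops can itself exhaust memory, which is where the layer-wise variant below comes in.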
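And a sketch of the second option, layer-wise inference, following the pattern used in PyG's large-graph examples: propagate over all nodes one layer at a time with exact 1-hop neighborhoods, so the multi-hop neighborhood explosion never materializes. The attribute names (`model.convs`, `data.x`) are assumptions about the model's structure:

```python
import torch
from torch_geometric.loader import NeighborLoader

@torch.no_grad()
def layerwise_inference(model, data, device, batch_size=4096):
    model.eval()
    # One-hop loader over *all* nodes, keeping every neighbor:
    loader = NeighborLoader(data, num_neighbors=[-1], input_nodes=None,
                            batch_size=batch_size, shuffle=False)
    x = data.x  # current representations, kept on the CPU
    for i, conv in enumerate(model.convs):  # assumes a `convs` ModuleList
        outs = []
        for batch in loader:
            h = x[batch.n_id].to(device)             # gather subgraph inputs
            h = conv(h, batch.edge_index.to(device))
            if i < len(model.convs) - 1:
                h = h.relu()
            outs.append(h[:batch.batch_size].cpu())  # keep seed-node rows only
        x = torch.cat(outs, dim=0)  # becomes the input of the next layer
    return x  # final logits/embeddings for all nodes
```

Since input_nodes=None and shuffle=False, the seed nodes of consecutive batches enumerate all nodes in order, so the concatenated outputs line up with the original node indexing.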
-
Original question (JiaruiWang):
My task is a node classification problem on a graph with 60M nodes: 20M nodes are labeled and 40M are unlabeled. Out of the 20M labeled nodes, I created a dataset with a 16M-node training mask, a 2M-node validation mask, and a 2M-node test mask.
I feed the 16M training nodes into NeighborLoader to generate the data loader for training, and the sampled subgraphs fit into GPU memory; a sketch of this setup follows.
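For reference, a minimal sketch of that training setup, assuming the 60M-node graph lives in a single PyG Data object called `data`; the fanouts, hidden size, and the choice of GraphSAGE are illustrative stand-ins, not details from the thread:

```python
import torch
import torch.nn.functional as F
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

device = torch.device('cuda')
num_classes = int(data.y.max()) + 1  # `data` is the 60M-node Data object (assumed)

model = GraphSAGE(in_channels=data.num_features, hidden_channels=128,
                  num_layers=2, out_channels=num_classes).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Seed only the 16M training nodes; each mini-batch is a small sampled
# subgraph that fits on the GPU.
train_loader = NeighborLoader(
    data,
    num_neighbors=[15, 10],       # per-layer fanout (illustrative)
    input_nodes=data.train_mask,
    batch_size=1024,
    shuffle=True,
)

model.train()
for batch in train_loader:
    batch = batch.to(device)
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # The loss is computed on the seed nodes only (the first batch_size rows):
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```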
In the evaluation step, however, all the examples pass the whole graph through the model and then take output[test_mask]. This causes a GPU OOM, and computing the evaluation on the CPU instead is too slow.
What's the best practice for this?
Thank you very much.