DataLoader generates subgraph contains only labeled nodes? #4520
Unanswered
JiaruiWang
asked this question in
Q&A
-
The sampling in For example if you do
Then each batch will have the first 5 nodes from the
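To make the point about seed nodes concrete, here is a minimal pure-Python sketch of seed-based neighbor sampling. It is a toy illustration, not PyG's actual sampler: the function `sample_batch`, the 6-node graph, and the labeled/unlabeled split are all hypothetical. It assumes the loader draws neighbors from the full graph, so unlabeled nodes can appear in a batch even when only labeled nodes are used as seeds.

```python
import random

def sample_batch(adj, seeds, num_neighbors, rng):
    """Toy seed-based neighbor sampling: seeds first, then sampled neighbors."""
    nodes = list(seeds)           # seed nodes occupy the first len(seeds) slots
    seen = set(seeds)
    frontier = list(seeds)
    for fanout in num_neighbors:  # one fanout entry per hop
        next_frontier = []
        for u in frontier:
            nbrs = adj.get(u, [])
            # Neighbors are drawn from the whole graph, labeled or not.
            picked = rng.sample(nbrs, min(fanout, len(nbrs)))
            for v in picked:
                if v not in seen:
                    seen.add(v)
                    nodes.append(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return nodes

# Hypothetical graph: nodes 0-2 are labeled, nodes 3-5 are unlabeled.
adj = {0: [3, 1], 1: [0, 4], 2: [5], 3: [0], 4: [1], 5: [2]}
labeled = {0, 1, 2}
train_seeds = [0, 2]              # a "batch" of two training nodes

batch = sample_batch(adj, train_seeds, num_neighbors=[2], rng=random.Random(0))
seed_part = batch[:len(train_seeds)]                  # nodes the loss would use
unlabeled_in_batch = [n for n in batch if n not in labeled]
```

In this sketch the loss would be computed only on `seed_part`, while the unlabeled neighbors still contribute their features through message passing. With `NeighborLoader`, the analogous slice is the first `batch.batch_size` nodes of each batch.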
-
Thank you for this amazing framework and all your hard work on this!
I have a question about the training subgraphs produced by the data loaders.
My task is a node classification problem on a graph with 60M nodes, of which 20M are labeled and 40M are unlabeled. From the 20M labeled nodes, I created a dataset with a training mask of 16M nodes, a validation mask of 2M nodes, and a test mask of 2M nodes.
If I feed the 16M-node training mask into NeighborLoader or GraphSAINTSampler, do they generate subgraphs that contain only labeled nodes? Or is each subgraph built from a labeled node and its neighbors, which may be either labeled or unlabeled?
If the subgraphs contain only labeled nodes, then the node features and messages from the unlabeled nodes are missing during training. What is the best practice in that case?
Thank you very much