GraphSAINT dataloader in Pytorch Lightning #2499
-
Hi all, first of all thanks for the amazing graph learning framework. I am trying to wrap a GraphSAINT random walk sampler + the Flickr dataset in PyTorch Lightning to enable smooth multi-GPU support. I followed the PyTorch Lightning convention with separate train/val/test dataloaders (GraphSAINT random walk), but unlike the NeighborSampler used in the PyTorch Lightning examples, this dataloader does not support node_idx to specify the mask and thus the split. My intuition would be to split the data object into train, validation, and test data objects. Would this approach be correct, and if so, are there any utils I have overlooked that streamline this splitting?

Additionally, I was wondering if there is a specific reason the NeighborSampler is favoured over the GraphSAINT sampler in the node property prediction examples in the OGB repo. Thanks a lot!
Replies: 1 comment 1 reply
-
You can split your `data` into inductive training, validation, and test (sub)graphs via `torch_geometric.utils.subgraph` directly in the `prepare_data` method and before initializing the `GraphSAINTSampler`. However, please note that GraphSAINT can only make use of sampling during training, in particular because nodes may be sampled more than once during a single epoch. For validation and testing, it's therefore best to operate on the complete graph.
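A minimal sketch of what this could look like, not an official recipe: the `FlickrDataModule` name and all hyperparameters below are made up for illustration, and the split lives in `setup` rather than `prepare_data`, since Lightning only propagates state assigned in `setup` to every process during multi-GPU training. Also note that `GraphSAINTRandomWalkSampler` moved from `torch_geometric.data` to `torch_geometric.loader` in later PyG releases, so adjust the import to your version.

```python
import pytorch_lightning as pl
from torch_geometric.datasets import Flickr
from torch_geometric.data import Data, DataLoader, GraphSAINTRandomWalkSampler
from torch_geometric.utils import subgraph


class FlickrDataModule(pl.LightningDataModule):  # hypothetical name
    def __init__(self, root='data/Flickr'):
        super().__init__()
        self.root = root

    def prepare_data(self):
        # Download only; state assigned here is not shared across GPUs.
        Flickr(self.root)

    def setup(self, stage=None):
        data = Flickr(self.root)[0]

        # Inductive training graph: keep only training nodes and the edges
        # between them, relabelling nodes so indices stay contiguous.
        train_idx = data.train_mask.nonzero(as_tuple=False).view(-1)
        edge_index, _ = subgraph(train_idx, data.edge_index,
                                 relabel_nodes=True,
                                 num_nodes=data.num_nodes)
        self.train_data = Data(x=data.x[train_idx], y=data.y[train_idx],
                               edge_index=edge_index)

        # Validation/test operate on the complete graph, using the masks.
        self.full_data = data

    def train_dataloader(self):
        # GraphSAINT sampling is only used for training.
        return GraphSAINTRandomWalkSampler(self.train_data, batch_size=6000,
                                           walk_length=2, num_steps=5,
                                           num_workers=2)

    def val_dataloader(self):
        # Full-batch evaluation: one "batch" is the entire graph.
        return DataLoader([self.full_data], batch_size=1)

    def test_dataloader(self):
        return DataLoader([self.full_data], batch_size=1)
```

In the Lightning module, the validation/test steps would then index the model output with `batch.val_mask` / `batch.test_mask`, while the training step consumes the GraphSAINT mini-batches directly.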