InMemoryDataset to create training set and test set respectively. #9434
Hi, I am using InMemoryDataset to create my own datasets for graph classification. Generally, a dataset is created through InMemoryDataset and then the training set and test set are split by a train ratio or slicing. My code is as shown below:
Now I need to specify the nodes for the training set and the test set, so I manually prepared a separate csv file for each. That is:

Now I need to create the training set and the test set with InMemoryDataset, respectively. I suppose the simplest way is to define two InMemoryDataset classes (one for the training set and another for the test set), but that seems like a stupid method, and I think there should be a more elegant way. Any suggestions are appreciated 🙏
Replies: 1 comment 1 reply
Why can't you simply do post-filtering and only keep one single dataset as you do in your example? I don't think you need to have two `InMemoryDataset` variants. Alternatively, you can specify a `split` attribute as part of your dataset class, and then load in the correct data. This is what we commonly do as well, e.g., here: https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/datasets/gdelt.py