-
I do not recommend using networkx for this. You can create your own dataset by subclassing InMemoryDataset:

    import torch
    from torch_geometric.data import InMemoryDataset

    class MyDataset(InMemoryDataset):
        def __init__(self, root, raw_filename, transform):
            self.raw_filename = raw_filename
            super().__init__(root, transform)
            # Load the collated dataset written by process().
            self.data, self.slices = torch.load(self.processed_paths[0])

        @property
        def processed_file_names(self):
            return 'data.pt'

        def process(self):
            data_list = []  # Read raw files and create data list out of them.
            torch.save(self.collate(data_list), self.processed_paths[0])
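The "read raw files" step in process() is left open above. As a rough sketch of what it might look like for the .npz case in the question, the snippet below loads node features and a dense adjacency matrix and converts the adjacency matrix into the [2, num_edges] edge list layout that PyG uses for edge_index. The key names x and adj, the helper adjacency_to_edge_index, and the random stand-in data are assumptions for illustration only; they are not part of the original answer or of PyG.

```python
import io

import numpy as np


def adjacency_to_edge_index(adj):
    """Convert a dense [N, N] adjacency matrix into a [2, E] edge list (hypothetical helper)."""
    src, dst = np.nonzero(adj)
    return np.stack([src, dst]).astype(np.int64)


# Stand-in for the asker's .npz file: 100 nodes with 3 features each.
# The key names "x" and "adj" are assumptions, not a PyG convention.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 3)).astype(np.float32)
upper = np.triu(rng.random((100, 100)) < 0.05, k=1)
adjacency = (upper | upper.T).astype(np.int64)  # symmetric, no self-loops

# Write to an in-memory buffer here; a real dataset would use a file under raw_dir.
buffer = io.BytesIO()
np.savez(buffer, x=features, adj=adjacency)
buffer.seek(0)

with np.load(buffer) as npz:
    x = npz["x"]
    edge_index = adjacency_to_edge_index(npz["adj"])

print(x.shape, edge_index.shape)
```

Inside process(), these arrays could then be converted with torch.from_numpy and wrapped as Data(x=..., edge_index=...) from torch_geometric.data, with each resulting object appended to data_list before the collate and save step.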
-
I'm trying to create my own graph datasets for a GNN model using PyG.

When I load the citeseer dataset, I can see the information as Data(edge_index=[2, 9104], test_mask=[3327], train_mask=[3327], val_mask=[3327], x=[3327, 3703], y=[3327]). There were also processed and raw folders that contain several files, e.g. ind.citeseer.allx, ind.citeseer.graph, and so on.

Suppose that I would like to make graph datasets with 100 nodes and 3 features at each node. I have the feature and node connection information as numpy arrays in an npz file.

How can I efficiently make my own datasets that are compatible with PyTorch Geometric? Is it possible to make the graph datasets using networkx, and do you recommend it? Then how can I save the graph datasets so that they can be used the way .npz files are used for image data?