Problems creating my own dataset & loading it (for graph-classification) #6074
Unanswered
elisagdelope
asked this question in
Q&A
Replies: 0 comments
I'm working on a graph classification problem, so I wanted to create a Dataset object from a list (e.g. `data_list`) of 578 Data() objects. This works fine when I save the list with `torch.save(data_list, self.processed_paths[0])` in the `process()` method: it is saved and loads as a list of Data() objects that I can access without problems. However, the PyG documentation suggests collating `data_list` (the list of Data() objects) and then saving the resulting data and slices.
When I do this, my dataset has length 1, which is unexpected: I would expect it to contain the 578 Data() objects. Surprisingly, dataset[0] is a single Data() object that contains the information from all 578 of my Data() objects. How do I access the individual Data() objects?
The ideal outcome when loading the dataset would be a dataset object containing 578 Data() objects, each representing a graph that I can access. I think I don't understand how slices work. I don't even know where they are stored or how to get them from the dataset object...
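To make the question about slices concrete, here is a minimal, library-free sketch of the general "collate plus slices" idea: all items are concatenated into one large object, and a list of offsets records where each item starts and ends. This is only an illustration under simplifying assumptions, not PyG's actual implementation; the helper names `collate`, `get` and `length` and the toy graphs (plain lists of node features) are hypothetical.

```python
# Library-free sketch of "collate + slices"; NOT PyG's actual implementation.
# Each toy "graph" is just a list of node features (an assumption for brevity).

def collate(graphs):
    """Concatenate all graphs into one list and record their boundaries."""
    data = []      # every node feature of every graph, back to back
    slices = [0]   # slices[i]:slices[i + 1] delimits graph i inside `data`
    for g in graphs:
        data.extend(g)
        slices.append(len(data))
    return data, slices


def get(data, slices, idx):
    """Cut graph number `idx` back out of the collated list."""
    return data[slices[idx]:slices[idx + 1]]


def length(slices):
    """The number of graphs comes from the offsets, not from `data` itself."""
    return len(slices) - 1


graphs = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
data, slices = collate(graphs)

print(data)                  # one flat list holding all three graphs
print(slices)                # boundaries of each graph inside `data`
print(length(slices))        # number of graphs recovered from the offsets
print(get(data, slices, 1))  # the second graph, reconstructed
```

In this toy version the collated object looks like one big graph when inspected on its own; the individual graphs only reappear once the offsets are used to cut it up again.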
This is how I create the dataset:
```python
import os.path as osp
from torch_geometric.data import Dataset, InMemoryDataset

class MyOmicsDataset(InMemoryDataset):
    ...  # class body not included in the post
```
And this is how I instantiate the dataset:
dataset = MyOmicsDataset(root="../data", X_file="rnaseq.csv", graph_file="ppi_score.csv", labels_file="labels.csv")