How to do mini-batching on the fly without the PyG dataloader? #4867
-
Suppose I am not using the PyG datasets/dataloader classes and instead have a regular dataloader with graphs stored as adjacency matrices. What I would like to accomplish is the following. Suppose I have these two batched sets of features/matrices, where I am training with `batch_size=32`. Can I batch these on the fly so that I can then simply pass them to `GCNConv`? I tried converting the `dense_adj_mats` with `dense_to_sparse()`, but this doesn't seem to work properly; I suspect something must also be done with `node_feats`.

Note: the reason I don't want to use the PyG datasets class is that this is largely an NLP problem with lots of other features I need to batch in a certain way. I obtain `node_feats` after lots of text transformations. Thus, the pipeline is already in place -- I'm hoping to just drop in `GCNConv` without restructuring everything. Thanks!
-
Yes, you can use `dense_to_sparse` for the adjacency matrix. For node features, simply reshaping to `(batch_size * num_nodes, num_features)` is sufficient.
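To make the index bookkeeping concrete, here is a minimal pure-Python sketch (no torch dependency) of what this batching does: node `i` of graph `b` becomes node `b * num_nodes + i` of one big disconnected graph, and flattening the feature tensor the same way keeps feature rows aligned with the edge indices. The function and variable names below are illustrative, not PyG API:

```python
def batched_dense_to_edge_list(dense_adj_mats):
    """dense_adj_mats: list of B adjacency matrices, each N x N (nested lists).
    Returns (src, dst) index lists with a per-graph node offset applied,
    mirroring what dense_to_sparse does on a [B, N, N] tensor."""
    src, dst = [], []
    for b, adj in enumerate(dense_adj_mats):
        offset = b * len(adj)  # nodes of graph b occupy [offset, offset + N)
        for i, row in enumerate(adj):
            for j, weight in enumerate(row):
                if weight != 0:
                    src.append(offset + i)
                    dst.append(offset + j)
    return src, dst


def flatten_node_feats(node_feats):
    """node_feats: list of B feature matrices, each N x F -> one (B*N) x F list,
    so that row b*N + i holds the features of node i in graph b."""
    return [row for graph in node_feats for row in graph]
```

With actual tensors, the equivalent would be roughly `edge_index, _ = dense_to_sparse(dense_adj_mats)` followed by `x = node_feats.reshape(-1, num_features)`, and then `out = conv(x, edge_index)`.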