How to structure multiple sequences of graphs #3048
Hi, I have a dataset which consists of sequences of graphs that describe the state of different systems evolving over time. Let's assume that the edge structure is static, that the number of nodes in each sequence is constant, and that only the node features change from time step to time step. What would be a suitable way of handling the Data() objects and creating a PyTorch Geometric Dataset? A simple approach would be to create a Data() object for each graph at each time step, and then set shuffle to False when creating the dataloader. This might lead to batching issues, however, and it seems inflexible. Alternatively, since I'm (currently at least) assuming a static edge structure, I can create a single Data() object for each sequence, and then simply have the time step as an additional dimension of the node attributes. What is the suggested approach to handling multiple sequences of graphs while retaining the flexibility of the Dataset and DataLoader classes?
Replies: 1 comment 6 replies
If your edge structure is static and only the node features evolve over time, you can save your node features as [num_nodes, num_timestamps, num_features] inside Data (as you suggested). This should support mini-batching out of the box. For edge structures, the current DataLoader sadly cannot handle sequences, which I want to fix in an upcoming release. This should allow you to do the following:

    data.edge_index = [edge_index_1, edge_index_2, ...]

There also exists PyTorch Geometric Temporal, which might already fit your use case.