How to handle multiple StaticHeteroGraphs? #7206
Unanswered
MicheleSodano asked this question in Q&A
Replies: 1 comment
-
I don't have any experience with PyTorch Geometric Temporal, so I am not really confident in giving a good answer here. Your solution sounds solid, and
-
I have constructed several StaticHeteroGraphs using NetworkX, which I then converted into HeteroData objects using StaticHeteroGraphTemporalSignal from torch_geometric_temporal. Each HeteroData represents a snapshot of the StaticHeteroGraph at one time-step. In each StaticHeteroGraph, nodes correspond to specific components of a building (such as Walls, Windows, Floors, and Rooms), each node type with a different number of features. While the number of nodes can vary between buildings, the node types remain constant. To train all the StaticHeteroGraphs together, I have created "dummy nodes" with zero-padded features (sized to each node type's feature dimension) to account for missing nodes. Essentially, all StaticHeteroGraphs are subgraphs of the same "big Graph" containing all possible nodes.
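For context, the zero-padding scheme described above could be sketched roughly as follows. The feature dimensions and maximum node counts here are made-up placeholders, not values from the post:

```python
import torch

# Hypothetical per-type feature sizes and "big Graph" node counts (assumptions)
FEATURE_DIMS = {"Wall": 8, "Window": 5, "Floor": 6, "Room": 10}
MAX_NODES = {"Wall": 12, "Window": 20, "Floor": 4, "Room": 6}

def pad_node_features(x_dict):
    """Zero-pad each node type up to the global maximum so every
    building shares the same node layout (dummy nodes are all-zero rows)."""
    padded = {}
    for ntype, max_n in MAX_NODES.items():
        feat_dim = FEATURE_DIMS[ntype]
        x = x_dict.get(ntype, torch.empty(0, feat_dim))
        pad_rows = max_n - x.size(0)
        padded[ntype] = torch.cat([x, torch.zeros(pad_rows, feat_dim)], dim=0)
    return padded

# Example: a building with fewer Walls than the maximum, and no Windows at all
x_dict = {"Wall": torch.randn(7, 8), "Room": torch.randn(6, 10)}
padded = pad_node_features(x_dict)
assert padded["Wall"].shape == (12, 8)  # 7 real rows + 5 dummy rows
```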
The edge_index_dict may vary from one StaticHeteroGraph to another, but it is the same for all snapshots within a specific StaticHeteroGraph. Additionally, some edges have attributes, so each StaticHeteroGraph has an edge_attr_dict that defines the attributes for specific edge types in the edge_index_dict, which may also be empty.
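To make the setup concrete, the two dictionaries might look like this (edge types, shapes, and values are illustrative assumptions, not taken from the actual data):

```python
import torch

# edge_index_dict: one (2, num_edges) index tensor per edge type
edge_index_dict = {
    ("Wall", "adjacent", "Room"): torch.tensor([[0, 1], [0, 0]]),
    ("Window", "in", "Wall"):     torch.tensor([[0], [1]]),
}
# edge_attr_dict: only some edge types carry attributes; the rest are absent
edge_attr_dict = {
    ("Wall", "adjacent", "Room"): torch.tensor([[0.3], [1.2]]),
}
```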
My goal is to predict the target value of only one node type in the x_dict (in this case, the target of the "Room" component). I want to use GNN methods, like SAGEConv, to generate the embedding of nodes in each snapshot, concatenate the node embeddings from all snapshots, and use a 1D ResNet to predict the target value of a specific node type. I aim to use HeteroConv to apply two SAGEConv layers to generate node embeddings in each snapshot, considering both edge_index_dict and edge_attr_dict.
The issue is that, as previously mentioned, edge_attr_dict specifies attributes only for certain edge types. Additionally, I want to exclude the "dummy nodes" when updating features, so that after the two SAGEConv layers the result is a dictionary giving a tensor of shape (n_nodes, updated_features) for node_type['Room'], where n_nodes counts only the nodes whose input features in the x_dict are not all zeros, and updated_features is the out_channels for the Room node type.
I have managed to achieve this for a single StaticHeteroGraph by creating a mask_dict to remove "dummy" nodes from the x_dict, and creating a new_edge_index_dict based on a new index_mapping. However, looping through each snapshot is slow due to the number of snapshots (8760 time-steps) in each StaticHeteroGraph, and the total number of StaticHeteroGraphs is 4000.
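The mask_dict plus re-indexing step described above could be sketched like this (assuming, as in the post, that a node is "real" iff its features are not all zeros; a real node with all-zero features would need an explicit mask instead):

```python
import torch

def mask_and_reindex(x_dict, edge_index_dict):
    """Drop zero-padded dummy nodes and remap edge indices to the
    compacted node numbering."""
    mask_dict = {nt: x.abs().sum(dim=1) > 0 for nt, x in x_dict.items()}
    new_x = {nt: x[mask_dict[nt]] for nt, x in x_dict.items()}
    # old index -> new index (-1 marks removed dummy nodes)
    remap = {}
    for nt, mask in mask_dict.items():
        idx = torch.full((mask.size(0),), -1, dtype=torch.long)
        idx[mask] = torch.arange(int(mask.sum()))
        remap[nt] = idx
    new_edges = {}
    for (src, rel, dst), ei in edge_index_dict.items():
        # keep only edges whose endpoints are both real nodes
        keep = mask_dict[src][ei[0]] & mask_dict[dst][ei[1]]
        ei = ei[:, keep]
        new_edges[(src, rel, dst)] = torch.stack(
            [remap[src][ei[0]], remap[dst][ei[1]]])
    return new_x, new_edges
```

Since the edge_index_dict is identical for all 8,760 snapshots of a building, this remapping only needs to run once per building rather than once per snapshot, which removes most of the per-snapshot loop cost.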
I have two questions:
(1) How can I create a dataloader that generates batches to train over multiple snapshots and multiple StaticHeteroGraphs?
(2) I am considering whether I can train the model over several StaticHeteroGraphs with varying numbers of nodes but the same edge types. Is this possible, or do I need to maintain the same structure? Can I apply masking and re-indexing while creating the StaticHeteroGraphs, without adding many "dummy nodes"?