Hello everyone :) Over the past week I've been chasing an error in my machine-learning process, and I'm starting to realise that I could be dealing with a fundamental problem that might not be solvable with GNNs. The problem is that the graphs (HeteroData objects) I'm training and validating on do not necessarily share the same metadata. They all draw from the same superset of possible node types and edge types, but each graph's actual metadata is only a random subset of it. I thought that training my model on graphs whose metadata equals the full set of possible node and edge types might solve my problem. But as soon as I try validating on a graph that has only a subset of those types as metadata, I get the following error:
I assume that what's happening here is that the model expects input for a certain edge type, but there is no entry for it in one of the dicts I'm passing to the model, so it just receives None as input. I could also try always defining the full superset of possible node types and edge types as metadata on all graphs and leaving empty tensors for unused edges/nodes. I'm not sure if that would work, though, as I have already run into an error when one node attribute tensor happened to be empty.
In this example, the HeteroData object holding the validation graph was keeping track of a certain node type, but the node store for that node type was just an empty tensor. For my model, I used the to_hetero() function to convert it to a model that can handle heterogeneous data. Everything seems to work fine as long as the two graphs share the same metadata. Am I missing something, am I doing something wrong, or am I just hitting a dead end here? If there are any code snippets or information about my HeteroData objects I should supply, please let me know; I'll do my best to provide whatever you need as fast as possible. I appreciate the help and thank you in advance. :)
Replies: 1 comment 7 replies
The workaround of filling your data with empty node and edge types is the approach I would recommend for this. We are also working on batching with dynamic node and edge types, but this is still WIP. Note that you will need to define your tensors with the correct shape for this to work, e.g.,
data[node_type].x = torch.empty((0, num_features))
data[edge_type].edge_index = torch.empty((2, 0), dtype=torch.long)
Note that shapes need to match except for the node and edge dimension.