Validating on graphs with absent edge types #4566

e-mauss · 2022-04-29T12:11:50Z

Hello everyone :)

Over the course of the past week I've been chasing an error in my machine-learning pipeline, and I'm starting to realise that I could actually be dealing with a fundamental problem that might not be solvable with GNNs.

The problem I'm facing is that the graphs (HeteroData objects) I'm training and validating on do not necessarily share the same metadata. They do share the same superset of possible node types and edge types, but each graph's actual metadata is only a random subset of that.

I thought that training my model on graphs whose metadata equals the full set of possible node and edge types might solve my problem. But as soon as I try validating on a graph that has only a subset of those types as metadata, I get the following error:
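
One way to obtain the full superset programmatically is to merge the metadata of every graph in the dataset. The following is a minimal sketch, assuming each entry is the `(node_types, edge_types)` pair that `HeteroData.metadata()` returns. `union_metadata` is a hypothetical helper, not part of PyG, and the type names in the usage lines are only illustrative.

```python
def union_metadata(metadatas):
    """Merge several (node_types, edge_types) pairs into one superset.

    Hypothetical helper: `metadatas` is any iterable of pairs shaped like
    the return value of HeteroData.metadata().
    """
    node_types, edge_types = set(), set()
    for nodes, edges in metadatas:
        node_types.update(nodes)
        # Edge types are (source, relation, destination) triples.
        edge_types.update(tuple(edge) for edge in edges)
    # Sort so the result is deterministic across runs.
    return sorted(node_types), sorted(edge_types)


graph_a = (['slack_bus', 'measured_bus'],
           [('measured_bus', 'line', 'measured_bus')])
graph_b = (['measured_bus', 'unmeasured_bus'],
           [('measured_bus', 'line', 'unmeasured_bus')])

full_metadata = union_metadata([graph_a, graph_b])
print(full_metadata)
```

The merged pair could then be handed to `to_hetero()` in place of the metadata of any single graph.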

Traceback (most recent call last):
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 281, in <module>
    main()
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 71, in main
    train_stats = train(model, train_loader, val_loader, loss_fn, optimizer, scheduler, config)
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 179, in train
    val_stats = evaluate(model, val_loader, loss_fn, config, epoch)
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 223, in evaluate
    preds = model(batch.x_dict, batch.edge_index_dict, batch.edge_attr_dict)
  File "D:\Users\emauss\git\environment_indigo_v19_pytorch\lib\site-packages\torch\fx\graph_module.py", line 616, in wrapped_call
    raise e.with_traceback(None)
TypeError: linear(): argument 'input' (position 1) must be Tensor, not NoneType

I assume that what's happening here is that the model expects input for a certain edge type, but there is no entry for it in one of the dicts I'm passing to the model, so it just receives None as input.

I could also try always defining the full superset of possible node types and edge types as metadata on all graphs and leaving empty tensors for unused edges/nodes. I'm not sure that would work though, as I have already hit an error when one node attribute tensor happened to be empty:
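
A quick way to test that assumption is to compare the batch dictionaries against the metadata the model was converted with before calling the model. This is only a sketch: `missing_keys` is a hypothetical helper, and the plain dicts below stand in for `batch.x_dict` and `batch.edge_index_dict`.

```python
def missing_keys(metadata, x_dict, edge_index_dict):
    """Return the node and edge types the model expects but the batch lacks.

    Hypothetical helper: `metadata` is the (node_types, edge_types) pair
    that was passed to to_hetero().
    """
    node_types, edge_types = metadata
    missing_nodes = [t for t in node_types if t not in x_dict]
    missing_edges = [tuple(t) for t in edge_types
                     if tuple(t) not in edge_index_dict]
    return missing_nodes, missing_edges


# Illustrative stand-ins; the values would normally be tensors.
metadata = (['slack_bus', 'measured_bus'],
            [('slack_bus', 'transformer', 'measured_bus'),
             ('measured_bus', 'line', 'measured_bus')])
x_dict = {'measured_bus': 'features'}
edge_index_dict = {('measured_bus', 'line', 'measured_bus'): 'indices'}

nodes, edges = missing_keys(metadata, x_dict, edge_index_dict)
print('missing node types:', nodes)
print('missing edge types:', edges)
```

Any type reported here is one for which the converted model would find no entry in the batch.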

Traceback (most recent call last):
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 281, in <module>
    main()
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 71, in main
    train_stats = train(model, train_loader, val_loader, loss_fn, optimizer, scheduler, config)
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 179, in train
    val_stats = evaluate(model, val_loader, loss_fn, config, epoch)
  File "D:/Users/emauss/git/ai4grids_jasmin/train_model.py", line 223, in evaluate
    preds = model(batch.x_dict, batch.edge_index_dict, batch.edge_attr_dict)
  File "D:\Users\emauss\git\environment_indigo_v19_pytorch\lib\site-packages\torch\fx\graph_module.py", line 616, in wrapped_call
    raise e.with_traceback(None)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x0 and 1x6)

In this example the HeteroData object holding the validation graph was keeping track of a certain node type, but the node store for that node type held only empty tensors:
{'x': tensor([]), 'y': tensor([], size=(0, 2)), 'batch': tensor([], dtype=torch.int64), 'ptr': tensor([0, 0])}
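
Note that `'x': tensor([])` is one-dimensional with shape `(0,)`, whereas `'y'` keeps its feature dimension with shape `(0, 2)`. The sketch below, which uses only plain PyTorch, illustrates why that difference matters to a linear layer. The layer sizes (1 input feature, 6 outputs) are an assumption picked to mirror the `1x0 and 1x6` in the error message; they are not taken from the actual model.

```python
import torch

# Hypothetical encoder: 1 input feature per node, 6 output channels.
lin = torch.nn.Linear(1, 6)

# Zero nodes, but the feature dimension is kept: this passes through.
good = torch.empty((0, 1))
out = lin(good)
print(out.shape)

# Zero nodes and no feature dimension, like tensor([]) above.
bad = torch.tensor([])
try:
    lin(bad)
    failed = False
except RuntimeError as err:
    failed = True
    print('RuntimeError:', err)
```

The first call returns an empty `(0, 6)` output, while the second raises a shape-mismatch error because the layer cannot find the expected feature dimension.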

For my model, I used the to_hetero() function to convert it to a model that can handle heterogeneous data. Everything seems to work fine as long as the two graphs share the same metadata. Am I missing something, am I doing something wrong, or am I just hitting a dead end here?

If there are any code snippets or information about my HeteroData objects I should supply, please let me know; I'll do my best to provide whatever you need as fast as possible.

I appreciate the help and thank you in advance. :)
Erik

Answered by rusty1s

rusty1s · 2022-04-29T13:13:33Z

rusty1s
Apr 29, 2022
Maintainer

The workaround of filling your data with empty node and edge types is the approach I would recommend for this. We are also working on batching with dynamic node and edge types, but this is still WIP.

Note that you will need to define your tensors with correct shape for this to work, e.g.,

data[node_type].x = torch.empty((0, num_features))
data[edge_type].edge_index = torch.empty((2, 0), dtype=torch.long)

Note that shapes need to match except for the node and edge dimension.

7 replies

e-mauss May 4, 2022
Author

For now I have fallen back to adding dummy nodes so that I won't have any empty node attribute tensors going into my model. It fixes the training and validation process, but it is not an optimal solution, as it interferes with how I'm processing and plotting my model's predictions.

Again, I feel like I'm missing something very obvious. Below is the model I'm using. Maybe something is wrong there? As I said, I'm using the to_hetero() method to prepare it for my HeteroData batches.

The model I'm using:

from torch import nn as tnn
from torch_geometric.nn import Linear, GENConv


class PowerflowNet(tnn.Module):
    def __init__(self, hidden_channels, out_channels, n_layers, norm):
        super(PowerflowNet, self).__init__()
        self.node_encoder = Linear(-1, hidden_channels)
        self.edge_encoder = Linear(-1, hidden_channels)
        self.lin = Linear(hidden_channels, out_channels)
        self.n_layers = n_layers

        self.convs = tnn.ModuleList()
        self.acts = tnn.ModuleList()
        self.norms = tnn.ModuleList()
        for layer in range(n_layers):
            self.convs.append(GENConv(hidden_channels, hidden_channels, aggr='add', norm=norm))
            self.acts.append(tnn.ReLU(inplace=True))
            self.norms.append(tnn.LayerNorm(hidden_channels, elementwise_affine=True))

    def forward(self, x, edge_index, edge_attr):

        n_layers = self.n_layers

        x = self.node_encoder(x)
        edge_attr = self.edge_encoder(edge_attr)

        for layer in range(n_layers):
            x = self.convs[layer](x, edge_index, edge_attr)
            x = self.norms[layer](x)
            x = self.acts[layer](x)

        x = self.lin(x)

        return x

I'd still appreciate any help I can get. :)
If there are any code snippets or further information you need, just let me know and I'll be happy to provide them as fast as possible.
Erik

rusty1s May 5, 2022
Maintainer

Do you have a minimal example to reproduce? I'm happy to look into this.

e-mauss May 9, 2022
Author

Hey, I've created an example for you.

from itertools import product
import torch
from torch import nn as tnn
from torch_geometric.data import HeteroData
from torch_geometric.loader import DataLoader
from torch_geometric.nn import Linear, GENConv, to_hetero


def build_data(create_working=False):
    hetero_data = HeteroData()
    hetero_data['slack_bus'].x = torch.tensor([[0.0122, 0.0072]])
    hetero_data['measured_bus'].x = torch.tensor([[-2.0514e-03, -1.2526e-03, 0.0000e+00, 0.0000e+00],
                                                  [-6.4634e-04, -4.2826e-04, 0.0000e+00, 0.0000e+00],
                                                  [-1.2927e-03, -8.5653e-04, 0.0000e+00, 0.0000e+00],
                                                  [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
                                                  [-2.0514e-03, -1.2526e-03, 0.0000e+00, 0.0000e+00],
                                                  [-4.8476e-04, -3.2126e-04, 0.0000e+00, 0.0000e+00],
                                                  [-4.3958e-04, -2.6849e-04, 0.0000e+00, 0.0000e+00],
                                                  [-1.2310e-04, -4.1045e-05, 0.0000e+00, 0.0000e+00],
                                                  [-5.8610e-04, -3.5792e-04, 0.0000e+00, 0.0000e+00],
                                                  [-8.7916e-04, -5.3676e-04, 0.0000e+00, 0.0000e+00],
                                                  [-1.8539e-04, 1.6844e-05, 0.0000e+00, 0.0000e+00],
                                                  [-1.9390e-03, -1.2848e-03, 0.0000e+00, 0.0000e+00],
                                                  [-2.1008e-04, -2.7995e-05, 0.0000e+00, 0.0000e+00],
                                                  [-8.0793e-04, -5.3526e-04, 0.0000e+00, 0.0000e+00]])
    if not create_working:
        hetero_data['unmeasured_bus'].x = torch.zeros([0, 1])
    else:
        hetero_data['unmeasured_bus'].x = torch.zeros([1, 1])

    node_types = ['slack_bus', 'measured_bus', 'unmeasured_bus']
    connection_types = ['transformer', 'line']
    for n1, e, n2 in product(node_types, connection_types, node_types):
        hetero_data[n1, e, n2].edge_index = []
        hetero_data[n1, e, n2].edge_attr = []

    hetero_data[('measured_bus', 'transformer', 'slack_bus')].edge_index = [[3], [0]]
    hetero_data[('slack_bus', 'transformer', 'measured_bus')].edge_index = [[0], [3]]
    hetero_data[('measured_bus', 'line', 'measured_bus')].edge_index = [
        [9, 2, 13, 11, 6, 3, 8, 1, 7, 10, 10, 9, 3, 7, 11, 6, 5, 13,
         3, 0, 1, 3, 12, 8, 4, 5],
        [2, 9, 11, 13, 3, 6, 1, 8, 10, 7, 9, 10, 7, 3, 6, 11, 13, 5,
         0, 3, 3, 1, 8, 12, 5, 4]]

    hetero_data[('measured_bus', 'transformer', 'slack_bus')].edge_attr = [[2.3254e-01, 9.1792e-02, 4.5999e-04, -3.8489e-06]]
    hetero_data['slack_bus', 'transformer', 'measured_bus'].edge_attr = [[2.3254e-01, 9.1792e-02, 4.5999e-04, -3.8489e-06]]
    hetero_data[('measured_bus', 'line', 'measured_bus')].edge_attr = [[2.8031e-02, 7.2044e-02, 0.0000e+00, 2.3266e-06],
                                                                       [2.8031e-02, 7.2044e-02, 0.0000e+00, 2.3266e-06],
                                                                       [2.6930e-02, 6.9214e-02, 0.0000e+00, 2.2352e-06],
                                                                       [2.6930e-02, 6.9214e-02, 0.0000e+00, 2.2352e-06],
                                                                       [2.5040e-02, 6.4354e-02, 0.0000e+00, 2.0783e-06],
                                                                       [2.5040e-02, 6.4354e-02, 0.0000e+00, 2.0783e-06],
                                                                       [8.9905e-03, 2.3107e-02, 0.0000e+00, 7.4621e-07],
                                                                       [8.9905e-03, 2.3107e-02, 0.0000e+00, 7.4621e-07],
                                                                       [8.0873e-03, 2.0785e-02, 0.0000e+00, 6.7125e-07],
                                                                       [8.0873e-03, 2.0785e-02, 0.0000e+00, 6.7125e-07],
                                                                       [1.2449e-02, 3.1995e-02, 0.0000e+00, 1.0332e-06],
                                                                       [1.2449e-02, 3.1995e-02, 0.0000e+00, 1.0332e-06],
                                                                       [2.5842e-03, 6.6417e-03, 0.0000e+00, 2.1449e-07],
                                                                       [2.5842e-03, 6.6417e-03, 0.0000e+00, 2.1449e-07],
                                                                       [1.0792e-03, 2.7736e-03, 0.0000e+00, 8.9571e-08],
                                                                       [1.0792e-03, 2.7736e-03, 0.0000e+00, 8.9571e-08],
                                                                       [6.8972e-02, 1.7726e-01, 0.0000e+00, 5.7247e-06],
                                                                       [6.8972e-02, 1.7726e-01, 0.0000e+00, 5.7247e-06],
                                                                       [6.6601e-02, 1.7117e-01, 0.0000e+00, 5.5279e-06],
                                                                       [6.6601e-02, 1.7117e-01, 0.0000e+00, 5.5279e-06],
                                                                       [8.1466e-03, 2.0938e-02, 0.0000e+00, 6.7617e-07],
                                                                       [8.1466e-03, 2.0938e-02, 0.0000e+00, 6.7617e-07],
                                                                       [2.3130e-02, 5.9446e-02, 0.0000e+00, 1.9198e-06],
                                                                       [2.3130e-02, 5.9446e-02, 0.0000e+00, 1.9198e-06],
                                                                       [1.2981e-03, 3.3364e-03, 0.0000e+00, 1.0775e-07],
                                                                       [1.2981e-03, 3.3364e-03, 0.0000e+00, 1.0775e-07]]
    for edge in hetero_data.edge_types:
        idx = hetero_data[edge].edge_index
        attr = hetero_data[edge].edge_attr
        if len(idx) == 0:
            hetero_data[edge].edge_index = torch.empty((2, 0), dtype=torch.long)
        else:
            hetero_data[edge].edge_index = torch.LongTensor(idx)

        if len(attr) == 0:
            hetero_data[edge].edge_attr = torch.empty((0, 4), dtype=torch.float)
        else:
            hetero_data[edge].edge_attr = torch.FloatTensor(attr)

    return hetero_data


class PowerflowNet(tnn.Module):
    def __init__(self, hidden_channels, out_channels, n_layers, norm):
        super(PowerflowNet, self).__init__()
        self.node_encoder = Linear(-1, hidden_channels)
        self.edge_encoder = Linear(-1, hidden_channels)
        self.lin = Linear(hidden_channels, out_channels)
        self.n_layers = n_layers

        self.convs = tnn.ModuleList()
        self.acts = tnn.ModuleList()
        self.norms = tnn.ModuleList()
        for layer in range(n_layers):
            self.convs.append(GENConv(hidden_channels, hidden_channels, aggr='add', norm=norm))
            self.acts.append(tnn.ReLU(inplace=True))
            self.norms.append(tnn.LayerNorm(hidden_channels, elementwise_affine=True))

    def forward(self, x, edge_index, edge_attr):

        n_layers = self.n_layers

        x = self.node_encoder(x)
        edge_attr = self.edge_encoder(edge_attr)

        for layer in range(n_layers):
            x = self.convs[layer](x, edge_index, edge_attr)
            x = self.norms[layer](x)
            x = self.acts[layer](x)

        x = self.lin(x)

        return x


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

data = build_data()

model = PowerflowNet(12, 2, 8, norm='layer')
model = to_hetero(model, data.metadata(), aggr='sum').to(device)

loader = DataLoader([data])

batch = next(iter(loader)).to(device)
model(batch.x_dict, batch.edge_index_dict, batch.edge_attr_dict)

print('done')

This should create an error message:

Traceback (most recent call last):
  File "D:/Users/emauss/git/gnn_se/example.py", line 132, in <module>
    model(batch.x_dict, batch.edge_index_dict, batch.edge_attr_dict)
  File "D:\Users\emauss\git\environment_indigo_v19_pytorch\lib\site-packages\torch\fx\graph_module.py", line 616, in wrapped_call
    raise e.with_traceback(None)
RuntimeError: output with shape [1, 12] doesn't match the broadcast shape [0, 12]

This is not the same error message I kept receiving earlier (The size of tensor a (<batch_size>) must match the size of tensor b (0) at non-singleton dimension 0.); I was not able to reproduce that one and I'm not sure why. But it seems to me that both errors originate from the same problem: an empty node attribute tensor.

If you call build_data(create_working=True) instead, it should create a working example.

If you need anything more feel free to let me know.

rusty1s May 11, 2022
Maintainer

The code runs through for me. Can you check that you are on the latest PyG?

pip install --upgrade torch-geometric
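
To see which version is currently installed before or after upgrading, the standard library can query the package metadata. This is a minimal sketch; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None if the package is not installed.
print(installed_version('torch-geometric'))
```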

e-mauss May 17, 2022
Author

It seems to be working with the latest version. Thank you for helping me out! ^^
