I am trying to test the custom convolutional layer presented in the PyTorch Geometric documentation: I made two similar models, one using the built-in GCNConv from PyTorch Geometric, the other using a custom conv based on their documentation. I'm testing these two models on the GNNBenchmarkDataset 'PATTERN' (node-level prediction). The built-in version works well on the validation set, but the custom one does not work at all (no positives predicted). The settings for the two tests are identical (I just copied the code and changed the model name). Am I missing something here? As additional information, I checked and the two models have the same number of parameters.

Built-in model:

```python
import torch
from torch.nn import Linear
from torch_geometric.nn import GCNConv


class Gcn(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        # Init parent
        super(Gcn, self).__init__()
        torch.manual_seed(42)
        self.learning_rate = 0.01
        # GCN layers
        self.initial_conv = GCNConv(input_dim, hidden_dim)
        self.conv1 = GCNConv(hidden_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.conv3 = GCNConv(hidden_dim, hidden_dim)
        # Output layer
        self.out = Linear(hidden_dim, output_dim)

    def forward(self, data):
        x, edge_index, batch_index = data.x, data.edge_index, data.batch
        # First conv layer
        hidden = self.initial_conv(x, edge_index)
        hidden = torch.tanh(hidden)
        # Other conv layers
        hidden = self.conv1(hidden, edge_index)
        hidden = torch.tanh(hidden)
        hidden = self.conv2(hidden, edge_index)
        hidden = torch.tanh(hidden)
        hidden = self.conv3(hidden, edge_index)
        hidden = torch.tanh(hidden)
        # Apply a final (linear) classifier.
        out = self.out(hidden)
        return out
```

Custom model:

```python
# Same imports as above; GcnConv is the custom layer defined below.
class Gcn(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        # Init parent
        super(Gcn, self).__init__()
        torch.manual_seed(42)
        self.learning_rate = 0.01
        # GCN layers
        self.initial_conv = GcnConv(input_dim, hidden_dim)
        self.conv1 = GcnConv(hidden_dim, hidden_dim)
        self.conv2 = GcnConv(hidden_dim, hidden_dim)
        self.conv3 = GcnConv(hidden_dim, hidden_dim)
        # Output layer
        self.out = Linear(hidden_dim, output_dim)

    def forward(self, data):
        x, edge_index, batch_index = data.x, data.edge_index, data.batch
        # First conv layer
        hidden = self.initial_conv(x, edge_index)
        hidden = torch.tanh(hidden)
        # Other conv layers
        hidden = self.conv1(hidden, edge_index)
        hidden = torch.tanh(hidden)
        hidden = self.conv2(hidden, edge_index)
        hidden = torch.tanh(hidden)
        hidden = self.conv3(hidden, edge_index)
        hidden = torch.tanh(hidden)
        # Apply a final (linear) classifier.
        out = self.out(hidden)
        return out
```
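Neither the training loop nor the dataset code was included in the post; the following is a minimal sketch of the kind of shared setup described above (hidden size, batch size, and optimizer are assumptions, not from the original):

```python
import torch
from torch_geometric.datasets import GNNBenchmarkDataset
from torch_geometric.loader import DataLoader

# PATTERN is a node-level binary classification benchmark.
train_set = GNNBenchmarkDataset(root='data', name='PATTERN', split='train')

model = Gcn(train_set.num_features, 64, train_set.num_classes)  # hidden_dim=64 is an assumption
optimizer = torch.optim.Adam(model.parameters(), lr=model.learning_rate)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for batch in DataLoader(train_set, batch_size=32, shuffle=True):
    optimizer.zero_grad()
    out = model(batch)              # [num_nodes_in_batch, num_classes]
    loss = criterion(out, batch.y)  # node-level targets
    loss.backward()
    optimizer.step()
```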
The custom conv layer:

```python
import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree


class GcnConv(MessagePassing):
    """Custom GCN layer following
    https://pytorch-geometric.readthedocs.io/en/latest/notes/create_gnn.html
    """

    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')  # "Add" aggregation (Step 5).
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix.
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Steps 4-5: Start propagating messages.
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):
        # x_j has shape [E, out_channels]
        # Step 4: Normalize node features.
        return norm.view(-1, 1) * x_j
```

EDIT: Knowing all this, the custom conv should look more like this:
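A minimal sketch, assuming the fix is the one discussed in the reply below: make the linear transform bias-free and add a separate bias exactly once, after aggregation:

```python
import torch
from torch.nn import Parameter
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree


class GcnConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')
        # No bias here: it would otherwise be aggregated across neighbors.
        self.lin = torch.nn.Linear(in_channels, out_channels, bias=False)
        self.bias = Parameter(torch.zeros(out_channels))

    def forward(self, x, edge_index):
        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix (without bias).
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Steps 4-5: Propagate messages, then apply the bias exactly once.
        out = self.propagate(edge_index, x=x, norm=norm)
        return out + self.bias

    def message(self, x_j, norm):
        # Normalize the (already transformed) neighbor features.
        return norm.view(-1, 1) * x_j
```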
Reply:
Yes, it looks like applying the bias beforehand leads to worse results, since it will get additionally aggregated across neighbors instead of being applied once. Do you mind fixing this in our documentation? :)
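To make the effect concrete: with add aggregation, node i receives ∑_j norm_ij · (W x_j + b), so the bias ends up scaled by ∑_j norm_ij (which varies per node) instead of being added once. A quick sketch with made-up shapes and values:

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4)               # weight of the linear layer (hypothetical)
b = torch.randn(4)                  # its bias
x = torch.randn(3, 4)               # features of three neighbors (hypothetical)
norm = torch.tensor([0.5, 0.3, 0.2])  # per-edge normalization coefficients

# Bias applied before aggregation: it gets (weighted-)summed across neighbors.
before = (norm.view(-1, 1) * (x @ W.t() + b)).sum(dim=0)

# Bias applied once, after aggregation.
after = (norm.view(-1, 1) * (x @ W.t())).sum(dim=0) + b

# The two differ by (norm.sum() - 1) * b, which is nonzero in general.
print(torch.allclose(before - after, (norm.sum() - 1) * b))  # True
```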