Loss Remains the same during training a GCN #2829

riskiem · 2021-07-07T14:14:21Z

riskiem
Jul 7, 2021

I am trying to run a GCN for node regression on a 1000 node network, which I created as follows:

import random

import numpy as np
import torch

labels = []
N = 1000
nodes = range(0, N)
node_features = torch.randn(N, 3)
edge_features = []
capital = torch.randn(N)

for node in nodes:
    if random.random() > 0.5:
        # Label 1: up to N/5 neighbors, edge features in [2/3, 1).
        nb_nbrs = int(random.random() * (N / 5))
        edge_features += [(random.random() + 2) / 3.] * nb_nbrs
        labels.append(1)
    else:
        # Label 0: between 1 and 10 neighbors, edge features in [0, 1).
        nb_nbrs = int(random.random() * 10 + 1)
        edge_features += [random.random()] * nb_nbrs
        labels.append(0)

    # Sample the neighbors and build this node's block of edges (shape 2 x nb_nbrs).
    nbrs = np.random.choice(nodes, size=nb_nbrs)
    nbrs = nbrs.reshape((1, nb_nbrs))

    node_edges = np.concatenate([np.ones((1, nb_nbrs), dtype=np.int32) * node, nbrs], axis=0)

    if node == 0:
        edges = node_edges
    else:
        edges = np.concatenate([edges, node_edges], axis=1)
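
The loop above grows `edges` by concatenating inside every iteration. As an optional alternative, here is a small sketch that collects the per-node blocks in a list and concatenates once at the end, giving the same `(2, num_edges)` layout. The reduced `N`, the fixed seed, and the `degrees` list are assumptions chosen only for illustration; they are not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
N = 5                           # reduced from 1000 to keep the example small
degrees = [3, 1, 2, 0, 4]       # hypothetical per-node neighbor counts

blocks = []
for node, nb_nbrs in enumerate(degrees):
    nbrs = rng.choice(N, size=nb_nbrs)            # sample neighbor ids in [0, N)
    src = np.full(nb_nbrs, node, dtype=np.int64)  # repeat the source node id
    blocks.append(np.stack([src, nbrs]))          # block of shape (2, nb_nbrs)

edges = np.concatenate(blocks, axis=1)  # single concatenation at the end
print(edges.shape)  # (2, 10)
```

A node with zero neighbors simply contributes an empty `(2, 0)` block, so no special case is needed for the first node.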

Next, I split the data into test/train:

data.train_mask = torch.zeros(data.num_nodes, dtype=torch.bool)
data.train_mask[:int(0.8 * data.num_nodes)] = True  # train on the first 80% of nodes
data.test_mask = torch.zeros(data.num_nodes, dtype=torch.bool)
data.test_mask[-int(0.2 * data.num_nodes):] = True  # test on the last 20% of nodes

I wrote a very basic network based on an example from the PyTorch website, modified to work for regression instead of classification:

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(GCN, self).__init__()
        torch.manual_seed(12345)
        self.conv1 = GCNConv(data.num_features, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, 1)  # single output for regression

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return x  # shape [num_nodes, 1]

Next, I define the function for training:

model = GCN(hidden_channels=2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
criterion = torch.nn.MSELoss()

def train():
    model.train()
    optimizer.zero_grad()  # Clear gradients.
    out = model(data.x, data.edge_index)  # Perform a single forward pass.
    loss = criterion(out[data.train_mask], data.y[data.train_mask])  # Compute the loss solely based on the training nodes.
    loss.backward()  # Derive gradients.
    optimizer.step()  # Update parameters based on gradients.
    return loss

Finally, I train the model:

for epoch in range(1, 500):
    loss = train()
    print(f'Epoch: {epoch:03d}, Loss: {loss:.4f}')

When I train the model, the loss stays roughly the same. Any idea why this would happen?

Epoch: 001, Loss: 0.9881
Epoch: 002, Loss: 0.9870
Epoch: 003, Loss: 0.9877
Epoch: 004, Loss: 0.9877
Epoch: 005, Loss: 0.9873
Epoch: 006, Loss: 0.9869
Epoch: 007, Loss: 0.9870
Epoch: 008, Loss: 0.9872
Epoch: 009, Loss: 0.9873
Epoch: 010, Loss: 0.9872
Epoch: 011, Loss: 0.9870
Epoch: 012, Loss: 0.9869
Epoch: 013, Loss: 0.9869
Epoch: 014, Loss: 0.9870
Epoch: 015, Loss: 0.9870
Epoch: 016, Loss: 0.9870
Epoch: 017, Loss: 0.9870
Epoch: 018, Loss: 0.9869
Epoch: 019, Loss: 0.9869
Epoch: 020, Loss: 0.9869
Epoch: 021, Loss: 0.9869
Epoch: 022, Loss: 0.9869
Epoch: 023, Loss: 0.9869
Epoch: 024, Loss: 0.9869
Epoch: 025, Loss: 0.9869
Epoch: 026, Loss: 0.9869
Epoch: 027, Loss: 0.9869
Epoch: 028, Loss: 0.9869
Epoch: 029, Loss: 0.9869
Epoch: 030, Loss: 0.9869

I do see some warnings like:

> /usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py:528: UserWarning: Using a target size (torch.Size([800])) that is different to the input size (torch.Size([800, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

I have been stuck on this for a while. Any help would be greatly appreciated!

Answered by rusty1s

rusty1s · 2021-07-08T06:07:37Z

rusty1s
Jul 8, 2021
Maintainer

I think GCNConv cannot learn the given task, since the task seems to depend solely on the number of neighbors of a given node. GCNConv performs a degree-normalized, mean-like aggregation, so it has no knowledge of the underlying number of neighbors. You can try a different GNN operator that uses add aggregation, e.g.:

GraphConv(in_channels, out_channels, aggr='add')
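
To illustrate the point about aggregation, here is a minimal pure-Python sketch. It is not PyG code: the toy graph, the constant feature value, and the helper name `aggregate` are made up for illustration. When all neighbors carry the same feature, a mean over neighbors returns the same value regardless of degree, whereas a sum scales with the number of neighbors.

```python
# Toy adjacency list (hypothetical): node -> list of neighbor ids.
neighbors = {
    0: [1, 2, 3, 4],  # high-degree node
    1: [0],           # low-degree nodes
    2: [0],
    3: [0],
    4: [0],
}
# Every node carries the same scalar feature, so only the degree differs.
features = {node: 1.0 for node in neighbors}


def aggregate(node, mode):
    """Aggregate neighbor features with either 'mean' or 'add'."""
    values = [features[nbr] for nbr in neighbors[node]]
    total = sum(values)
    return total / len(values) if mode == "mean" else total


mean_out = {n: aggregate(n, "mean") for n in neighbors}
add_out = {n: aggregate(n, "add") for n in neighbors}

print(mean_out[0], mean_out[1])  # 1.0 1.0 -> the degree is invisible
print(add_out[0], add_out[1])    # 4.0 1.0 -> the degree is recoverable
```

With add aggregation, the layer output for node 0 differs from that of node 1 purely because of the neighbor count, which is the signal the task seems to need.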

Furthermore, the warning indicates that the shapes of the inputs to the loss function do not match. This should be fixable with:

loss = criterion(out[data.train_mask].squeeze(), data.y[data.train_mask].squeeze())
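
As a rough illustration of what the warning is about, the sketch below uses NumPy as a stand-in (PyTorch follows the same broadcasting rules) with made-up three-element arrays. Subtracting a `(3,)` target from a `(3, 1)` prediction yields a `(3, 3)` matrix of all pairwise differences instead of three element-wise ones, so the mean squared error is computed over the wrong pairs.

```python
import numpy as np

pred = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1), like the model output
target = np.array([1.0, 2.0, 3.0])      # shape (3,), like data.y

# Mismatched shapes broadcast to a (3, 3) matrix of all pairwise differences.
diff_bad = pred - target
mse_bad = float((diff_bad ** 2).mean())

# Dropping the trailing dimension gives the intended element-wise difference.
diff_ok = pred.squeeze() - target
mse_ok = float((diff_ok ** 2).mean())

print(diff_bad.shape, mse_bad)  # shape (3, 3), nonzero loss for a perfect prediction
print(diff_ok.shape, mse_ok)    # shape (3,), loss 0.0
```

Averaged over all pairs, the squared difference equals var(pred) + var(target) + (mean(pred) - mean(target))², so the broadcasted loss is smallest for a constant prediction. That would be consistent with the plateau in the log above, although the thread does not confirm it.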

4 replies

riskiem Jul 8, 2021
Author

That worked. Thank you so much!

I have one more question. I wrote a test function that uses R² (R-squared) as the evaluation metric. Here is my code:

from pytorch_lightning.metrics.functional import r2score

def test():
    model.eval()
    out = model(data.x, data.edge_index)
    # r2score expects the predictions first, then the targets.
    test_r_sq = r2score(out[data.test_mask].squeeze(), data.y[data.test_mask].squeeze())
    return test_r_sq

test_r2_gcn = test()
print(f'Test R2: {test_r2_gcn:.4f}')

This works and produces a result. But I am wondering if this is the right approach.

Thanks!

rusty1s Jul 9, 2021
Maintainer

This looks correct to me.
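
For reference, here is a small pure-Python sketch of the coefficient of determination, R² = 1 - SS_res / SS_tot, where SS_tot is computed from the targets. The helper name `r_squared` and the toy numbers are made up for illustration; this is not the library implementation. Because the two arguments play different roles, the order in which predictions and targets are passed matters.

```python
def r_squared(preds, target):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot computed from the targets."""
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for p, t in zip(preds, target))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


target = [1.0, 2.0, 3.0, 4.0]
preds = [2.0, 2.0, 3.0, 3.0]

r2 = r_squared(preds, target)          # predictions first, targets second
r2_swapped = r_squared(target, preds)  # arguments swapped

print(r2)          # approximately 0.6
print(r2_swapped)  # -1.0
```

Here the residual sum of squares is 2 in both cases, but the total sum of squares is 5 for the targets and 1 for the predictions, which is why the two results differ.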

riskiem Jul 9, 2021
Author

Thank you!

yanyang-sysu Dec 15, 2021

Thank you!

Hello, is the complete code available? I would be grateful if you could share a copy.
