Binary classification of nodes #3405

arpieb · 2021-10-29T01:23:58Z

First off, well-documented library and a great addition to the PyTorch ecosystem, thanks for the effort!

I am admittedly rather new to GNNs, and am trying to build a model to perform binary classification per node. I have a dataset of ~4300 digraphs of varying size that are HTML DOMs; at present, a single node of interest is labeled 1 in each graph and all the rest are labeled 0. There is a single one-hot encoded feature of dim 16 on each node, and edge attributes are irrelevant (all a single weight of 1.0). The data shapes look like this:

>>> data[:3]
[Data(edge_index=[2, 161], x=[160, 16], y=[160, 1]),
 Data(edge_index=[2, 124], x=[124, 16], y=[124, 1]),
 Data(edge_index=[2, 207], x=[203, 16], y=[203, 1])]

I based my model off the PPI example, but attempted to modify it for binary classification, keeping things simple:

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = GATConv(16, 1024)
        self.lin1 = Linear(16, 1024)
        self.conv2 = GATConv(1024, 1024)
        self.lin2 = Linear(1024, 1024)
        self.conv3 = GATConv(1024, 1)
        self.lin3 = Linear(1024, 1)

    def forward(self, x, edge_index):
        x = F.elu(self.conv1(x, edge_index) + self.lin1(x))
        x = F.elu(self.conv2(x, edge_index) + self.lin2(x))
        x = F.elu(self.conv3(x, edge_index) + self.lin3(x))
        return torch.sigmoid(x)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
loss_op = F.binary_cross_entropy_with_logits
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)


def train():
    model.train()

    total_loss = 0
    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()

        loss = loss_op(model(data.x, data.edge_index), data.y)
        total_loss += loss.item() * data.num_graphs
        loss.backward()
        optimizer.step()
    return total_loss / len(train_loader.dataset)


@torch.no_grad()
def test(loader):
    model.eval()

    ys, preds = [], []
    for data in loader:
        ys.append(data.y)
        out = model(data.x.to(device), data.edge_index.to(device))
        out = torch.argmax(out, dim=1)
        preds.append((out > 0).float().cpu())

    y, pred = torch.cat(ys, dim=0).numpy().squeeze(), torch.cat(preds, dim=0).numpy()
    print(y.sum(), pred.sum())
    return f1_score(y, pred, average='micro') if pred.sum() > 0 else 0


for epoch in trange(1, 4):
    loss = train()
    test_f1 = test(test_loader)
    print('Epoch: {:02d}, Loss: {:.4f}, Test: {:.4f}'.format(
        epoch, loss, test_f1))

Training runs fine, but no matter how many epochs I train, loss stays the same and I get 0 classification for all nodes.

Any advice you could give as to whether or not my approach is even an appropriate one, or point me to a better example?

FWIW, running on Ubuntu 20.04, CUDA 11.2, PyTorch 1.9.1+cu111, torch-geometric 2.0.1, Python 3.8.10.

Thanks in advance for any help!

Answered by rusty1s

rusty1s (Maintainer) · 2021-10-29T06:36:42Z

You need to remove the F.elu call in F.elu(self.conv3(x, edge_index) + self.lin3(x)), as otherwise the last layer's output is bounded below (ELU never goes below -1), so the sigmoid cannot produce values close to 0.
When using binary_cross_entropy_with_logits, you also do not need to apply sigmoid, as this is already done internally in the loss function.
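
A quick numeric check of the point above, as a sketch in plain Python. The `elu` and `sigmoid` helpers are written out here for illustration and assume ELU's default alpha of 1. ELU bounds the last layer's output from below at -1, so the sigmoid in forward stays above roughly 0.27. Once binary_cross_entropy_with_logits applies its own sigmoid on top, the probability the loss sees stays above 0.5, so no node can be pushed toward class 0.

```python
import math


def elu(z, alpha=1.0):
    # ELU: identity for positive inputs, alpha * (exp(z) - 1) otherwise.
    return z if z > 0 else alpha * (math.exp(z) - 1.0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Sweep the value of conv3(...) + lin3(...) over a wide range.
zs = [-50.0, -5.0, -1.0, 0.0, 1.0, 5.0]

# What the model in the question returns: sigmoid(elu(z)).
model_out = [sigmoid(elu(z)) for z in zs]

# binary_cross_entropy_with_logits applies another sigmoid to its input.
loss_prob = [sigmoid(p) for p in model_out]

# ELU with alpha=1 never goes below -1, so this is the floor of model_out.
lower_bound = sigmoid(-1.0)

print(min(model_out), lower_bound)
print(min(loss_prob))
```

Returning the raw value of the last layer (no ELU, no sigmoid) removes both bounds and lets the loss function handle the sigmoid.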

arpieb (Author) · Nov 1, 2021

Thanks for that bit of guidance; it looks like I'm getting meaningful results now that I can tune from.
