I will provide the graph structure, the HGT architecture, and the training loop used in my approach. The task at hand is binary classification of tweets based on specific graph features.
NN architecture:

import torch
import torch_geometric.transforms as T
from torch_geometric.nn import HGTConv, Linear

h_c = 256  # width of the hidden classification layer
class HGT(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels, num_heads, num_layers):
        super().__init__()

        # Per-node-type input projection into a shared hidden space
        self.lin_dict = torch.nn.ModuleDict()
        for node_type in data.node_types:
            self.lin_dict[node_type] = Linear(-1, hidden_channels)

        # Stack of heterogeneous graph transformer layers
        self.convs = torch.nn.ModuleList()
        for _ in range(num_layers):
            conv = HGTConv(hidden_channels, hidden_channels, data.metadata(),
                           num_heads, group='sum')
            self.convs.append(conv)

        # Classification head applied to the 'tweet' embeddings
        self.dropout = torch.nn.Dropout(p=0.5)
        self.linear1 = Linear(hidden_channels, h_c)
        self.dropout2 = torch.nn.Dropout(p=0.3)
        self.linear2 = Linear(h_c, out_channels)

    def forward(self, x_dict, edge_index_dict):
        # Project every node type to hidden_channels and apply tanh
        for node_type, x in x_dict.items():
            x_dict[node_type] = self.lin_dict[node_type](x).tanh_()

        # Message passing over the heterogeneous graph
        for conv in self.convs:
            x_dict = conv(x_dict, edge_index_dict)

        # Classify only the 'tweet' nodes
        x = x_dict['tweet']
        x = self.dropout(x)
        x = torch.relu(self.linear1(x))
        x = self.dropout2(x)
        x = self.linear2(x)
        return x
model = HGT(hidden_channels=512, out_channels=2,
            num_heads=8, num_layers=2)
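The training loop below uses an optimizer that is not shown in the snippets, and because the input projections are lazily initialized with Linear(-1, ...), their parameters only exist after a first forward pass. A minimal sketch of how these two pieces are typically set up; the learning rate and weight decay here are placeholder assumptions, not values from my actual run:

# Lazy modules (Linear(-1, ...)) need one forward pass before their
# parameters exist, so run the model once before building the optimizer.
with torch.no_grad():
    model(data.x_dict, data.edge_index_dict)

# Hypothetical optimizer setup; lr / weight_decay are placeholder values.
optimizer = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=1e-4)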
Preprocessing steps:
import torch_geometric.transforms as T

data = T.ToUndirected()(data)       # add reverse edge types so messages flow both ways
data = T.AddSelfLoops()(data)       # add self-loop edges
data = T.NormalizeFeatures()(data)  # row-normalize node features

# Randomly assign train/val/test masks to the labelled node type ('tweet')
transform = T.RandomNodeSplit()
data = transform(data)
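Since accuracy that stagnates at a fixed value is often a sign of the model collapsing to a single class, it can help to check the split sizes and the label distribution of the 'tweet' nodes right after splitting. This is only a diagnostic sketch, not part of the original pipeline:

tweet = data['tweet']
print('train/val/test sizes:',
      int(tweet.train_mask.sum()), int(tweet.val_mask.sum()), int(tweet.test_mask.sum()))
# A heavy class imbalance here could explain an accuracy stuck near ~70%
print('label counts:', torch.bincount(tweet.y))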
Training loop
import torch.nn.functional as F

def train():
    model.train()
    optimizer.zero_grad()
    # Full-batch forward pass over the whole heterogeneous graph
    out = model(data.x_dict, data.edge_index_dict)
    # Compute the loss only on the training 'tweet' nodes
    mask = data['tweet'].train_mask
    loss = F.cross_entropy(out[mask], data['tweet'].y[mask])
    loss.backward()
    optimizer.step()
    return float(loss)
Test loop
@torch.no_grad()
def test():
    model.eval()
    # Predicted class for every 'tweet' node
    pred = model(data.x_dict, data.edge_index_dict).argmax(dim=-1)

    accs = []
    for split in ['train_mask', 'val_mask', 'test_mask']:
        mask = data['tweet'][split]
        acc = (pred[mask] == data['tweet'].y[mask]).sum() / mask.sum()
        accs.append(float(acc))
    return accs
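Because the SVM / decision-tree baselines are reported with both accuracy and F1, it may be worth reporting the same metrics for the GNN; with imbalanced labels, 70% accuracy can coexist with a much worse F1. A possible extension of test() using scikit-learn (my assumption; any metric library would do):

from sklearn.metrics import f1_score

@torch.no_grad()
def test_f1():
    model.eval()
    pred = model(data.x_dict, data.edge_index_dict).argmax(dim=-1)
    mask = data['tweet'].test_mask
    # Macro F1 over the two classes on the test split
    return f1_score(data['tweet'].y[mask].cpu(), pred[mask].cpu(), average='macro')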
Training the NN
import matplotlib.pyplot as plt
from tqdm import tqdm

train_accuracies = []
val_accuracies = []
test_accuracies = []
loss_values = []

for epoch in tqdm(range(1, 201), desc="Training Progress"):
    loss = train()
    train_acc, val_acc, test_acc = test()
    loss_values.append(loss)
    train_accuracies.append(train_acc)
    val_accuracies.append(val_acc)
    test_accuracies.append(test_acc)
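matplotlib is imported above but the plotting code is not shown; a minimal sketch of how the recorded curves could be visualized (purely illustrative):

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(loss_values)
ax1.set_xlabel('epoch'); ax1.set_ylabel('training loss')
ax2.plot(train_accuracies, label='train')
ax2.plot(val_accuracies, label='val')
ax2.plot(test_accuracies, label='test')
ax2.set_xlabel('epoch'); ax2.set_ylabel('accuracy')
ax2.legend()
plt.show()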
The NN above attains an accuracy of about 70%, and the loss plateaus around 0.6 (yes, I know that's too high).
In an attempt to diagnose the issue, I passed the label (target) as a feature and expected the accuracy to reach 100%, since the label is literally fed in with the features, but the performance still stagnates around 70%.
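For reference, this is roughly how such a label-leak check can be wired up for the 'tweet' nodes (a sketch of my setup, not the exact code); note it should be done before T.NormalizeFeatures(), which rescales each feature row:

# Hypothetical label-leak sanity check: append the target as an extra
# feature column on the 'tweet' nodes before any feature normalization.
tweet = data['tweet']
tweet.x = torch.cat([tweet.x, tweet.y.float().unsqueeze(-1)], dim=-1)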
I want to understand why the model is underperforming so badly. I tested the node features (the x vectors) with classical ML classifiers (SVM, decision tree) and obtained around 86-90% accuracy and F1-score, which suggests the features are informative and the issue likely lies in the NN architecture.
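For completeness, the classical baseline I'm comparing against looks roughly like this (a sketch; the exact split and hyperparameters may differ):

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

X = data['tweet'].x.cpu().numpy()
y = data['tweet'].y.cpu().numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC().fit(X_train, y_train)
pred = clf.predict(X_test)
print('accuracy:', accuracy_score(y_test, pred), 'F1:', f1_score(y_test, pred))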
I'm happy to provide any further details needed to clarify the issue and help resolve it.