I will provide the graph structure, the HGT architecture, and the training loop used in my approach. The task at hand is binary classification of tweets based on specific graph features.
NN architecture:

import torch
import torch_geometric.transforms as T
from torch_geometric.nn import HGTConv, Linear

h_c = 256  # width of the hidden classification layer
class HGT(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels, num_heads, num_layers):
        super().__init__()

        # Per-node-type input projection into a shared hidden space
        self.lin_dict = torch.nn.ModuleDict()
        for node_type in data.node_types:
            self.lin_dict[node_type] = Linear(-1, hidden_channels)

        # Stack of heterogeneous graph transformer layers
        self.convs = torch.nn.ModuleList()
        for _ in range(num_layers):
            conv = HGTConv(hidden_channels, hidden_channels, data.metadata(),
                           num_heads, group='sum')
            self.convs.append(conv)

        # Classification head applied to the 'tweet' embeddings
        self.dropout = torch.nn.Dropout(p=0.5)
        self.linear1 = Linear(hidden_channels, h_c)
        self.dropout2 = torch.nn.Dropout(p=0.3)
        self.linear2 = Linear(h_c, out_channels)

    def forward(self, x_dict, edge_index_dict):
        # Project every node type to hidden_channels and apply tanh
        for node_type, x in x_dict.items():
            x_dict[node_type] = self.lin_dict[node_type](x).tanh_()

        # Message passing over the heterogeneous graph
        for conv in self.convs:
            x_dict = conv(x_dict, edge_index_dict)

        # Classify only the 'tweet' nodes
        x = x_dict['tweet']
        x = self.dropout(x)
        x = torch.relu(self.linear1(x))
        x = self.dropout2(x)
        x = self.linear2(x)
        return x
model = HGT(hidden_channels=512, out_channels=2,
            num_heads=8, num_layers=2)
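The training loop below uses an optimizer that is not shown in the snippets, and because the input projections are lazily initialized with Linear(-1, ...), their parameters only exist after a first forward pass. A minimal sketch of how these two pieces are typically set up; the learning rate and weight decay here are placeholder assumptions, not values from my actual run:

# Lazy modules (Linear(-1, ...)) need one forward pass before their
# parameters exist, so run the model once before building the optimizer.
with torch.no_grad():
    model(data.x_dict, data.edge_index_dict)

# Hypothetical optimizer setup; lr / weight_decay are placeholder values.
optimizer = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=1e-4)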
Preprocessing steps:
import torch_geometric.transforms as T

data = T.ToUndirected()(data)       # add reverse edge types so messages flow both ways
data = T.AddSelfLoops()(data)       # add self-loop edges
data = T.NormalizeFeatures()(data)  # row-normalize node features

# Randomly assign train/val/test masks to the labelled node type ('tweet')
transform = T.RandomNodeSplit()
data = transform(data)
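Since accuracy that stagnates at a fixed value is often a sign of the model collapsing to a single class, it can help to check the split sizes and the label distribution of the 'tweet' nodes right after splitting. This is only a diagnostic sketch, not part of the original pipeline:

tweet = data['tweet']
print('train/val/test sizes:',
      int(tweet.train_mask.sum()), int(tweet.val_mask.sum()), int(tweet.test_mask.sum()))
# A heavy class imbalance here could explain an accuracy stuck near ~70%
print('label counts:', torch.bincount(tweet.y))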
Training loop
import torch.nn.functional as F

def train():
    model.train()
    optimizer.zero_grad()
    # Full-batch forward pass over the whole heterogeneous graph
    out = model(data.x_dict, data.edge_index_dict)
    # Compute the loss only on the training 'tweet' nodes
    mask = data['tweet'].train_mask
    loss = F.cross_entropy(out[mask], data['tweet'].y[mask])
    loss.backward()
    optimizer.step()
    return float(loss)
Test loop
@torch.no_grad()
def test():
    model.eval()
    # Predicted class for every 'tweet' node
    pred = model(data.x_dict, data.edge_index_dict).argmax(dim=-1)

    accs = []
    for split in ['train_mask', 'val_mask', 'test_mask']:
        mask = data['tweet'][split]
        acc = (pred[mask] == data['tweet'].y[mask]).sum() / mask.sum()
        accs.append(float(acc))
    return accs
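Because the SVM / decision-tree baselines are reported with both accuracy and F1, it may be worth reporting the same metrics for the GNN; with imbalanced labels, 70% accuracy can coexist with a much worse F1. A possible extension of test() using scikit-learn (my assumption; any metric library would do):

from sklearn.metrics import f1_score

@torch.no_grad()
def test_f1():
    model.eval()
    pred = model(data.x_dict, data.edge_index_dict).argmax(dim=-1)
    mask = data['tweet'].test_mask
    # Macro F1 over the two classes on the test split
    return f1_score(data['tweet'].y[mask].cpu(), pred[mask].cpu(), average='macro')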
Training the NN
import matplotlib.pyplot as plt
from tqdm import tqdm

train_accuracies = []
val_accuracies = []
test_accuracies = []
loss_values = []

for epoch in tqdm(range(1, 201), desc="Training Progress"):
    loss = train()
    train_acc, val_acc, test_acc = test()
    loss_values.append(loss)
    train_accuracies.append(train_acc)
    val_accuracies.append(val_acc)
    test_accuracies.append(test_acc)
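matplotlib is imported above but the plotting code is not shown; a minimal sketch of how the recorded curves could be visualized (purely illustrative):

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(loss_values)
ax1.set_xlabel('epoch'); ax1.set_ylabel('training loss')
ax2.plot(train_accuracies, label='train')
ax2.plot(val_accuracies, label='val')
ax2.plot(test_accuracies, label='test')
ax2.set_xlabel('epoch'); ax2.set_ylabel('accuracy')
ax2.legend()
plt.show()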
The NN above attains an accuracy of about 70%, and the loss plateaus around 0.6 (yes, I know that's too high).
In an attempt to diagnose the issue, I passed the label (target) as a feature and expected the accuracy to reach 100%, since the label is literally fed in with the features, but the performance still stagnates around 70%.
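For reference, this is roughly how such a label-leak check can be wired up for the 'tweet' nodes (a sketch of my setup, not the exact code); note it should be done before T.NormalizeFeatures(), which rescales each feature row:

# Hypothetical label-leak sanity check: append the target as an extra
# feature column on the 'tweet' nodes before any feature normalization.
tweet = data['tweet']
tweet.x = torch.cat([tweet.x, tweet.y.float().unsqueeze(-1)], dim=-1)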
I want to understand why the model is underperforming so badly. I tested the node features (the x vectors) with classical ML classifiers (SVM, decision tree) and obtained around 86-90% accuracy and F1-score, which suggests the features are informative and the issue likely lies in the NN architecture.
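For completeness, the classical baseline I'm comparing against looks roughly like this (a sketch; the exact split and hyperparameters may differ):

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

X = data['tweet'].x.cpu().numpy()
y = data['tweet'].y.cpu().numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC().fit(X_train, y_train)
pred = clf.predict(X_test)
print('accuracy:', accuracy_score(y_test, pred), 'F1:', f1_score(y_test, pred))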
I'm happy to provide any further details needed to clarify the issue and help resolve it.