My task is binary classification of tweets using text embeddings and user account features. The approach uses a graph that represents users and the tweets they authored, as shown in the following data structure.
The code runs, but the performance of the neural network (NN) is disappointing: it reaches only 69% accuracy, and the loss does not drop below 0.58 even after 1000 epochs. To check the quality of the features, I fed them to several classical machine learning classifiers, such as random forest and decision tree, and reached 84% accuracy with minimal effort.
What confuses me is that the graph-based approach, which has access to additional information (the graph structure and the user account features), performs worse than a classical classifier that uses only the tweet text embeddings. Furthermore, regardless of which features I feed to the neural network, its accuracy stays around 68-69%. The architecture of the NN is below.
import torch
import torch_geometric.transforms as T
from torch_geometric.datasets import OGB_MAG
from torch_geometric.nn import HGTConv, Linear

# `data` is the HeteroData graph of users and tweets, built elsewhere.
h_c = 256  # width of the hidden layer in the classification head


class HGT(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels, num_heads, num_layers):
        super().__init__()

        # One lazily initialized input projection per node type.
        self.lin_dict = torch.nn.ModuleDict()
        for node_type in data.node_types:
            self.lin_dict[node_type] = Linear(-1, hidden_channels)

        self.convs = torch.nn.ModuleList()
        for _ in range(num_layers):
            conv = HGTConv(hidden_channels, hidden_channels, data.metadata(),
                           num_heads, group='sum')
            self.convs.append(conv)

        # Classification head applied to the tweet embeddings.
        self.linear1 = Linear(hidden_channels, h_c)
        self.dropout = torch.nn.Dropout(p=0.5)
        self.linear2 = Linear(h_c, out_channels)

    def forward(self, x_dict, edge_index_dict):
        # Build a new dict so the caller's x_dict is not modified in place.
        x_dict = {
            node_type: self.lin_dict[node_type](x).relu_()
            for node_type, x in x_dict.items()
        }

        for conv in self.convs:
            x_dict = conv(x_dict, edge_index_dict)

        # Only tweet nodes are classified.
        x = x_dict['tweet']
        x = self.linear1(x).relu_()
        x = self.dropout(x)
        x = self.linear2(x)
        return x


model = HGT(hidden_channels=512, out_channels=2,
            num_heads=8, num_layers=1)
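As a reference point for the reported loss of 0.58, it can help to compare it with what a classifier that ignores its inputs would score. The sketch below assumes the loss is the usual mean cross-entropy in natural-log units over two classes, which the post does not state. The 69/31 class split is purely hypothetical, since the class balance is not given, and `constant_predictor_loss` is an illustrative helper name.

```python
import math


def constant_predictor_loss(p_positive: float) -> float:
    """Cross-entropy (in nats) of always predicting the class prior.

    p_positive is the fraction of samples in the positive class; the
    value returned is the binary entropy of that split.
    """
    p, q = p_positive, 1.0 - p_positive
    return -(p * math.log(p) + q * math.log(q))


# Balanced classes: uniform guessing scores ln(2).
balanced_loss = constant_predictor_loss(0.5)

# Hypothetical 69/31 split (an assumption, not a figure from the post).
skewed_loss = constant_predictor_loss(0.69)

print(f"balanced: {balanced_loss:.4f}, 69/31 split: {skewed_loss:.4f}")
```

If the majority class turned out to be close to 69% of the tweets, a loss near 0.58 together with an accuracy near 69% would suggest the network is doing little more than predicting the class prior. Checking the label distribution and the confusion matrix would confirm or rule this out.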
I am unsure whether the issue lies in the low density of the graph, a mistake in the process of feeding feature vectors to the neural network, or something else.
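One candidate for the feature-feeding concern is feature scale. Tree-based classifiers split on thresholds, so rescaling a feature does not change their predictions, while gradient-trained networks are typically sensitive to it. The sketch below standardizes each feature column to zero mean and unit variance. The values are made-up stand-ins for raw account features, and `standardize_columns` is a hypothetical helper, not part of PyTorch Geometric.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Return rows with every column scaled to zero mean, unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Made-up user features: [followers, account age in days, verified flag].
raw_user_features = [
    [120.0, 300.0, 0.0],
    [54000.0, 2900.0, 1.0],
    [8.0, 15.0, 0.0],
    [970.0, 1200.0, 0.0],
]
scaled_user_features = standardize_columns(raw_user_features)
```

In a real pipeline the means and standard deviations would normally be computed on the training nodes only and reused for validation and test nodes.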
Stats about the graph:
Number of nodes: 4811
Number of edges: 2758
Average node degree: 1.14
Maximum node degree: 77
Minimum node degree: 1
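The statistics above can be cross-checked with a little arithmetic. This sketch assumes the edges are undirected and that "Number of edges" counts each edge once; neither is stated in the post, but the resulting average degree is close to the reported 1.14.

```python
num_nodes = 4811
num_edges = 2758

# Each undirected edge contributes to the degree of two nodes.
average_degree = 2 * num_edges / num_nodes

# Adding an edge can merge at most two components into one, so a graph
# with N nodes and E edges has at least N - E connected components.
min_components = max(num_nodes - num_edges, 1)

print(f"average degree: {average_degree:.3f}")
print(f"at least {min_components} connected components")
```

Under these assumptions the graph splits into at least 2053 components, so the average component holds at most about 2.3 nodes. Most tweets would then have very few neighbors for a single HGTConv layer to aggregate from, which may limit how much the graph structure can add over the raw features.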
If you need any additional information, I'll be happy to provide it.