Replies: 2 comments 2 replies
-
It looks like the model suffers badly from the imbalanced class distribution. You can use PyG's `ImbalancedSampler` to address the class-imbalance problem. Also, it is better to use AUC instead of accuracy as the evaluation metric in an imbalanced scenario.
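(For reference, `ImbalancedSampler` lives in `torch_geometric.loader` and can be passed to a `NeighborLoader`. On the metric point, the dependency-free sketch below illustrates why accuracy misleads on skewed labels: an all-negative classifier on a 9:1 split scores 90% accuracy but only 0.5 ROC-AUC. The `roc_auc` helper is a hand-rolled Mann-Whitney implementation for illustration only; in practice use `sklearn.metrics.roc_auc_score` or `torchmetrics.AUROC`.)

```python
def roc_auc(y_true, scores):
    """Rank-based (Mann-Whitney) ROC-AUC; tied scores receive average rank."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    rank = [0.0] * n
    i = 0
    while i < n:  # assign 1-based average ranks to groups of tied scores
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        for k in range(i, j + 1):
            rank[k] = (i + j) / 2 + 1
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    pos_rank_sum = sum(r for r, (_, y) in zip(rank, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# A 9:1 imbalanced toy set: a constant (all-negative) classifier gets
# 90% accuracy, yet its scores carry no ranking information at all.
y = [0] * 9 + [1]
constant_scores = [0.0] * 10
print(roc_auc(y, constant_scores))  # 0.5
```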
-
Thank you! I will try different losses as you suggested.
-
Hi,
I am working with a heterogeneous graph on a node classification task. It is a very large dataset with ~13,252,659 nodes: 13,205,523 of them belong to class 0 and 4,698 to class 1 (the rest are unlabeled), so the graph is highly imbalanced.
I am using the code from this example: https://github.com/pyg-team/pytorch_geometric/blob/master/examples/hetero/hetero_conv_dblp.py
I am training the GNN in a semi-supervised setting, using 4L nodes from class 0 and 4K nodes from class 1 to train the model.
I am testing the model on the entire dataset (except the unlabeled nodes).
To tackle the class-imbalance problem, I only changed the loss in the example above to a weighted binary cross-entropy loss.
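The weighting in question can be sketched in a few lines of pure Python; PyTorch's `torch.nn.BCEWithLogitsLoss(pos_weight=...)` implements the same idea natively. Note that the `pos_weight = n_neg / n_pos` heuristic below is a common choice, not something stated in this post:

```python
import math

def weighted_bce(logits, targets, pos_weight):
    """Mean binary cross-entropy with the positive class up-weighted
    by pos_weight (mirrors BCEWithLogitsLoss(pos_weight=...) on 1-D inputs)."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid of the logit
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(logits)

# One common heuristic for the weight: the ratio of negatives to positives.
# With the class counts from this post, that is roughly 13205523 / 4698 ≈ 2811.
pos_weight = 13205523 / 4698
```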
Over the training epochs, the model's loss decreases monotonically and its accuracy improves. But when I check the class-wise stats, mainly the confusion-matrix metrics (precision, recall, F1 score) on the test set at the end of training, they look like this:
**Epoch: 040, Loss: 0.0692, Train: 0.9831, Test: 0.9801**

```
              precision   recall  f1-score   support
class 0          1.0000   0.9831    0.9915  13205523
class 1          0.0203   0.9853    0.0398      4698
accuracy                            0.9831  13210221
macro avg        0.5102   0.9842    0.5156  13210221
weighted avg     0.9996   0.9831    0.9911  13210221
```
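(As a quick sanity check, the macro and weighted averages follow directly from the two per-class rows above, and they make the failure mode visible: the weighted average is dominated by class 0's huge support, so it completely hides the collapsed class-1 precision:)

```python
# Per-class (precision, recall, f1, support) taken from the report above.
stats = {
    "class 0": (1.0000, 0.9831, 0.9915, 13205523),
    "class 1": (0.0203, 0.9853, 0.0398, 4698),
}
support = sum(v[3] for v in stats.values())
macro_precision = sum(v[0] for v in stats.values()) / len(stats)
weighted_precision = sum(v[0] * v[3] for v in stats.values()) / support

print(macro_precision)     # ~0.5102: averaging classes equally exposes class 1
print(weighted_precision)  # ~0.9996: support-weighting buries class 1 entirely
```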
Could you please guide me on what I could try to improve the model? I have been stuck for many days with no luck :(
Thanks so much! :)