Replies: 1 comment
You can try to add normalization between layers. This should help with vanishing gradients. Besides that, I am wondering why you don't just use:

```python
conv = HeteroConv({
    edge_type: GATConv((-1, -1), 64) for edge_type in edge_types
})
```
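A fuller sketch of that suggestion, combining the `HeteroConv` wrapper with normalization between layers, might look like the following. The class name `HeteroGAT`, the `node_types` argument, the layer count, `aggr='sum'`, and the per-node-type `LayerNorm` are illustrative choices, not taken from the thread:

```python
import torch
from torch_geometric.nn import HeteroConv, GATConv

class HeteroGAT(torch.nn.Module):
    def __init__(self, edge_types, node_types, hidden_channels=64, num_layers=2):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        self.norms = torch.nn.ModuleList()
        for _ in range(num_layers):
            # One GATConv per relation; (-1, -1) lazily infers the (possibly
            # different) source and target input dimensions on the first call.
            conv = HeteroConv({
                edge_type: GATConv((-1, -1), hidden_channels, add_self_loops=False)
                for edge_type in edge_types
            }, aggr='sum')
            self.convs.append(conv)
            # Normalization between layers, as suggested above.
            self.norms.append(torch.nn.ModuleDict({
                node_type: torch.nn.LayerNorm(hidden_channels)
                for node_type in node_types
            }))

    def forward(self, x_dict, edge_index_dict):
        for conv, norm in zip(self.convs, self.norms):
            x_dict = conv(x_dict, edge_index_dict)
            x_dict = {key: norm[key](x.relu()) for key, x in x_dict.items()}
        return x_dict
```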
---
I am trying to implement a customized heterogeneous graph attention network in which each type of relationship is handled by a separate GATConv.
My approach feels a bit clumsy: I create a separate GATConv instance for each relationship type, as shown in the code below.
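The original snippet is not reproduced in this extract, but a minimal sketch of such a per-relation setup might look like the following. The class and attribute names (`PerRelationGAT`, `rel_convs`, `in_channels_dict`) are made up for illustration, and plain `GATConv` stands in for the modified `CustomGATConv` described next:

```python
import torch
from torch_geometric.nn import GATConv

class PerRelationGAT(torch.nn.Module):
    """One GATConv per relation type; in the original post these would be
    instances of the modified CustomGATConv sketched further below."""

    def __init__(self, edge_types, in_channels_dict, hidden_channels=64):
        super().__init__()
        # ModuleDict keys must be strings, so the (src, rel, dst) tuple is joined.
        self.rel_convs = torch.nn.ModuleDict({
            '__'.join(et): GATConv(
                (in_channels_dict[et[0]], in_channels_dict[et[-1]]),
                hidden_channels,
                add_self_loops=False,
            )
            for et in edge_types
        })

    def forward(self, x_dict, edge_index_dict):
        out_dict = {}
        for edge_type, edge_index in edge_index_dict.items():
            src, _, dst = edge_type
            conv = self.rel_convs['__'.join(edge_type)]
            out = conv((x_dict[src], x_dict[dst]), edge_index)
            # Sum messages from different relations arriving at the same node type.
            out_dict[dst] = out if dst not in out_dict else out_dict[dst] + out
        return out_dict
```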
CustomGATConv is the result of minor modifications to GATConv. More concretely, CustomGATConv applies a linear layer so that the feature dimensions of the source and target nodes are the same, and there are only two extra changes in the forward function, as shown in the code below.
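The actual snippets are missing from this extract, so the following is only a guess at the kind of modification being described: a linear layer (here called `lin_align`, a hypothetical name) added in `__init__`, plus a projection of the source features at the start of `forward` before the regular GATConv logic runs:

```python
import torch
from torch_geometric.nn import GATConv

class CustomGATConv(GATConv):
    """A guess at the described modification: project source features to the
    target node dimension before running the standard GATConv forward."""

    def __init__(self, in_channels, out_channels, **kwargs):
        src_channels, dst_channels = in_channels
        # After the projection, source and target share the same dimension,
        # so the parent layer is built with a single input size.
        super().__init__(dst_channels, out_channels, **kwargs)
        self.lin_align = torch.nn.Linear(src_channels, dst_channels)

    def forward(self, x, edge_index, **kwargs):
        x_src, x_dst = x
        # Change 1: align the source feature dimension with the target's.
        x_src = self.lin_align(x_src)
        # Change 2: pass the aligned (source, target) pair to GATConv.
        return super().forward((x_src, x_dst), edge_index, **kwargs)
```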
The trouble is that once training begins, the gradients of att_src and att_dst are zero (the other gradients are small, but not zero), so the model does not train well. I am not sure how to improve it.
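One generic way to confirm which parameters actually receive gradients (this snippet is not from the original post; `model` and `loss` stand in for the user's own objects) is to inspect per-parameter gradient norms after a backward pass:

```python
loss.backward()

# Print the gradient norm of every parameter so dead attention weights
# (att_src / att_dst) stand out against the small-but-nonzero ones.
for name, param in model.named_parameters():
    if param.grad is None:
        print(f'{name}: no gradient')
    else:
        print(f'{name}: grad norm = {param.grad.norm().item():.6f}')
```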