Representation learning using GraphSAGE on Heterogeneous Graphs #8889
13bmartens asked this question in Q&A
I think what you are doing is fully correct. Another unsupervised heterogeneous approach would be infomax (see the hetero folder for an example). I am not sure what you mean by other nodes not reflecting their proximity, though. Do you mean nodes of other node types? That would be expected, since they are not trained to do so.
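To make the last point concrete: a loss whose positive and negative pairs come from a single edge type only shapes the embeddings of that edge type's two endpoint node types. One possible way to cover more node types is to sum a link-prediction loss over several edge types. The following is a minimal sketch in plain PyTorch, not the code from the question or from any PyG example. The function name `multi_edge_type_loss`, the toy tensors, and the `belongs_to` relation are hypothetical, and the random embeddings stand in for the output of a heterogeneous GraphSAGE model.

```python
import torch
import torch.nn.functional as F


def multi_edge_type_loss(z_dict, pos_edges, num_neg=1):
    """Sum a dot-product link-prediction loss over several edge types.

    z_dict:    {node_type: [num_nodes, dim] embedding tensor}
    pos_edges: {(src_type, rel, dst_type): [2, num_edges] index tensor}
    """
    total = 0.0
    for (src_type, _, dst_type), edge_index in pos_edges.items():
        src, dst = edge_index
        z_src, z_dst = z_dict[src_type], z_dict[dst_type]
        # Positive pairs: the observed edges of this edge type.
        pos_score = (z_src[src] * z_dst[dst]).sum(dim=-1)
        # Negative pairs: same sources, uniformly sampled destinations.
        neg_dst = torch.randint(z_dst.size(0), (num_neg * src.numel(),))
        neg_score = (z_src[src.repeat(num_neg)] * z_dst[neg_dst]).sum(dim=-1)
        scores = torch.cat([pos_score, neg_score])
        labels = torch.cat([torch.ones_like(pos_score),
                            torch.zeros_like(neg_score)])
        total = total + F.binary_cross_entropy_with_logits(scores, labels)
    return total


# Toy embeddings (assumption: in practice these come from the GNN).
torch.manual_seed(0)
z_dict = {
    "site": torch.randn(6, 8, requires_grad=True),
    "isic_code": torch.randn(4, 8, requires_grad=True),
    "domestic": torch.randn(3, 8, requires_grad=True),
}
pos_edges = {
    ("site", "has_isic", "isic_code"): torch.tensor([[0, 1, 2, 3],
                                                     [0, 1, 1, 3]]),
    # Hypothetical relation name, used only for illustration.
    ("site", "belongs_to", "domestic"): torch.tensor([[0, 1, 4, 5],
                                                      [0, 0, 1, 2]]),
}
loss = multi_edge_type_loss(z_dict, pos_edges)
loss.backward()  # every node type that appears in an edge receives gradients
```

Uniformly sampled negatives can occasionally coincide with true edges. That is usually tolerated in sketches like this one, but it is a simplification.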
Hi Team,
thank you for the great work on PyG!
I am currently working with a custom heterogeneous dataset with different node types in the area of firmographics (describing companies and their hierarchies).
The three main node types are company sites, domestic company entities, and global company entities. These are described both by node properties (employee count, yearly turnover) and by their connections: to each other, to a country (country -[is in]-> region), and to an industry (Site -[has ISIC]-> ISIC Code -[is in]-> ISIC Group -[is in]-> ISIC Division).
Using this graph, I want to learn an embedding that places similar company nodes close together in the embedding space. I would like this learning to happen inductively as unseen companies should be embedded without retraining.
I was able to build the graph, use a LinkNeighborLoader, and define a GraphSAGE-based GNN.
The thing I am struggling with is defining a proper loss function. I am currently generating negative samples between two node types (Company Site and ISIC Code), as done in the example, using the LinkNeighborLoader and this loss calculation:
I am getting good results for the Company Site embedding, but the embeddings of the other nodes do not reflect their proximity in the graph.
The original GraphSAGE paper mentions using random walks for this purpose, but I could not find an example of doing this for heterogeneous graphs.
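For illustration, the random-walk idea can be adapted to a heterogeneous graph by restricting each walk to a fixed sequence of edge types (a metapath) and treating the start and end nodes as a positive pair. The sketch below is plain Python under stated assumptions. The function `metapath_walk_pairs`, the dictionary-based adjacency format, and the reverse relation `rev_has_isic` are all hypothetical and not part of PyG or the GraphSAGE paper.

```python
import random


def metapath_walk_pairs(adj, start_nodes, metapath, walks_per_node=2, rng=None):
    """Collect (start, end) positive pairs by walking along a fixed metapath.

    adj:      {(src_type, rel, dst_type): {src_id: [dst_id, ...]}}
    metapath: list of edge-type keys whose node types chain together
    """
    rng = rng or random.Random()
    pairs = []
    for start in start_nodes:
        for _ in range(walks_per_node):
            node = start
            for edge_type in metapath:
                neighbors = adj[edge_type].get(node, [])
                if not neighbors:
                    node = None  # dead end: discard this walk
                    break
                node = rng.choice(neighbors)
            # Keep only walks that finish on a different node.
            if node is not None and node != start:
                pairs.append((start, node))
    return pairs


# Toy graph: sites 0 and 1 share ISIC code 10, site 2 has code 11 alone.
adj = {
    ("site", "has_isic", "isic_code"): {0: [10], 1: [10], 2: [11]},
    ("isic_code", "rev_has_isic", "site"): {10: [0, 1], 11: [2]},
}
metapath = [
    ("site", "has_isic", "isic_code"),
    ("isic_code", "rev_has_isic", "site"),
]
pairs = metapath_walk_pairs(adj, [0, 1, 2], metapath,
                            walks_per_node=4, rng=random.Random(0))
```

Pairs produced this way could serve as the positives in an unsupervised loss, with negatives drawn at random from the same node type. Longer metapaths, for example through ISIC Group, would loosen the notion of similarity.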
How can I train my model on more than one edge type? Are there any other approaches possible with PyG?
Thank you in advance for your insights!