Hi there, I am doing a link prediction task on multiple graphs. The graphs are of the same kind but have different numbers of nodes/edges, ranging from ~1,000 to ~30,000 edges (they are fed to the model in the form of a dictionary). I do not need minibatching yet, but the graphs may later grow to ~100,000 edges, so I would rather set up minibatching now. To train the model properly, it would probably be better if each minibatch included samples from all graphs. My plan is to use [...]. However, I found that [...]. P.S. I didn't plan to use [...].
Replies: 1 comment 12 replies
GraphSAINTEdgeSampler should never sample edges twice. Do you have a small script to reproduce this?

You are right that GraphSAINTEdgeSampler does not fix the number of batches. This is due to how GraphSAINT is defined, i.e. nodes are potentially re-sampled across iterations.

I'm not entirely sure why you need GraphSAINTEdgeSampler in the first place. Processing 100k edges should be easily doable in full-batch mode, and you can even use DataLoader to stack multiple graphs together if GPU memory allows.
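
To illustrate what "stacking multiple graphs together" means, here is a minimal, dependency-free sketch of the idea: the graphs are merged into one large disconnected graph by shifting each graph's node ids by the number of nodes that came before it. In PyG itself, `torch_geometric.loader.DataLoader` does this for you on `Data` objects; the function `stack_graphs`, the `(num_nodes, edges)` tuple layout, and the toy data below are hypothetical and only stand in for the dictionary of graphs mentioned in the question.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def stack_graphs(
    graphs: Dict[str, Tuple[int, List[Edge]]],
) -> Tuple[List[Edge], List[int], Dict[str, int]]:
    """Merge several graphs into one disconnected graph.

    `graphs` maps a graph name to (num_nodes, edges), where node ids are
    local to each graph (0 .. num_nodes - 1). This layout is an assumption
    made for the sketch, not a PyG data format.

    Returns the merged edge list, a per-node vector holding the index of the
    graph each node came from, and the node-id offset used for each graph.
    """
    merged_edges: List[Edge] = []
    batch: List[int] = []
    offsets: Dict[str, int] = {}
    offset = 0
    for graph_idx, (name, (num_nodes, edges)) in enumerate(graphs.items()):
        offsets[name] = offset
        for src, dst in edges:
            if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
                raise ValueError(f"edge ({src}, {dst}) out of range in graph {name!r}")
            # Shift local node ids so they do not collide with earlier graphs.
            merged_edges.append((src + offset, dst + offset))
        # Record which graph every node of this graph belongs to.
        batch.extend([graph_idx] * num_nodes)
        offset += num_nodes
    return merged_edges, batch, offsets


# Hypothetical toy data: two small graphs keyed by name.
toy_graphs = {
    "a": (3, [(0, 1), (1, 2)]),
    "b": (2, [(0, 1)]),
}
merged_edges, batch, offsets = stack_graphs(toy_graphs)
```

Because no edge connects nodes from different graphs, message passing on the merged graph gives the same result as running on each graph separately, while every training step still sees edges from all graphs at once.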