Hi, I am currently thinking about how backward propagation works when neighbor sampling is used. For a large-scale dataset, the training process typically looks like this example: (https://github.com/pyg-team/pytorch_geometric/blob/master/examples/ogbn_products_sage.py)
We compute the loss using only the target nodes.
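To make the question concrete, here is a minimal pure-Python sketch of that mini-batch pattern: each batch holds a set of seed (target) nodes plus their sampled neighbors, with the seed nodes first, and the loss is computed over the first `batch_size` outputs only. All names here are illustrative stand-ins, not PyG's real API.

```python
def train_step(model_forward, batch):
    # batch["x"]: features of seeds + sampled neighbors (seeds come first)
    out = model_forward(batch["x"])           # predictions for every sampled node
    seed_out = out[: batch["batch_size"]]     # keep seed-node predictions only
    seed_y = batch["y"][: batch["batch_size"]]
    # squared-error loss over the seed nodes only; neighbor labels are unused
    return sum((o - t) ** 2 for o, t in zip(seed_out, seed_y)) / batch["batch_size"]

if __name__ == "__main__":
    # one seed node (first entry) plus two sampled neighbors
    batch = {"x": [1.0, 2.0, 3.0], "y": [1.0, 0.0, 0.0], "batch_size": 1}
    identity = lambda xs: xs                  # stand-in for a GNN forward pass
    print(train_step(identity, batch))        # loss on the single seed node: 0.0
```

In the real example, the first `batch_size` nodes of a sampled subgraph are the seed nodes by convention, which is why slicing the output like this selects exactly the nodes whose labels enter the loss.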
Yes, because the model aggregates node features from neighbor nodes to predict the seed nodes' labels. However, the neighbor nodes' labels don't contribute to updating the model's parameters, since they are not used to compute the loss. This doc page may be a good reference: https://pytorch-geometric.readthedocs.io/en/latest/tutorial/neighbor_loader.html
Hey. While the loss is computed on the seed nodes only, all nodes in the graph that contribute to the final representations of the seed nodes will receive a gradient and are therefore used to update the model parameters.
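This point can be checked with a toy one-parameter "GNN" and analytic gradients: the seed node's prediction is `w * mean(h_seed, h_neighbor)`, and the squared-error loss is computed on the seed node only. Even so, the gradient of the loss with respect to `w` depends on the neighbor's feature, so sampled neighbors do influence the weight update. This is a hypothetical minimal model for illustration, not PyG's actual `SAGEConv`.

```python
def seed_prediction(w, h_seed, h_neighbor):
    # mean aggregation over the seed node and its single sampled neighbor
    return w * (h_seed + h_neighbor) / 2.0

def loss(w, h_seed, h_neighbor, y_seed):
    # squared-error loss computed on the seed node only
    return (seed_prediction(w, h_seed, h_neighbor) - y_seed) ** 2

def grad_w(w, h_seed, h_neighbor, y_seed):
    # analytic dL/dw = 2 * (pred - y_seed) * mean(h_seed, h_neighbor)
    agg = (h_seed + h_neighbor) / 2.0
    return 2.0 * (w * agg - y_seed) * agg

if __name__ == "__main__":
    w, h_seed, y_seed = 1.0, 2.0, 1.0
    # changing only the neighbor's feature changes the gradient on w:
    g_a = grad_w(w, h_seed, h_neighbor=0.0, y_seed=y_seed)
    g_b = grad_w(w, h_seed, h_neighbor=4.0, y_seed=y_seed)
    print(g_a, g_b)  # 0.0 vs 12.0: neighbors affect the parameter update
```

The neighbor's *label* never appears anywhere, matching the earlier reply, yet its *feature* shows up in `grad_w`, matching the correction here.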