I think your code looks reasonable, but I don't think it will be possible to train a large language model end-to-end in combination with a GNN for around 20k nodes. Alternatives include:

  1. Using a more lightweight language embedding model, e.g., bag-of-words or averaging Word2Vec embeddings
  2. Pre-processing the intermediate and final outputs of BERT and using them as input features (see the sketch after this list)
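
A minimal sketch of option 2, assuming the Hugging Face `transformers` library and a hypothetical `node_texts` list holding one string per node; `bert-base-uncased` is just an example checkpoint. The idea is to run BERT once, offline and frozen, so the GNN never backpropagates through it:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert = AutoModel.from_pretrained('bert-base-uncased').eval()

@torch.no_grad()
def encode_texts(texts, batch_size=32):
    """Pre-compute a frozen BERT embedding ([CLS] token) for each node text."""
    embeddings = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], padding=True,
                          truncation=True, return_tensors='pt')
        # Take the [CLS] vector of the final layer: shape [batch, 768]
        embeddings.append(bert(**batch).last_hidden_state[:, 0])
    return torch.cat(embeddings, dim=0)

# `node_texts` is a hypothetical list of one string per node:
# data.x = encode_texts(node_texts)  # shape [num_nodes, 768]
```

Since the embeddings are computed once and stored as `data.x`, only the GNN parameters are trained afterwards, which is tractable for a graph of around 20k nodes.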
