As the title says, when I used DataLoader to merge hypergraph Data objects into a mini-batch, I found that the second row of hyperedge_index (the hyperedge indices) was shifted by the number of nodes instead of the number of hyperedges. As a result, the indices in the mini-batch were wrong. Here is an example where the problem occurs:

from torch_geometric.data import Data
from torch_geometric.loader import DataLoader
import torch
x_1 = torch.tensor([
[0],[1],[2],[3]
])
hyperedge_index_1 = torch.tensor([
[0, 1, 2, 1, 2, 3],
[0, 0, 0, 1, 1, 1],
])
x_2 = torch.tensor([
[4],[5],[6],[7]
])
hyperedge_index_2 = torch.tensor([
[0, 1, 2, 1, 2, 3],
[0, 0, 0, 1, 1, 1],
])
data_list = []
data_list.append(Data(x=x_1, hyperedge_index=hyperedge_index_1.long()))
data_list.append(Data(x=x_2, hyperedge_index=hyperedge_index_2.long()))
loader = DataLoader(data_list, batch_size=len(data_list))
batch_data = next(iter(loader))
print(batch_data)
print(batch_data.x)
print(batch_data.hyperedge_index)

The output is:

DataBatch(x=[8, 1], hyperedge_index=[2, 12], batch=[8], ptr=[3])
tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6],
        [7]])
tensor([[0, 1, 2, 1, 2, 3, 4, 5, 6, 5, 6, 7],
        [0, 0, 0, 1, 1, 1, 4, 4, 4, 5, 5, 5]])

But I think the correct output should be:

DataBatch(x=[8, 1], hyperedge_index=[2, 12], batch=[8], ptr=[3])
tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6],
        [7]])
tensor([[0, 1, 2, 1, 2, 3, 4, 5, 6, 5, 6, 7],
        [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]])

Did I use DataLoader in the wrong way, or should I use another data loader for HypergraphConv?
Replies: 1 comment 1 reply
You need to override the __inc__ method of your Data object. When batching multiple graphs, hyperedge_index[0] (the node indices) needs to be incremented by num_nodes, while hyperedge_index[1] (the hyperedge indices) needs to be incremented by the number of hyperedges. This example from the docs illustrates how that's done for bipartite graphs, and you need to do something similar for hypergraphs: https://pytorch-geometric.readthedocs.io/en/latest/notes/batching.html#bipartite-graphs