Hi, I've been working with PNAConv lately, but I'm having some difficulty understanding how towers work in PNAConv. I've read the PNA paper and also the "Neural Message Passing for Quantum Chemistry" paper, where the tower concept comes from, but at this point I'm still unsure how it actually works. The only thing I know is that it helps with faster training, but I don't really know the background of it. I'm asking because I need to reduce the number of towers so that the model doesn't use so much memory on my GPU. I've been dealing with disjunctive graphs of the Job Shop Scheduling Problem of size bigger than 6x6, which have 38 nodes and 222 edges. Besides the number of towers, I also needed to change parameters such as the number of layers and the hidden dimension of PNAConv; otherwise I always ran out of GPU memory. The biggest question is: does reducing the number of towers badly affect the performance of PNAConv? If yes, what parameters should I change first before reducing the number of towers?
Replies: 1 comment 5 replies
I'm curious to understand why you have GPU memory problems when operating on a graph with around 40 nodes :)
The `towers` argument is similar to the `groups` argument in `torch.nn.Conv2d`, where you subdivide your number of features into groups, and each group is solely transformed based on the features inside the same group. This reduces the number of parameters from `in_channels * out_channels` to `num_groups * (in_channels / num_groups) * (out_channels / num_groups)` and, as a result, might prevent overfitting. I personally don't think that model performance is highly sensitive to the `towers` size.
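If it helps, here is a minimal sketch (not from the original reply; the channel sizes and the degree histogram are made-up placeholders, and it assumes `torch` and `torch_geometric` are installed) that just counts parameters, to make the `groups`/`towers` analogy and the parameter reduction concrete:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import PNAConv

# Grouped Conv2d: weights drop from in * out to in * out / groups.
for groups in [1, 2, 4]:
    conv = nn.Conv2d(64, 64, kernel_size=1, groups=groups, bias=False)
    print(f'Conv2d groups={groups}: {conv.weight.numel()} weights')

# Placeholder in-degree histogram; in practice, compute it from your
# training graphs.
deg = torch.tensor([0, 10, 20, 10, 5])

# PNAConv towers: out_channels must be divisible by `towers`
# (and in_channels as well if divide_input=True).
for towers in [1, 2, 4]:
    conv = PNAConv(
        in_channels=64,
        out_channels=64,
        aggregators=['mean', 'min', 'max', 'std'],
        scalers=['identity', 'amplification', 'attenuation'],
        deg=deg,
        towers=towers,
    )
    n_params = sum(p.numel() for p in conv.parameters())
    print(f'PNAConv towers={towers}: {n_params} parameters')
```

So before touching `towers`, the hidden dimension and the number of layers are the usual first levers for memory, since they dominate both the parameter count and the size of the intermediate activations.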