GAT: Softmax should be on one edge type or all edge types? #8364
edward94587
started this conversation in
General
Replies: 1 comment
-
Eq. 2 of the paper suggests that the softmax is applied over all edge types. However, inspecting the implementation shows that the softmax is applied to only a single edge type at a time.
-
This is for a heterogeneous node classification problem. According to eq. 2 of the paper by Veličković et al., "Graph Attention Networks", attention coefficients are obtained by applying a softmax over all of node i's neighbors.
https://arxiv.org/pdf/1710.10903.pdf
Because the paper does not restrict eq. 2's softmax normalization to a single edge type, it is logical to infer that it covers all edge types. This makes sense, since the softmax normalizes the attention scores before they are used as weights to update node i's representation in eq. 4.
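As a minimal sketch of eq. 2's normalization (the raw scores and neighbor assignments are hypothetical, chosen only to illustrate a single softmax over all of node i's neighbors regardless of edge type):

```python
import math

def softmax(scores):
    """Normalize a list of raw attention scores, as in eq. 2 of the GAT paper."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores e_ij for node i's neighbors across BOTH edge types:
# neighbors j1 and j2 reach i via edge type e1; neighbor j3 via edge type e2.
scores_all = [0.5, 1.0, 2.0]
alpha_all = softmax(scores_all)           # ONE normalization over all neighbors
print(alpha_all)                          # coefficients sum to 1 across all of i's edges
```

Under this reading, every attention coefficient entering node i competes in a single softmax, so the coefficients are directly comparable across edge types.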
However, the code in gat_conv.py normalizes attention coefficients for each edge type individually, not across all edge types at once. This is implemented in line 274 of gat_conv.py:
# https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/nn/conv/gat_conv.py
where the input parameter "edge_index" in line 274 is the edge list for only one edge type.
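A small sketch of this per-edge-type normalization (the scores are illustrative; the helper mimics a softmax restricted to the edges passed in one call, not PyG's actual implementation):

```python
import math

def softmax(scores):
    """Normalize a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Because "edge_index" covers one relation at a time, each call normalizes
# only that relation's scores (illustrative values for node i):
scores_e1 = [0.5, 1.0]          # neighbors of i via edge type e1
scores_e2 = [2.0]               # the single neighbor of i via edge type e2

alpha_e1 = softmax(scores_e1)   # sums to 1 within e1
alpha_e2 = softmax(scores_e2)   # sums to 1 within e2: a lone edge gets alpha = 1.0

print(sum(alpha_e1) + sum(alpha_e2))   # total mass over i's incoming edges is 2, not 1
```

Note in particular that any relation with a single incoming edge contributes a coefficient of exactly 1.0, regardless of its raw score.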
What makes matters worse is that line 285 adds a bias for each edge type, as explained below.
Take, for example, a node i with 2 edge types (e1 and e2) sinking into it.
For edge type e1, line 274 (and line 285) above is executed because GATConv.forward() is called from HeteroConv.forward() in line 158 of file hetero_conv.py. Here, the "out" of line 277 above is node i's updated representation for edge type e1, BUT it uses attention coefficients 'alpha' from line 274, which were normalized using edge type e1 only.
For edge type e2, the same execution takes place, and another "out" tensor from line 277 is computed for node i's updated representation for edge type e2, again using attention coefficients normalized over edge type e2 only. A bias is also added, as in the case of e1.
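A minimal numeric sketch of these two per-relation passes (scalar features and hand-picked coefficients are illustrative, not PyG's actual tensors or variable names):

```python
x_j1, x_j2 = 1.0, 3.0   # neighbor features reaching i via edge type e1 (scalars for brevity)
x_j3 = 5.0              # neighbor feature reaching i via edge type e2
bias = 0.1              # the same bias added in each GATConv call

# Pass for e1: 'alpha' is normalized over e1's edges only, then bias is added.
alpha_e1 = [0.4, 0.6]                                    # sums to 1 within e1
out_e1 = alpha_e1[0] * x_j1 + alpha_e1[1] * x_j2 + bias

# Pass for e2: a lone edge gets alpha = 1.0 after its own softmax,
# and the bias is added a second time.
out_e2 = 1.0 * x_j3 + bias

print(out_e1, out_e2)   # two separately normalized outputs, each carrying a bias term
```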
Line 166 is where these 2 'out' tensors are accumulated. This raises the following issues:
A node representation obtained by accumulating 2 'out' tensors, each generated with its own separately normalized weights (the 2-edge-type case), is on a very different scale from a representation generated with a single set of normalized weights (the 1-edge-type case). How do you compare these two?
A node with more edge types sinking into it undergoes more parallel shifts of its representation vector, because the same bias is added once per edge type.
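Both issues can be made concrete with a small sketch comparing two nodes (all coefficients are illustrative; the bias shift is modeled as a scalar for brevity):

```python
bias = 0.1

# Node A: one incoming edge type with two edges. Its coefficients sum to 1,
# and the bias is added once (one GATConv pass).
alpha_A = [0.3, 0.7]
weight_mass_A = sum(alpha_A)
bias_shift_A = 1 * bias

# Node B: two incoming edge types. Each relation's coefficients sum to 1 on
# their own, so the total mass is 2, and the bias is added twice (two passes).
alpha_B_e1 = [0.5, 0.5]
alpha_B_e2 = [1.0]
weight_mass_B = sum(alpha_B_e1) + sum(alpha_B_e2)
bias_shift_B = 2 * bias

print(weight_mass_A, weight_mass_B)   # different total attention mass per node
print(bias_shift_A, bias_shift_B)     # bias shift grows with the number of edge types
```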