@@ -275,16 +275,16 @@ In case `ein > 0` is given, edge features of dimension `ein` will be expected in
and the attention coefficients will be calculated as
```math
\alpha_{ij} = \frac{1}{z_i} \exp(LeakyReLU(\mathbf{a}^T [W_e \mathbf{e}_{j\to i}; W \mathbf{x}_i; W \mathbf{x}_j]))
- ````
+ ```
# Arguments
- `in`: The dimension of input node features.
- `ein`: The dimension of input edge features. Default 0 (i.e. no edge features passed in the forward).
- `out`: The dimension of output node features.
- `σ`: Activation function. Default `identity`.
- - `bias`: Learn the additive bias if true. Dafault `true`.
- - `heads`: Number attention heads. Dafault `1.
+ - `bias`: Learn the additive bias if true. Default `true`.
+ - `heads`: Number attention heads. Default `1`.
- `concat`: Concatenate layer output or not. If not, layer output is averaged over the heads. Default `true`.
- `negative_slope`: The parameter of LeakyReLU. Default `0.2`.
- `add_self_loops`: Add self loops to the graph before performing the convolution. Default `true`.
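As a quick illustration of the edge-feature path described in this docstring, here is a minimal usage sketch. The `(in, ein) => out` constructor form and the need to disable self loops when edge features are used are assumptions drawn from the surrounding docstring and layer code, not something introduced by this hunk.

```julia
using Flux, GraphNeuralNetworks

g = rand_graph(5, 6)        # toy graph: 5 nodes, 6 edges
x = rand(Float32, 3, 5)     # node features, in = 3
e = rand(Float32, 2, 6)     # edge features, ein = 2

# Edge-feature variant: (in, ein) => out; self loops are disabled here because
# the added loops would carry no edge features (assumed constraint).
l = GATConv((3, 2) => 4, relu; heads = 2, add_self_loops = false)

y = l(g, x, e)              # (out * heads) × num_nodes = 8 × 5 when concat = true
```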
@@ -388,14 +388,14 @@ Implements the operation
```
where the attention coefficients ``\alpha_{ij}`` are given by
```math
- \alpha_{ij} = \frac{1}{z_i} \exp(\mathbf{a}^T LeakyReLU([W_2 \mathbf{x}_i; W_1 \mathbf{x}_j]))
+ \alpha_{ij} = \frac{1}{z_i} \exp(\mathbf{a}^T LeakyReLU(W_2 \mathbf{x}_i + W_1 \mathbf{x}_j))
```
with ``z_i`` a normalization factor.
In case `ein > 0` is given, edge features of dimension `ein` will be expected in the forward pass
and the attention coefficients will be calculated as
```math
- \alpha_{ij} = \frac{1}{z_i} \exp(\mathbf{a}^T LeakyReLU([W_3 \mathbf{e}_{j\to i}; W_2 \mathbf{x}_i; W_1 \mathbf{x}_j])).
+ \alpha_{ij} = \frac{1}{z_i} \exp(\mathbf{a}^T LeakyReLU(W_3 \mathbf{e}_{j\to i} + W_2 \mathbf{x}_i + W_1 \mathbf{x}_j)).
```
# Arguments
@@ -404,8 +404,8 @@ and the attention coefficients will be calculated as
- `ein`: The dimension of input edge features. Default 0 (i.e. no edge features passed in the forward).
- `out`: The dimension of output node features.
- `σ`: Activation function. Default `identity`.
- - `bias`: Learn the additive bias if true. Dafault `true`.
- - `heads`: Number attention heads. Dafault `1.
+ - `bias`: Learn the additive bias if true. Default `true`.
+ - `heads`: Number attention heads. Default `1`.
- `concat`: Concatenate layer output or not. If not, layer output is averaged over the heads. Default `true`.
- `negative_slope`: The parameter of LeakyReLU. Default `0.2`.
- `add_self_loops`: Add self loops to the graph before performing the convolution. Default `true`.
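To make the corrected GATv2 formula in the hunk above concrete, here is a small plain-Julia sketch of a single attention logit computed with the additive form. The names `W1`, `W2`, `a` and the feature sizes are illustrative only and are not the layer's internal field names.

```julia
using LinearAlgebra: dot

# LeakyReLU with the layer's default negative slope of 0.2.
leakyrelu(x, slope = 0.2f0) = max.(x, slope .* x)

din, dout = 3, 4
W1 = randn(Float32, dout, din)   # applied to the neighbor features x_j
W2 = randn(Float32, dout, din)   # applied to the destination features x_i
a  = randn(Float32, dout)

x_i = randn(Float32, din)
x_j = randn(Float32, din)

# GATv2 ("dynamic" attention): the nonlinearity sits inside the product with a.
logit = dot(a, leakyrelu(W2 * x_i + W1 * x_j))

# α_ij is then exp(logit) / z_i, where z_i normalizes over the incoming edges
# of node i (a softmax over in-neighbors), as stated in the docstring.
```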
@@ -477,7 +477,7 @@ function (l::GATv2Conv)(g::GNNGraph, x::AbstractMatrix, e::Union{Nothing, Abstra
    function message(Wix, Wjx, e)
-       Wx = Wix + Wjx
+       Wx = Wix + Wjx # Note: this is equivalent to W * vcat(x_i, x_j) as in "How Attentive are Graph Attention Networks?"
        if e !== nothing
            Wx += reshape(l.dense_e(e), out, heads, :)
        end
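The equivalence claimed in the comment added above can be checked directly: summing the two projected features equals one projection of the concatenated features with `W = [W2 W1]`. A short sketch, with illustrative names only:

```julia
din, dout = 3, 4
W1 = randn(Float32, dout, din)
W2 = randn(Float32, dout, din)
x_i = randn(Float32, din)
x_j = randn(Float32, din)

W = hcat(W2, W1)              # dout × 2din, i.e. W = [W2 W1]
lhs = W2 * x_i + W1 * x_j     # the Wix + Wjx form used in `message`
rhs = W * vcat(x_i, x_j)      # the W * [x_i; x_j] form from the GATv2 paper

@assert isapprox(lhs, rhs)    # holds up to floating-point rounding
```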