1 parent a6a6630 · commit 1762f0d
bayesflow/attention.py
@@ -34,7 +34,7 @@ class MultiHeadAttentionBlock(tf.keras.Model):
    def __init__(self, input_dim, attention_settings, num_dense_fc, dense_settings, use_layer_norm, **kwargs):
        """Creates a multihead attention block which will typically be used as part of a
-        set transformer architecture according to [1].
+        set transformer architecture according to [1]. Corresponds to standard cross-attention.

        Parameters
        ----------
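For context, the clarified docstring says the block corresponds to standard cross-attention within a set transformer. Below is a minimal sketch of what such a block typically looks like in Keras: cross-attention between a query set x and a key/value set y, residual connections, optional layer normalization, and a small feed-forward stack. This is not BayesFlow's actual implementation; the constructor arguments mirror the signature shown in the diff, but the class name and the internal wiring are illustrative assumptions.

# Sketch only: an assumed layout of a Set-Transformer-style multihead attention
# block (cross-attention), not the code from bayesflow/attention.py.
import tensorflow as tf


class CrossAttentionBlockSketch(tf.keras.Model):
    def __init__(self, input_dim, attention_settings, num_dense_fc, dense_settings, use_layer_norm, **kwargs):
        super().__init__(**kwargs)
        # Standard multihead cross-attention: queries come from x, keys/values from y.
        # attention_settings is assumed to hold kwargs such as num_heads and key_dim.
        self.att = tf.keras.layers.MultiHeadAttention(**attention_settings)
        self.ln_att = tf.keras.layers.LayerNormalization() if use_layer_norm else None
        # Position-wise feed-forward stack applied after attention, projecting back to input_dim.
        # dense_settings is assumed to hold kwargs such as units and activation.
        self.fc = tf.keras.Sequential(
            [tf.keras.layers.Dense(**dense_settings) for _ in range(num_dense_fc)]
        )
        self.fc.add(tf.keras.layers.Dense(input_dim))
        self.ln_fc = tf.keras.layers.LayerNormalization() if use_layer_norm else None

    def call(self, x, y):
        # Residual cross-attention: h = LN(x + Attention(query=x, value=y, key=y))
        h = x + self.att(x, y, y)
        if self.ln_att is not None:
            h = self.ln_att(h)
        # Residual feed-forward: out = LN(h + FFN(h))
        out = h + self.fc(h)
        if self.ln_fc is not None:
            out = self.ln_fc(out)
        return out

When x and y are the same set, this reduces to self-attention; using two different sets is the "standard cross-attention" the new docstring sentence refers to.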