Commit 16b9fe1

typos
1 parent 739c14f · commit 16b9fe1

1 file changed: +2 −2 lines changed


src/attention.jl

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ const AA4{T} = AbstractArray{T,4}
 const AA{N,T} = AbstractArray{T,N}
 
 """
-    dot_product_attention(query, key, value [bias]; fdrop, mask, nheads])
+    dot_product_attention(query, key, value, [bias]; [fdrop, mask, nheads])
 
 Multihead dot product attention used in transformer architectures.
 
@@ -24,7 +24,7 @@ See also [`dot_product_attention_scores`](@ref) if you only need the attention scores.
   It will be added to the attention scores before applying the softmax. Default `nothing`.
 - `fdrop`: A dropout function or layer to apply on the attention scores. Default `identity` (no dropout).
 - `mask`: Either `nothing` or a boolean array broadcastable to size `(kv_len, q_len, nheads, batch_size)`.
-  The mask be applied to the attention scores before applying the softmax.
+  The mask is applied to the attention scores before the softmax.
   Can also be set to `mask=:causal` to apply a causal mask. Default `nothing`.
 - `nheads`: Number of heads to split the input arrays into. Default `1`.
 
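
For context, a minimal usage sketch of the corrected signature. It assumes this docstring belongs to NNlib.jl's `dot_product_attention`; the array sizes, the two-value return, and the concrete keyword values below are illustrative assumptions, not part of this diff.

```julia
# Sketch only: hypothetical sizes, assumed (embed_dim, seq_len, batch) layout.
using NNlib

embed_dim, len, batch = 8, 6, 2              # made-up dimensions for illustration
q = rand(Float32, embed_dim, len, batch)     # queries
k = rand(Float32, embed_dim, len, batch)     # keys
v = rand(Float32, embed_dim, len, batch)     # values

# Two-head attention with a causal mask and no dropout, using the keywords
# documented above (`nheads`, `mask`, `fdrop`).
y, α = dot_product_attention(q, k, v; nheads=2, mask=:causal, fdrop=identity)

size(y)  # (8, 6, 2): output, same layout as the inputs
size(α)  # (6, 6, 2, 2): attention scores, matching the mask shape (kv_len, q_len, nheads, batch_size)
```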
