
Commit 8ddfd28

doc: Super tiny fix doc math (#1747)
1 parent 7bac369 commit 8ddfd28

File tree

1 file changed: +1 −1 lines changed


docs/tutorials/recursive_attention.rst

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ We can also generalize the value on index :math:`i` to index set :math:`I`:
     \mathbf{v}(I) = \sum_{i\in I}\textrm{softmax}(s_i) \mathbf{v}_i = \frac{\sum_{i\in I}\exp\left(s_i\right)\mathbf{v}_i}{\exp(s(I))}
 
 The :math:`softmax` function is restricted to the index set :math:`I`. Note that :math:`\mathbf{v}(\{1,2,\cdots, n\})` is the self-attention output of the entire sequence.
-The *attention state* of the index set :math:`i` can be defined as a tuple :math:`(s(I), \mathbf{v}(I))`, then we can define a binary **merge** operator :math:`\oplus` of two attention states as ((in practice we will minus $s$ with maximum value to guarantee numerical stability and here we omit them for simplicity):
+The *attention state* of the index set :math:`I` can be defined as a tuple :math:`(s(I), \mathbf{v}(I))`, then we can define a binary **merge** operator :math:`\oplus` of two attention states as ((in practice we will minus :math:`s` with maximum value to guarantee numerical stability and here we omit them for simplicity):
 
 .. math::
 
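For context on the sentence touched by this commit, the merge operator :math:`\oplus` it describes can be sketched in a few lines of code. The following is a minimal illustration only, assuming the merge combines two attention states :math:`(s(I), \mathbf{v}(I))` and :math:`(s(J), \mathbf{v}(J))` by weighting each value with :math:`\exp(s)` and renormalizing, subtracting the larger score first for numerical stability as the changed sentence mentions; the function name merge_state and the use of PyTorch tensors are illustrative assumptions here, not part of the tutorial or the library API.

    import torch

    def merge_state(s_a: torch.Tensor, v_a: torch.Tensor,
                    s_b: torch.Tensor, v_b: torch.Tensor):
        # Hypothetical sketch of the binary merge operator on two attention
        # states (s, v); not the library's actual API.
        # Subtract the larger of the two scores before exponentiating,
        # as the tutorial's sentence notes, to keep exp() numerically stable.
        s_max = torch.maximum(s_a, s_b)
        w_a = torch.exp(s_a - s_max)  # exp(s(I) - s_max)
        w_b = torch.exp(s_b - s_max)  # exp(s(J) - s_max)
        # Re-weight and renormalize the two partial outputs.
        v = (w_a.unsqueeze(-1) * v_a + w_b.unsqueeze(-1) * v_b) / (w_a + w_b).unsqueeze(-1)
        # The merged score is the log-sum-exp of the two scores.
        s = s_max + torch.log(w_a + w_b)
        return s, v

Because a merge of this form is associative and commutative, attention over a long sequence can be computed on disjoint index subsets in any order and the partial states combined afterwards, which is the idea behind the tutorial's recursive formulation.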