
Commit e2979d6

post : Efficient Attention
proof
1 parent 0080e19 commit e2979d6

1 file changed: +3 -2 lines changed

_posts/DeepLearning/Kernel Fusion/2025-03-07-fused.md

Lines changed: 3 additions & 2 deletions
@@ -120,9 +120,10 @@ $$
 If $$d_i $$ , $$m_i $$ are computed in advance and $$d_j $$ , $$m_j $$ are then computed for the next input, the update follows by substituting them into the formula above.
 The proof for the max is omitted.

-1. $$ Assume \quad d_{j} = \sum_{k=0}^{j} e^{\,x_k - m_i} \quad d_{i} = \sum_{k=j+1}^{j+i} e^{\,x_k - m_j} $$
+1. $$ Assume \quad d_{j} = \sum_{k=0}^{j} e^{\,x_k - m_i} \quad d_{i} = \sum_{k=i+1}^{j+i} e^{\,x_k - m_j} \quad m_{i+j} = \max(m_i, m_j) $$
 $$d_i \oplus d_{j} = d_i \, e^{\,m_i - \max(m_i, m_j)} \;+\; d_j \, e^{\,m_j - \max(m_i, m_j)} $$
-$$= \left(\sum_{j=1}^{S-1} e^{\,x_j - m_{S-1}}\right) e^{\,m_{S-1} - m_S} + e^{\,x_S - m_S} $$
+$$= \left(\sum_{k=0}^{i} e^{\,x_k - m_{i}}\right) e^{\,m_{i} - \max(m_i, m_j)} + \left(\sum_{k=i+1}^{i+j} e^{\,x_k - m_{i}}\right) e^{\,m_{j} - \max(m_i, m_j)} $$
+$$= \left(\sum_{k=0}^{i} e^{\,x_k - m_{i}}\right) e^{\,\max(m_i, m_j)} + \left(\sum_{k=i+1}^{i+j} e^{\,x_k - m_{i}}\right) e^{\,\max(m_i, m_j)} $$
 $$ = \sum_{j=1}^{S} e^{\,x_j - m_S} $$
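
The merge rule in this hunk can be checked numerically. The sketch below is a minimal Python illustration, not code from the post: the names `block_stats` and `merge` are hypothetical, and it assumes each block is summarized by its own max $$m$$ and its own normalizer $$d = \sum_k e^{\,x_k - m}$$ taken over that block only. Under that assumption, rescaling each partial sum by $$e^{\,m - \max(m_i, m_j)}$$ should give $$\sum_k e^{\,x_k - \max(m_i, m_j)}$$ over both blocks, which is what the last line of the derivation states.

```python
import math


def block_stats(xs):
    """Return (m, d) for one block: m = max(xs), d = sum(exp(x - m))."""
    m = max(xs)
    d = sum(math.exp(x - m) for x in xs)
    return m, d


def merge(stats_a, stats_b):
    """Combine two (m, d) pairs using the rescaling rule d_a (+) d_b."""
    m_a, d_a = stats_a
    m_b, d_b = stats_b
    m = max(m_a, m_b)
    # Each partial sum is re-expressed relative to the shared max.
    d = d_a * math.exp(m_a - m) + d_b * math.exp(m_b - m)
    return m, d


# Hypothetical inputs, chosen only for illustration.
first = [0.5, 2.0, -1.0]
second = [3.0, 0.0]

merged = merge(block_stats(first), block_stats(second))
direct = block_stats(first + second)

print("merged:", merged)
print("direct:", direct)
```

If the rule holds, the two printed pairs agree up to floating-point rounding. Because the merge only needs `(m, d)` from each side, blocks can be processed in any order and combined afterwards.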