Commit 1a44daa

Update public notes
1 parent d2fbddb commit 1a44daa

1 file changed

+4
-8
lines changed

content/为什么残差网络(Residual Block)看似简单却有效?.md

Lines changed: 4 additions & 8 deletions
@@ -6,15 +6,11 @@
 
 Based on the reference material, we can explain from the formulas why residual networks work.
 
-$$\displaylines{
+$$
 \frac{\partial L}{\partial X_{Aout}} = \frac{\partial L}{\partial X_{Din}}\frac{\partial X_{Din}}{\partial X_{Aout}}
-\\
-\\
-\text{where } X_{Din} = X_{Aout}+C(B(X_{Aout}))
-\\
-\\
-\text{so } \frac{\partial L}{\partial X_{Aout}} = \frac{\partial L}{\partial X_{Din}}\left( 1+\frac{\partial X_{Din}}{\partial X_{C}} \frac{\partial X_{C}}{\partial X_{B}} \frac{\partial X_{B}}{\partial X_{Aout}} \right)
-}$$
+$$
+$$\text{where } X_{Din} = X_{Aout}+C(B(X_{Aout}))$$
+$$\text{so } \frac{\partial L}{\partial X_{Aout}} = \frac{\partial L}{\partial X_{Din}}\left( 1+\frac{\partial X_{Din}}{\partial X_{C}} \frac{\partial X_{C}}{\partial X_{B}} \frac{\partial X_{B}}{\partial X_{Aout}} \right)$$
 
 Thus, even if the gradient along the C-B-A path decays during backpropagation, the gradient at D is still passed directly to A, which reinforces the decayed
 gradient. This can effectively improve network performance, but because the structure is still a chain of multiplications, it is still affected by gradient decay.
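
The identity in the hunk above can be checked numerically. The following is a minimal sketch, not code from the notes: `B` and `C` are hypothetical scalar `tanh` layers with small, arbitrarily chosen weights, and the derivative is approximated by central finite differences. It compares the local gradient of `X_Din` with respect to `X_Aout` with and without the skip connection.

```python
import math

# Hypothetical scalar stand-ins for layers B and C (assumed for illustration only).
# Small weights make the gradient through the C-B branch tiny.
W_B, W_C = 0.05, 0.05


def B(x):
    return math.tanh(W_B * x)


def C(x):
    return math.tanh(W_C * x)


def plain_path(x_aout):
    """X_Din without the skip connection: C(B(X_Aout))."""
    return C(B(x_aout))


def residual_path(x_aout):
    """X_Din with the skip connection: X_Aout + C(B(X_Aout))."""
    return x_aout + C(B(x_aout))


def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


x = 0.7  # arbitrary evaluation point
branch_grad = numeric_grad(plain_path, x)       # dX_C/dX_B * dX_B/dX_Aout
residual_grad = numeric_grad(residual_path, x)  # 1 + dX_C/dX_B * dX_B/dX_Aout

print(f"branch only: {branch_grad:.6f}")
print(f"with skip  : {residual_grad:.6f}")
```

Under these assumptions the branch-only gradient is close to zero, while the residual gradient stays close to one because of the `1 +` term. Any upstream gradient `∂L/∂X_Din` is still multiplied by this factor at every block, which is why the chain-of-multiplications caveat still applies.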
