*(Figure: forward-mode computation of the intermediate derivatives $\dot{v}_i$ from the evaluation trace.)*
:label:`ch04/ch04-forward-mode-compute-function`
Figure :numref:`ch04/ch04-forward-mode-compute-function` elucidates the computation process within the forward mode. The sequence of elementary operations, derived from the source program, is displayed on the left. Following the chain rule and using established derivative evaluation rules, we sequentially compute each intermediate variable ${\dot{v}_i}=\frac{\partial v_i}{\partial x_1}$ from top to bottom, as depicted on the right. Consequently, this leads to the computation of the final variable ${\dot{v}_5}=\frac{\partial y}{\partial x_1}$.

In the process of derivative evaluation of a function, we obtain a set of partial derivatives of any output with respect to any input of this function. For a function $f:{\mathbf{R}^n}\to \mathbf{R}^m$, where $n$ is the number of independent input variables $x_i$ and $m$ is the number of independent output variables $y_i$, the derivative results correspond to the following Jacobian matrix:

$$\mathbf{J}_{f}= \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}$$

Each forward pass of function $f$ yields the partial derivatives of all outputs with respect to a single input, represented by the vector below. This corresponds to one column of the Jacobian matrix. Therefore, executing $n$ forward passes gives us the full Jacobian matrix.

$$\begin{bmatrix} \frac{\partial y_1}{\partial x_i} \\ \vdots \\ \frac{\partial y_m}{\partial x_i} \end{bmatrix}$$

The forward mode allows us to compute Jacobian-vector products by initializing $\dot{\mathbf{x}}=\mathbf{r}$ to generate the results for a single column. Because the derivative evaluation rules for elementary operations are pre-determined, we know the Jacobian matrix of every elementary operation. Consequently, by leveraging the chain rule to evaluate the derivatives of $f$ propagated from inputs to outputs, we obtain one column of the Jacobian matrix of the entire network.

$$\mathbf{J}_{f}\mathbf{r}= \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix} \begin{bmatrix} r_1 \\ \vdots \\ r_n \end{bmatrix}$$
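As a hedged illustration (not part of the original chapter), the sketch below uses JAX's `jax.jvp` to compute one Jacobian column of a small made-up function $f:{\mathbf{R}^2}\to \mathbf{R}^3$ by seeding the tangent with a one-hot vector $\mathbf{r}$:

```python
import jax
import jax.numpy as jnp

# Toy function f: R^2 -> R^3, chosen only for illustration.
def f(x):
    return jnp.array([x[0] * x[1], jnp.sin(x[0]), x[0] + x[1] ** 2])

x = jnp.array([1.5, -0.5])

# One forward pass with tangent r = e_1 propagates dot-values through the
# trace and yields the first Jacobian column: d y_j / d x_1 for all j.
r = jnp.array([1.0, 0.0])
y, jac_col = jax.jvp(f, (x,), (r,))

# n such forward passes (here n = 2) recover the full 3x2 Jacobian.
full_jac = jax.jacfwd(f)(x)
print(jac_col)         # first column of J_f
print(full_jac[:, 0])  # matches jac_col
```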
### Reverse Mode

Figure :numref:`ch04/ch04-backward-mode-compute` illustrates the automatic differentiation process in the reverse mode. The sequence of elementary operations, derived from the source program, is displayed on the left. Beginning from $\bar{v}_5=\bar{y}=\frac{\partial y}{\partial y}=1$, we sequentially compute each intermediate variable ${\bar{v}_i}=\frac{\partial y_j}{\partial v_i}$ from bottom to top, leveraging the chain rule and established derivative evaluation rules (as depicted on the right). Thus, we can compute the final variables ${\bar{x}_1}=\frac{\partial y}{\partial x_1}$ and ${\bar{x}_2}=\frac{\partial y}{\partial x_2}$.
*(Figure: reverse-mode computation of the adjoints $\bar{v}_i$ from the evaluation trace.)*
:label:`ch04/ch04-backward-mode-compute`
Every reverse pass of function $f$ produces the partial derivatives of a single output with respect to all inputs, represented by the vector below. This corresponds to a single row of the Jacobian matrix. Consequently, executing $m$ reverse passes gives us the full Jacobian matrix.

$$\begin{bmatrix} \frac{\partial y_j}{\partial x_1} & \cdots & \frac{\partial y_j}{\partial x_n} \end{bmatrix}$$

Similarly, we can compute vector-Jacobian products to obtain the results for a single row.

$$\mathbf{r}^{T}\mathbf{J}_{f}= \begin{bmatrix} r_1 & \cdots & r_m \end{bmatrix} \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}$$
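As a companion sketch (again illustrative, reusing the same hypothetical function as above), `jax.vjp` returns a function that applies $\mathbf{r}^{T}\mathbf{J}_{f}$, so a one-hot cotangent recovers a single Jacobian row:

```python
import jax
import jax.numpy as jnp

# Same toy function f: R^2 -> R^3 as in the forward-mode sketch.
def f(x):
    return jnp.array([x[0] * x[1], jnp.sin(x[0]), x[0] + x[1] ** 2])

x = jnp.array([1.5, -0.5])

# One reverse pass with cotangent r = e_2 propagates bar-values backwards
# and yields the second Jacobian row: d y_2 / d x_i for all i.
y, vjp_fn = jax.vjp(f, x)
r = jnp.array([0.0, 1.0, 0.0])
(jac_row,) = vjp_fn(r)

# m such reverse passes (here m = 3) recover the full 3x2 Jacobian.
full_jac = jax.jacrev(f)(x)
print(jac_row)         # second row of J_f
print(full_jac[1, :])  # matches jac_row
```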
The number of columns and rows in a Jacobian matrix directly determines the number of forward and reverse passes needed to compute it for a given function $f$. This characteristic is particularly significant when determining the most efficient mode of automatic differentiation.
When the function has significantly fewer inputs than outputs ($f:{\mathbf{R}^n}\to \mathbf{R}^m$, $n \ll m$), the forward mode proves to be more efficient. Conversely, when the function has considerably more inputs than outputs ($f:{\mathbf{R}^n}\to \mathbf{R}^m$, $n \gg m$), the reverse mode becomes advantageous.
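To make the trade-off concrete, the following sketch (shapes and functions chosen arbitrarily for illustration) contrasts `jax.jacfwd`, which costs one forward pass per input, with `jax.jacrev`, which costs one reverse pass per output:

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)

# "Tall" Jacobian: few inputs, many outputs (n << m) -- forward mode is cheaper,
# since only n = 3 forward passes build the 1000x3 Jacobian.
W_tall = jax.random.normal(key, (1000, 3))
f_tall = lambda x: jnp.tanh(W_tall @ x)        # R^3 -> R^1000
jac_tall = jax.jacfwd(f_tall)(jnp.ones(3))     # shape (1000, 3)

# "Wide" Jacobian: many inputs, few outputs (n >> m) -- reverse mode is cheaper,
# since only m = 3 reverse passes build the 3x1000 Jacobian.
W_wide = jax.random.normal(key, (3, 1000))
f_wide = lambda x: jnp.tanh(W_wide @ x)        # R^1000 -> R^3
jac_wide = jax.jacrev(f_wide)(jnp.ones(1000))  # shape (3, 1000)

print(jac_tall.shape, jac_wide.shape)
```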
For the extreme case where the function maps from $n$ inputs to a single output ($f:{\mathbf{R}^n}\to \mathbf{R}$), we can evaluate all the derivatives of the output with respect to the inputs in a single reverse pass.
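For instance (a hypothetical scalar loss, sketched only for illustration), `jax.grad` performs exactly one reverse pass and returns every input derivative at once, which is why the reverse mode underlies back-propagation of scalar training losses:

```python
import jax
import jax.numpy as jnp

# Hypothetical scalar-valued loss: maps R^1000 to R.
def loss(w):
    return jnp.sum(jnp.tanh(w) ** 2)

w = jnp.linspace(-1.0, 1.0, 1000)

# A single reverse pass yields all 1000 partial derivatives of the output.
grad_w = jax.grad(loss)(w)
print(grad_w.shape)  # (1000,)
```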