@@ -55,10 +55,8 @@ performed (depending on the backend hardware support) once the
5555compilation is complete. However, some optimization operations can only
5656be performed in their entirety during the deployment phase.
5757
58- ![Layered computer storage
59- architecture](../img/ch08/ch09-storage.png){#fig:ch-deploy/fusion-storage}
60-
61- ## Operator Fusion {#sec:ch-deploy/kernel-fusion}
58+ ![Layered computer storage architecture](../img/ch08/ch09-storage.png)
59+ :label:`ch-deploy/fusion-storage`
60+
61+ ## Operator Fusion {#sec:ch-deploy/kernel-fusion}
6260
6361Operator fusion involves combining multiple operators in a deep neural
6462network (DNN) model into a new operator based on certain rules, reducing
@@ -69,8 +67,7 @@ The two main performance benefits brought by operator fusion are as
6967follows: First, it maximizes the utilization of registers and caches.
7068And second, because it combines operators, the load/store time between
7169the CPU and memory is reduced. Figure
72- [1](#fig:ch-deploy/fusion-storage){reference-type="ref"
73- reference="fig:ch-deploy/fusion-storage"} shows the architecture of a
70+ :numref:`ch-deploy/fusion-storage` shows the architecture of a
7471computer's storage system. While the storage capacity increases from the
7572level-1 cache (L1) to hard disk, so too does the time for reading data.
7673After operator fusion is performed, the previous computation result can
@@ -80,57 +77,55 @@ operations on the memory. Furthermore, operator fusion allows some
8077computation to be completed in advance, eliminating redundant or even
8178cyclic redundant computing during forward computation.
8279
83- ![Convolution + Batchnorm operator
84- fusion](../img/ch08/ch09-conv-bn-fusion.png){#fig:ch-deploy/conv-bn-fusion}
80+ ![Convolution + Batchnorm operator fusion](../img/ch08/ch09-conv-bn-fusion.png)
81+ :label:`ch-deploy/conv-bn-fusion`
8582
8683To describe the principle of operator fusion, we will use two operators,
8784Convolution and Batchnorm, as shown in Figure
88- [2](#fig:ch-deploy/conv-bn-fusion){reference-type="ref"
89- reference="fig:ch-deploy/conv-bn-fusion"}. In the figure, the
85+ :numref:`ch-deploy/conv-bn-fusion`. In the figure, the
9086solid-colored boxes indicate operators, the resulting operators after
9187fusion is performed are represented by hatched boxes, and the weights or
9288constant tensors of operators are outlined in white. The fusion can be
9389understood as the simplification of an equation. The computation of
9490Convolution is expressed as Equation
95- [\[equ:ch-deploy/conv-equation\]](#equ:ch-deploy/conv-equation){reference-type="ref"
96- reference="equ:ch-deploy/conv-equation"}.
91+ :eqref:`ch-deploy/conv-equation`.
9792
98- $$\mathbf{Y_{\rm conv}}=\mathbf{W_{\rm conv}}\cdot\mathbf{X_{\rm conv}}+\mathbf{B_{\rm conv}}, \text{equ:ch-deploy/conv-equation}$$
93+ $$
94+ \bm{Y_{\rm conv}}=\bm{W_{\rm conv}}\cdot\bm{X_{\rm conv}}+\bm{B_{\rm conv}}$$
95+ :eqlabel:`ch-deploy/conv-equation`
9996
10097Here, we do not need to understand what each variable means. Instead, we
10198only need to keep in mind that Equation
102- [\[equ:ch-deploy/conv-equation\]](#equ:ch-deploy/conv-equation){reference-type="ref"
103- reference="equ:ch-deploy/conv-equation"} is an equation for
104- $\mathbf{Y_{\rm conv}}$ with respect to $\mathbf{X_{\rm conv}}$, and other
99+ :eqref:`ch-deploy/conv-equation` is an equation for
100+ $\bm{Y_{\rm conv}}$ with respect to $\bm{X_{\rm conv}}$, and other
105101symbols are constants.
106102
107103Equation
108- [\[equ:ch-deploy/bn-equation\]](#equ:ch-deploy/bn-equation){reference-type="ref"
109- reference="equ:ch-deploy/bn-equation"} is about the computation of
104+ :eqref:`ch-deploy/bn-equation` is about the computation of
110105Batchnorm:
111106
112- **equ:ch-deploy/bn-equation:** \
113- $$\mathbf{Y_{\rm bn}}=\gamma\frac{\mathbf{X_{\rm bn}}-\mu_{\mathcal{B}}}{\sqrt{{\sigma_{\mathcal{B}}}^{2}+\epsilon}}+\beta$$
107+ $$
108+ \bm{Y_{\rm bn}}=\gamma\frac{\bm{X_{\rm bn}}-\mu_{\mathcal{B}}}{\sqrt{{\sigma_{\mathcal{B}}}^{2}+\epsilon}}+\beta$$
109+ :eqlabel:`ch-deploy/bn-equation`
114110
115- Similarly, it is an equation for $\mathbf{Y_{\rm bn}}$ with respect to
116- $\mathbf{X_{\rm bn}}$. Other symbols in the equation represent constants.
111+ Similarly, it is an equation for $\bm{Y_{\rm bn}}$ with respect to
112+ $\bm{X_{\rm bn}}$. Other symbols in the equation represent constants.
117113
118114As shown in Figure
119- [2](#fig:ch-deploy/conv-bn-fusion){reference-type="ref"
120- reference="fig:ch-deploy/conv-bn-fusion"}, when the output of
115+ :numref:`ch-deploy/conv-bn-fusion`, when the output of
121116Convolution is used as the input of Batchnorm, the formula of Batchnorm
122- is a function for $\mathbf{Y_{\rm bn}}$ with respect to $\mathbf{X_{\rm conv}}$.
123- After substituting $\mathbf{Y_{\rm conv}}$ into $\mathbf{X_{\rm bn}}$ and
117+ is a function for $\bm{Y_{\rm bn}}$ with respect to $\bm{X_{\rm conv}}$.
118+ After substituting $\bm{Y_{\rm conv}}$ into $\bm{X_{\rm bn}}$ and
124119uniting and extracting the constants, we obtain Equation
125- [\[equ:ch-deploy/conv-bn-equation-3\]](#equ:ch-deploy/conv-bn-equation-3){reference-type="ref"
126- reference="equ:ch-deploy/conv-bn-equation-3"}.
120+ :eqref:`ch-deploy/conv-bn-equation-3`.
127121
128- $$\mathbf{Y_{\rm bn}}=\mathbf{A}\cdot\mathbf{X_{\rm conv}}+\mathbf{B}, \text{equ:ch-deploy/conv-bn-equation-3}$$
122+ $$
123+ \bm{Y_{\rm bn}}=\bm{A}\cdot\bm{X_{\rm conv}}+\bm{B}$$
124+ :eqlabel:`ch-deploy/conv-bn-equation-3`
129125
130- Here, $\mathbf{A}$ and $\mathbf{B}$ are two matrices. It can be noticed that
126+ Here, $\bm{A}$ and $\bm{B}$ are two matrices. It can be noticed that
131127Equation
132- [\[equ:ch-deploy/conv-bn-equation-3\]](#equ:ch-deploy/conv-bn-equation-3){reference-type="ref"
133- reference="equ:ch-deploy/conv-bn-equation-3"} is a formula for computing
128+ :eqref:`ch-deploy/conv-bn-equation-3` is a formula for computing
134129Convolution. The preceding example shows that the computation of
135130Convolution and Batchnorm can be fused into an equivalent Convolution
136131operator. Such fusion is referred to as formula fusion.
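The formula fusion above can be checked numerically. The sketch below is a toy stand-in, not the book's code: a pointwise convolution is written as the affine map `W @ x + B`, and all names (`W`, `B`, `gamma`, `beta`, `mu`, `var`) are illustrative. Folding the Batchnorm constants into the convolution produces the fused form `A @ x + b_fused`.

```python
import numpy as np

# Toy affine stand-in for Convolution: Y_conv = W @ X + B, followed by
# a per-channel Batchnorm with constants gamma, beta, mu, var, eps.
rng = np.random.default_rng(0)
cout, cin = 4, 3
W = rng.standard_normal((cout, cin))
B = rng.standard_normal(cout)
X = rng.standard_normal(cin)

gamma = rng.standard_normal(cout)
beta = rng.standard_normal(cout)
mu = rng.standard_normal(cout)
var = rng.random(cout) + 0.1
eps = 1e-5

# Unfused: run Convolution, then Batchnorm.
y_conv = W @ X + B
y_unfused = gamma * (y_conv - mu) / np.sqrt(var + eps) + beta

# Fused: fold the Batchnorm constants into new weights offline, so a
# single Convolution-shaped operator computes the same result.
s = gamma / np.sqrt(var + eps)   # per-channel scale
A = s[:, None] * W               # fused weight matrix
b_fused = s * (B - mu) + beta    # fused bias

y_fused = A @ X + b_fused
assert np.allclose(y_unfused, y_fused)
```

Because `A` and `b_fused` depend only on constants, they can be computed once during deployment, which is why the fusion removes the Batchnorm operator entirely at inference time.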
@@ -162,13 +157,14 @@ after the fusion --- by 8.5% and 11.7% respectively. Such improvements
162157are achieved without bringing side effects and without requiring
163158additional hardware or operator libraries.
164159
165- ::: {#tab:ch09/ch09-conv-bn-fusion} <br>
166- Fusion        | Sample | Mobilenet-v2 |
167- --------------|--------|--------------|
168- Before fusion | 0.035  | 15.415       |
169- After fusion  | 0.031  | 13.606       |
160+ ::: {#tab:ch09/ch09-conv-bn-fusion}
161+   Fusion            Sample   Mobilenet-v2
162+   ----------------- -------- --------------
163+   Before fusion     0.035    15.415
164+   After fusion      0.031    13.606
170165
171- Convolution + Batchnorm inference performance before and after fusion (unit: ms)
166+ : Convolution + Batchnorm inference performance before and after
167+ fusion (unit: ms)
172168:::
173169
174170## Operator Replacement
@@ -180,20 +176,19 @@ type of operators that have the same computational logic but are more
180176suitable for online deployment. In this way, we can reduce the
181177computation workload and compress the model.
182178
183- ![Replacement of
184- Batchnorm](../img/ch08/ch09-bn-replace.png){#fig:ch-deploy/bn-replace}
179+ ![Replacement of Batchnorm](../img/ch08/ch09-bn-replace.png)
180+ :label:`ch-deploy/bn-replace`
185181
186- Figure [3](#fig:ch-deploy/bn-replace){reference-type="ref"
187- reference="fig:ch-deploy/bn-replace"} depicts the replacement of
182+ Figure :numref:`ch-deploy/bn-replace` depicts the replacement of
188183Batchnorm with Scale, which is used as an example to describe the
189184principle of operator replacement. After decomposing Equation
190- [\[equ:ch-deploy/bn-equation\]](#equ:ch-deploy/bn-equation){reference-type="ref"
191- reference="equ:ch-deploy/bn-equation"} (the Batchnorm formula) and
185+ :eqref:`ch-deploy/bn-equation` (the Batchnorm formula) and
192186folding the constants, Batchnorm is defined as Equation
193- [\[equ:ch-deploy/replace-scale\]](#equ:ch-deploy/replace-scale){reference-type="ref"
194- reference="equ:ch-deploy/replace-scale"}
187+ :eqref:`ch-deploy/replace-scale`
195188
196- $$\mathbf{Y_{\rm bn}}=scale\cdot\mathbf{X_{\rm bn}}+offset, \text{equ:ch-deploy/replace-scale}$$
189+ $$
190+ \bm{Y_{\rm bn}}=\mathrm{scale}\cdot\bm{X_{\rm bn}}+\mathrm{offset}$$
191+ :eqlabel:`ch-deploy/replace-scale`
197192
198193where **scale** and **offset** are scalars. This simplified formula can
199194be mapped to a Scale operator.
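A minimal numerical check of this replacement, as a sketch with illustrative names rather than any particular framework's Scale operator: the Batchnorm constants are folded offline into a single `scale` and `offset`.

```python
import numpy as np

# Batchnorm constants (illustrative values).
rng = np.random.default_rng(1)
x = rng.standard_normal(8)
gamma, beta = 1.5, -0.2
mu, var, eps = 0.3, 0.8, 1e-5

# Fold the constants into the two scalars of a Scale operator.
scale = gamma / np.sqrt(var + eps)
offset = beta - scale * mu

# Batchnorm and the simplified Scale form agree.
y_bn = gamma * (x - mu) / np.sqrt(var + eps) + beta
y_scale = scale * x + offset
assert np.allclose(y_bn, y_scale)
```

At inference time the division and square root disappear; only one multiply and one add per element remain.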
@@ -218,13 +213,12 @@ Common methods of operator reordering include moving cropping operators
218213(e.g., Slice, StrideSlice, and Crop) forward, and reordering Reshape,
219214Transpose, and BinaryOp.
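Moving a cropping operator forward is legal when the operators it passes are elementwise, since a pointwise operator commutes with slicing. A toy numpy sketch (ReLU stands in for an arbitrary pointwise operator; the shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal((1, 3, 8, 8))   # NCHW feature map

def relu(t):
    # Pointwise operator: acts on each element independently.
    return np.maximum(t, 0.0)

def crop(t):
    # Cut a 4x4 window out of the spatial dimensions.
    return t[:, :, 2:6, 2:6]

before = crop(relu(x))   # original order: operator first, then Crop
after = relu(crop(x))    # reordered: Crop moved forward
assert np.allclose(before, after)
```

In the reordered form ReLU touches a 4x4 window instead of the full 8x8 map, which is exactly the reduction in downstream computation that the reordering exploits.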
220215
221- ![Reordering of
222- Crop](../img/ch08/ch09-crop-reorder.png){#fig:ch-deploy/crop-reorder}
216+ ![Reordering of Crop](../img/ch08/ch09-crop-reorder.png)
217+ :label:`ch-deploy/crop-reorder`
223218
224219Crop is used to cut a part out of the input feature map as the output.
225220After Crop is executed, the size of the feature map is reduced. As shown
226- in Figure [4](#fig:ch-deploy/crop-reorder){reference-type="ref"
227- reference="fig:ch-deploy/crop-reorder"}, moving Crop forward to cut the
221+ in Figure :numref:`ch-deploy/crop-reorder`, moving Crop forward to cut the
228222feature map before other operators reduces the computation workload of
229223subsequent operators, thereby improving the inference performance in the
230224deployment phase. Such improvement is related to the operator