Commit a7affba

Fix equation ref
1 parent c53b23f commit a7affba

1 file changed: +3 -7 lines changed
chapter_model_deployment/Model_Compression.md

Lines changed: 3 additions & 7 deletions
@@ -319,9 +319,7 @@ network has fewer parameters than the teacher network.
 
 [@Distill] proposed KD, which makes the classification result of the
 student network more closely resembles the ground truth as well as the
-classification result of the teacher network, that is, Equation
-[\[c2Fcn:distill\]](#c2Fcn:distill){reference-type="ref"
-reference="c2Fcn:distill"}.
+classification result of the teacher network, that is, Equation :eqref:`c2Fcn:distill`.
 
 $$\mathcal{L}_{KD}(\theta_S) = \mathcal{H}(o_S,\mathbf{y}) +\lambda\mathcal{H}(\tau(o_S),\tau(o_T)),
 $$
@@ -330,16 +328,14 @@ $$
 where $\mathcal{H}(\cdot,\cdot)$ is the cross-entropy function, $o_S$
 and $o_T$ are outputs of the student network and the teacher network,
 respectively, and $\mathbf{y}$ is the label. The first item in
-Equation [\[c2Fcn:distill\]](#c2Fcn:distill){reference-type="ref"
-reference="c2Fcn:distill"} makes the classification result of the
+Equation :eqref:`c2Fcn:distill` makes the classification result of the
 student network resemble the expected ground truth, and the second item
 aims to extract useful information from the teacher network and transfer
 the information to the student network, $\lambda$ is a weight parameter
 used to balance two objective functions, and $\tau(\cdot)$ is a soften
 function that smooths the network output.
 
-Equation [\[c2Fcn:distill\]](#c2Fcn:distill){reference-type="ref"
-reference="c2Fcn:distill"} only extracts useful information from the
+Equation :eqref:`c2Fcn:distill` only extracts useful information from the
 output of the teacher network classifier --- it does not mine
 information from other intermediate layers of the teacher network.
 Romero et al. [@FitNet] proposed an algorithm for transferring useful
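
Note on the equation referenced in this diff: a minimal Python sketch of the KD loss $\mathcal{L}_{KD}$ may help when reviewing the surrounding text. It is an illustration under stated assumptions, not the chapter's implementation. The text only says that $\tau(\cdot)$ is a softening function; the sketch assumes a temperature-scaled softmax, assumes $o_S$ and $o_T$ are raw logits, and uses hypothetical names (`softmax`, `cross_entropy`, `kd_loss`, `lam`, `temperature`) that do not appear in the source.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax (assumed form of the softening function tau)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(pred, target):
    """H(pred, target) = -sum_i target_i * log(pred_i), with pred as the first argument."""
    eps = 1e-12  # avoids log(0)
    return -sum(t * math.log(p + eps) for p, t in zip(pred, target))


def kd_loss(student_logits, teacher_logits, label_one_hot, lam=0.5, temperature=4.0):
    """L_KD = H(o_S, y) + lam * H(tau(o_S), tau(o_T)).

    Assumption: o_S and o_T are logits, so a plain softmax is applied
    before the first (ground-truth) term.
    """
    # First term: student prediction against the ground-truth label.
    hard = cross_entropy(softmax(student_logits), label_one_hot)
    # Second term: softened student output against softened teacher output.
    soft = cross_entropy(
        softmax(student_logits, temperature),
        softmax(teacher_logits, temperature),
    )
    return hard + lam * soft


# Illustrative call with made-up logits for a 3-class problem.
example_loss = kd_loss([2.0, 0.5, -1.0], [3.0, 1.0, -2.0], [1.0, 0.0, 0.0])
```

Setting `lam=0.0` reduces the loss to the ordinary cross-entropy against the label, which matches the role the text gives $\lambda$ as the weight balancing the two objectives.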
