Commit b769f1c

Heading hierarchy fixed
1 parent 285266a commit b769f1c

1 file changed: 6 additions, 6 deletions


doc/documents/Examples_Tutorials/Examples_Tutorials.rst

@@ -43,7 +43,7 @@ Manual deployment consists of two main parts:
 Each step of the CIFAR-10 example above is described in a separate section below.
 
 Instrument the Model to Extract Weights and Data
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 After we successfully pass the basic Caffe CIFAR-10 tutorial with minor changes, we obtain the following files for deployment:

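The instrumentation step above amounts to dumping each layer's blobs to disk for later quantization. A minimal sketch, with numpy arrays standing in for the blobs a real Caffe `net.params` accessor would return (layer names hypothetical):

```python
import numpy as np

# Hypothetical stand-ins for blobs taken from a trained Caffe net;
# in practice net.params['conv1'][0].data (weights) and
# net.params['conv1'][1].data (biases) would supply these arrays.
weights = {
    "conv1_w": np.zeros((32, 3, 5, 5), dtype=np.float32),
    "conv1_b": np.zeros((32,), dtype=np.float32),
}

# Dump each blob as a .npy file so the later quantization steps can reload it.
for name, arr in weights.items():
    np.save(f"{name}.npy", arr)
```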
@@ -119,7 +119,7 @@ Here:
 Using the pieces of Python code defined above, you can extract all the required data from the model and adapt it to an embedded MLI-based application.
 
 Collect Data Range Statistic for Each Layer
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The quantization process is not only meant to convert weight data to fixed-point representation, but also to define the ranges of all intermediate data for each layer. For this purpose, run the model on a representative data subset and gather statistics for all intermediate results. It is better to use the whole training subset, or even the entire dataset.

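The statistics pass above can be sketched as a running min/max per layer; here random data stands in for real forward-pass activations, and the layer name is hypothetical:

```python
import numpy as np

def update_range(stats, layer, data):
    """Track the running min/max of a layer's output across batches."""
    lo, hi = stats.get(layer, (np.inf, -np.inf))
    stats[layer] = (min(lo, float(data.min())), max(hi, float(data.max())))

stats = {}
rng = np.random.default_rng(0)
for _ in range(10):
    # Stand-in for net.blobs['conv1'].data after a forward pass
    act = rng.normal(0.0, 2.0, size=(1, 32, 16, 16))
    update_range(stats, "conv1", act)
```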
@@ -172,7 +172,7 @@ A similar range definition is required for model parameters. As weights are fixe
 ..
 
 Define Q Data Format for Weights and Data for Each Layer
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 MLI supports a fixed-point format defined by Q-notation (see section MLI Fixed-Point Data Format). The next step is to find the appropriate Q-format of input, output, and coefficients for each layer to correctly represent float values. This format is fixed at inference time (at least for constant weights). We define the number of integer bits, and the number of fractional bits can be easily derived from it. The following table specifies the derivation of integer bits from the CIFAR-10 model statistics:

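The derivation of a Qm.n split from observed ranges can be sketched with the common convention of covering the maximum absolute value with integer bits and giving the remainder (minus the sign bit) to the fraction; this is an illustration of the idea, not necessarily the exact formula used for the table:

```python
import math

def q_format(max_abs, total_bits=16):
    """Derive a Qm.n split: m integer bits to cover max_abs,
    the remaining bits (minus one sign bit) as fractional bits."""
    int_bits = max(0, math.ceil(math.log2(max_abs)))
    frac_bits = total_bits - 1 - int_bits
    return int_bits, frac_bits
```

For example, a layer whose outputs range up to 5.8 in magnitude gets a Q3.12 format in a 16-bit container.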
@@ -297,7 +297,7 @@ For 8-bit operands, you do not need to perform this adjustment unless your MAC se
 ..
 
 Quantize Weights According to Defined Q-Format
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 After extracting the coefficients into numpy array objects and defining the Qm.n format for the data, define the MLI structures for kernels and export the quantized data.

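The quantization itself is a scale, round, and saturate; a minimal sketch of this step (the example weight values are hypothetical):

```python
import numpy as np

def quantize(w, frac_bits, bits=8):
    """Round float weights to Qm.n integers and saturate to the container range."""
    scaled = np.round(w * (1 << frac_bits))
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return np.clip(scaled, lo, hi).astype(np.int32)

w = np.array([0.5, -0.25, 1.5])   # hypothetical conv weights
q = quantize(w, frac_bits=7)      # Q0.7 in an 8-bit container; 1.5 saturates
```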
@@ -358,7 +358,7 @@ To describe raw data by tensor structures, see this sample code:
 Extract the shape of the data and its rank (number of dimensions) from the numpy object. Set the container parameters, including its type and number of fractional bits, according to the bit depth that you want to use and the integer bits defined earlier. For MAC-based kernels, also allocate the number of fractional bits for the output (`CONV1_OUT_FRAQ_BITS`).
 
 Deploying Operations
-~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^
 
 To define the MLI operations and their parameters for the trained graph, start from the input data as shown in the figure below.

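The shape/rank/format bookkeeping described above can be gathered from the numpy object before emitting the C initializers; a hedged sketch where the field names are illustrative, not the exact MLI tensor structure layout:

```python
import numpy as np

def describe_tensor(name, arr, frac_bits):
    """Collect shape, rank and Q-format info needed for a C tensor
    initializer. Field names are illustrative, not the MLI API."""
    return {
        "name": name,
        "shape": arr.shape,
        "rank": arr.ndim,
        "el_type": f"int{arr.dtype.itemsize * 8}",
        "frac_bits": frac_bits,
    }

# Quantized conv weights from the previous step (hypothetical shape)
desc = describe_tensor("conv1_w", np.zeros((32, 3, 5, 5), dtype=np.int8), 7)
```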
@@ -606,7 +606,7 @@ When data is extracted properly (wrapped into tensors and configuration structures)
 Here, you can see the IR tensors for storing intermediate results (the ir_tensor_X pair), used in a double-buffering style. Each primitive uses only the buffers pointed to by the tensors. Fill in the rest of the tensor fields to provide a valid input to the next primitive. Hence, before use, an output tensor needs to hold only a pointer to its buffer and the buffer capacity, plus the number of fractional bits for MAC-based operations.
 
 Data Allocation
-~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^
 
 To estimate how much memory is required, and to decide where to keep the operands in the address space, consider an EM9D-based target with AGU and XY memory. Keeping operands in different memory banks (DCCM, XCCM, YCCM) significantly increases performance. Ensure that you organize the data flow properly for this to work.

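The double-buffering scheme above implies two IR buffers, each sized to the largest intermediate tensor in the graph; a small sizing sketch with hypothetical layer shapes and a 16-bit element size:

```python
import numpy as np

# Hypothetical intermediate tensor shapes (H, W, C) for the graph
layer_shapes = {"conv1": (32, 32, 16), "pool1": (16, 16, 16), "conv2": (16, 16, 32)}
elem_bytes = 2  # 16-bit fixed-point elements

# Two IR buffers, each able to hold the largest intermediate result
max_elems = max(int(np.prod(s)) for s in layer_shapes.values())
ir_bytes = 2 * max_elems * elem_bytes
```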