Commit b5342ce

- Better explanation of debug mode with table
- Found a way to make an example admonition
- Fx8w16d - update info on used accumulator
1 parent 181835e commit b5342ce

1 file changed

+31
-41
lines changed

doc/documents/MLI_FP_data_format/MLI_FP_data_format.rst

Lines changed: 31 additions & 41 deletions
@@ -41,14 +41,13 @@ complemented integer numbers: 8 bit for ``MLI_EL_FX_8`` (also referred to as ``f
 of fractional bits (see ``fx.frac_bits`` in :ref:`mli_el_prm_u`),
 which corresponds to the second designation above.
 
-.. note::
-   Example:
+.. admonition:: Example
+   :class: "admonition tip"
 
-   Given 0x4000h (16384) value in 16bit container,
-
-   • In Q0.15 (and Q.15) format, this represents 0.5
-
-   • In Q1.14 (and Q.14) format, this represents 1.0
+   Given 0x4000h (16384) value in 16bit container,
+
+   * In Q0.15 (and Q.15) format, this represents 0.5
+   * In Q1.14 (and Q.14) format, this represents 1.0
 ..
 
 For more information on how to get the real value of tensor from fx,
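The Q-notation example in the hunk above can be checked numerically. The following is a minimal sketch, not code from the library: ``fx_to_real`` is a hypothetical helper name, and it assumes the container holds a two's-complement value whose real value is the integer divided by 2 to the power of the number of fractional bits.

```python
def fx_to_real(raw: int, frac_bits: int, container_bits: int = 16) -> float:
    """Interpret a raw container bit pattern as a Q-format number (hypothetical helper)."""
    # Sign-extend the two's-complement bit pattern to a Python int.
    if raw & (1 << (container_bits - 1)):
        raw -= 1 << container_bits
    # Each fractional bit halves the weight of the least significant bit.
    return raw / (1 << frac_bits)


print(fx_to_real(0x4000, 15))  # Q0.15 interpretation: 0.5
print(fx_to_real(0x4000, 14))  # Q1.14 interpretation: 1.0
```

The same bit pattern yields different real values only because the assumed number of fractional bits differs; the container content is unchanged.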
@@ -59,8 +58,8 @@ fractional bits might be larger than total number of containers
 significant (not-sign) bits. In this case all bits not present in the
 container are implied to be equal to the sign bit.
 
-.. note::
-   Examples:
+.. admonition:: Example
+   :class: "admonition tip"
 
    Given 0x0020 (32) in Q.10 format,

@@ -127,9 +126,8 @@ uses rounding provided by ARCv2 DSP hardware (see :ref:`hw_comp_dpd` ). ``dequan
 ``real_val`` in case of immediate forward/backward conversion
 due to rounding operation (see examples 2 and 4 from the following example list).
 
-.. note::
-
-   Examples:
+.. admonition:: Example
+   :class: "admonition tip"
 
    - Given a real value of 0.85; FX format Q.7; rounding mode nearest, the
      FX value is computed as:
@@ -152,9 +150,8 @@ bits requires value shifting: shift left in case of increasing number
 of fractional bits, and shift right with rounding in case of
 decreasing.
 
-.. note::
-
-   Examples:
+.. admonition:: Example
+   :class: "admonition tip"
 
    - Given an FX value 0x24 in Q.8 format (0.140625), the FX value in Q.12
      format is computed as:
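The shift rule described in this hunk can be sketched as follows. ``change_frac_bits`` is a hypothetical name, and the right-shift rounding used here (add half of the discarded range, then shift) is an assumption for illustration; the rounding actually performed by the hardware may differ in tie cases.

```python
def change_frac_bits(value: int, src_frac: int, dst_frac: int) -> int:
    """Re-express an FX value with a different number of fractional bits."""
    shift = dst_frac - src_frac
    if shift >= 0:
        # More fractional bits: shift left, no precision is lost.
        return value << shift
    # Fewer fractional bits: shift right with rounding to nearest
    # (assumed rounding: add half an output LSB before shifting).
    rshift = -shift
    return (value + (1 << (rshift - 1))) >> rshift


q12 = change_frac_bits(0x24, 8, 12)
print(hex(q12), q12 / (1 << 12))  # 0x240 0.140625
```

Converting 0x24 from Q.8 to Q.12 keeps the real value 0.140625, because only the scale of the integer changes.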
@@ -181,8 +178,8 @@ format. The width of the integer part of the result is the sum of
 widths of integer parts of the operands. The width of the fractional
 part of the result is the sum of widths of fractional parts of the operands.
 
-.. note::
-   Example:
+.. admonition:: Example
+   :class: "admonition tip"
 
    Given a number x in Q4.3 format (that is, 4 bits for integer and 3 for
    fractional part) and a number y in Q5.7 format, ``x*y`` is in Q9.10
@@ -214,9 +211,8 @@ For division, input operands also do not have to be of the same
 format. The result has a format containing the difference of bits in
 the formats of input operands.
 
-.. note::
-
-   Example:
+.. admonition:: Example
+   :class: "admonition tip"
 
    - Given a dividend ``x`` in Q16.16 format and a divisor ``y`` in Q7.10 format,
      the format of the result ``x/y`` is Q(16-7).(16-10), or Q9.6 format.
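The format rules for multiplication and division stated in the two hunks above are plain bit-width arithmetic. A small sketch, with ``QFormat``, ``mul_format`` and ``div_format`` as hypothetical names introduced only for illustration:

```python
from typing import NamedTuple


class QFormat(NamedTuple):
    """Qm.n format: m integer bits and n fractional bits (illustrative type)."""
    int_bits: int
    frac_bits: int

    def __str__(self) -> str:
        return f"Q{self.int_bits}.{self.frac_bits}"


def mul_format(x: QFormat, y: QFormat) -> QFormat:
    # Product: widths of integer and fractional parts are summed.
    return QFormat(x.int_bits + y.int_bits, x.frac_bits + y.frac_bits)


def div_format(x: QFormat, y: QFormat) -> QFormat:
    # Quotient: widths of the divisor are subtracted from the dividend.
    return QFormat(x.int_bits - y.int_bits, x.frac_bits - y.frac_bits)


print(mul_format(QFormat(4, 3), QFormat(5, 7)))     # Q9.10
print(div_format(QFormat(16, 16), QFormat(7, 10)))  # Q9.6
```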
@@ -251,9 +247,8 @@ Where Ceil(\ *x*) function rounds up *x* to the smallest integer value
 that is not less than *x*. From a notation point of view, these extra
 bits are added to the integer part.
 
-.. note::
-
-   Example:
+.. admonition:: Example
+   :class: "admonition tip"
 
    For 34 values in Q3.4 format to be accumulated, the number of extra
    bits is computed as: ceil(log\ :sub:`2` 34)= ceil(5.09) = 6
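The extra-bit count in this example can be reproduced with integer arithmetic. This is a sketch under the rule stated above (extra bits equal ceil(log2 of the number of accumulated values) and widen the integer part); ``extra_accum_bits`` is a hypothetical helper name.

```python
import math


def extra_accum_bits(n_values: int) -> int:
    """ceil(log2(n_values)) computed without floating point."""
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    return (n_values - 1).bit_length()


extra = extra_accum_bits(34)
print(extra)  # 6

# The extra bits are added to the integer part: Q3.4 inputs give a Q9.4 sum.
int_bits, frac_bits = 3 + extra, 4
print(f"Q{int_bits}.{frac_bits}")  # Q9.4

# Cross-check against the floating-point formula from the text.
assert extra == math.ceil(math.log2(34))
```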
@@ -285,9 +280,8 @@ be less than or equal to fractional bits for the sum of inputs. This
 condition is checked by primitives in debug mode. For more
 information, see :ref:`err_codes`.
 
-.. note::
-
-   Example:
+.. admonition:: Example
+   :class: "admonition tip"
 
    Given an input tensor of Q.7 format; and weights tensor of Q.3
    format, the number of its fractional bits before shift left operation
@@ -337,7 +331,6 @@ Number of available bits depends on operands types:
   significant bits for output. Thus for MAC-based kernels, 17
   accumulation bits (as 31–(7+7)=17) are available which can be used
   to perform up to 2 :sup:`17` = 131072 operations without overflow.
-
   For simple accumulation, 31 – 7 = 24 bits are available which
   are guaranteed to perform up to 2 :sup:`24` = 16777216 operations without
   overflow.
@@ -348,38 +341,36 @@ Number of available bits depends on operands types:
   significant bits for output. For MAC-based kernels, 39 – (15+15) = 9
   accumulation bits are available, which can be used to perform up to
   2 :sup:`9` = 512 operations without overflow.
-
   For simple accumulation, 39 – 15 = 24 bits are available which
   perform up to 2 :sup:`24` = 16777216 operations without overflow.
 
-- **FX16 x FX8 operands**: 40-bit depth accumulator is used. For
-  MAC-based kernels, 39 – (15 + 7) = 39 - 22 = 17 accumulation bits
-  are available which can be used to perform up to 2 :sup:`17` = 131072 operations
-  without overflow.
+- **FX16 x FX8 operands**: 32-bit depth accumulator is used. For
+  MAC-based kernels, 31 – (15 + 7) = 31 - 22 = 9 accumulation bits
+  are available which can be used to perform up to 2 :sup:`9` = 512
+  operations without overflow.
 
-In general, the number of accumulations required for one output value
+In general, the number of accumulations required for one output value
 calculation can be easily estimated in advance. Using this information
 you can define if the accumulator satisfies requirements or not.
 
 .. note::
    - If the available bits are not enough, ensure that you quantize inputs
      (including weights for both the operands of MAC) while keeping some
      bits unused.
-
+
    - To reduce the influence of quantization on result, ensure that you
      evenly distribute these bits between operands.
 ..
 
-.. note::
-
-   Example:
+.. admonition:: Example
+   :class: "admonition tip"
 
    Given fx16 operands, 2D Convolution layer with 5x5 kernel size on
    input with 64 channels, initial Input tensor format being Q.11,
    initial weights tensor format being Q.15, each output value of
    2D convolution layer requires the following number of accumulations:
 
-   ``kernel_height(5) \* kernel_width(5) \* input_channels(64) +
+   ``kernel_height(5) * kernel_width(5) * input_channels(64) +
    bias_add(1) = 5*5*64+1=1601``
 
    To ensure that the result does not overflow during accumulation, the
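The accumulator headroom figures listed in the hunks above follow one formula: significant accumulator bits minus the significant bits of one product. A minimal sketch, assuming the accumulator widths named in the text (32-bit for FX8 x FX8 and FX16 x FX8, 40-bit for FX16 x FX16); ``mac_headroom_bits`` is a hypothetical name.

```python
def mac_headroom_bits(acc_bits: int, a_bits: int, b_bits: int) -> int:
    """Accumulation bits left after storing one a*b product (sign bits excluded)."""
    return (acc_bits - 1) - ((a_bits - 1) + (b_bits - 1))


# (accumulator width, operand A container width, operand B container width)
cases = {
    "FX8 x FX8": (32, 8, 8),
    "FX16 x FX16": (40, 16, 16),
    "FX16 x FX8": (32, 16, 8),
}

for name, (acc, a, b) in cases.items():
    bits = mac_headroom_bits(acc, a, b)
    # 2**bits MAC operations are guaranteed not to overflow.
    print(f"{name}: {bits} bits, up to {2 ** bits} operations")

# The convolution from the example needs 5*5*64 + 1 accumulations.
n_accum = 5 * 5 * 64 + 1
print(n_accum, n_accum <= 2 ** mac_headroom_bits(40, 16, 16))  # 1601 False
```

With fx16 operands the 512 guaranteed operations are fewer than the 1601 accumulations required, which is why the example goes on to free bits in the operands.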
@@ -392,8 +383,7 @@ you can define if the accumulator satisfies requirements or not.
    and correct number of fractional bits. 2 is an even number and it might
    be distributed equally (-1 fractional bit for each operand).
 
-   The new number of fractional bits in Input tensor: = 11 – 1 = 10
-
-   The new number of fractional bits in Weights tensor: = 15 – 1 = 14
+   - The new number of fractional bits in Input tensor: = 11 – 1 = 10
+   - The new number of fractional bits in Weights tensor: = 15 – 1 = 14
 ..
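The adjustment in this example can be sketched as a small calculation. ``reduce_frac_bits`` is a hypothetical helper; the 9 available accumulation bits for fx16 MAC kernels come from the text, while the handling of an odd deficit (the extra bit is taken from the input) is an assumption made only for this sketch.

```python
def reduce_frac_bits(in_frac: int, w_frac: int, n_accum: int, headroom: int):
    """Drop fractional bits from both operands so n_accum MACs fit the accumulator."""
    needed = (n_accum - 1).bit_length()   # ceil(log2(n_accum))
    deficit = max(0, needed - headroom)   # bits that must stay unused
    # Distribute the deficit evenly; an odd leftover bit goes to the input
    # (assumption, the text only covers the even case).
    w_cut = deficit // 2
    in_cut = deficit - w_cut
    return in_frac - in_cut, w_frac - w_cut


# 1601 accumulations need 11 bits, 9 are available, so 2 bits are freed.
print(reduce_frac_bits(11, 15, 1601, 9))  # (10, 14)
```

The result matches the new formats in the example: Q.10 for the input tensor and Q.14 for the weights tensor.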
