Commit 0262b1b

dzakhar authored and JaccovG committed
[DOCS] Platform specific chapter with the VPX initial template
1 parent 134ca7a commit 0262b1b

5 files changed

+120
-0
lines changed

doc/documents/index.rst

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Welcome to embARC Machine Learning Inference 2.0 Library Documentation!
16 16
mli_kernels/mli_kernels.rst
17 17
data_movement/data_movement.rst
18 18
utility_functions/utility_functions.rst
19+
platform_specific/platform_hint_sum.rst
19 20

20 21
Indices and tables
21 22
==================
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
Platform Specific Details
2+
=========================
3+
4+
An implementation of the MLI Library for a specific platform may place extra restrictions on data
5+
for optimization reasons or due to hardware limitations. This section collects such hints that
6+
need to be taken into account in addition to the generic MLI API.
7+
8+
.. toctree::
9+
:maxdepth: 2
10+
11+
vpx.rst
12+
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
1+
ARC VPX Specific Details
2+
-------------------------
3+
4+
The ARC VPX family of processors combines the ARCv2 baseline ISA with the ARCv2 Vector DSP ISA extension.
5+
The latter is used extensively in the MLI Library implementation for this family of processors,
6+
allowing us to achieve high efficiency.
7+
8+
- :ref:`vpx_mem_alloc`
9+
- :ref:`vpx_mem_allign`
10+
- :ref:`vpx_accum`
11+
- :ref:`vpx_op_limits_shift`
12+
13+
14+
.. _vpx_mem_alloc:
15+
16+
VPX Memory Allocation
17+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18+
19+
Almost all kernel implementations use vector instructions and assume that operands reside
20+
in the vector memory (VCCM). This means that:
21+
22+
- The memory referenced by the data container of every input and output tensor must be allocated
23+
within the VCCM memory region.
24+
25+
- Memory pointed to by the data container of the ``mli_lut`` structure must be allocated within
26+
the VCCM memory region.
27+
28+
- Tensor structures, LUT structures, configuration structures, and memory pointed to
29+
by containers inside the ``el_params`` field of a tensor may be allocated within any memory region.
30+
31+
This applies to:
32+
- All functions from kernels group (see :ref:`mli_kernels`)
33+
- All functions related to conversion group (see :ref:`mli_convert`)
34+
35+
This doesn't apply to:
36+
- All functions from helpers group (see :ref:`mli_helpers`)
37+
- All functions from move group (see :ref:`data_mvmt`)
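
The placement rule above can be sanity-checked with simple address arithmetic. The following is a minimal Python sketch, not part of the MLI API: ``VCCM_BASE``, ``VCCM_SIZE``, and ``in_vccm`` are hypothetical names, and the region bounds are made-up values that would have to come from the actual linker map or hardware configuration.

```python
# Hypothetical VCCM region bounds. Real values depend on the target
# configuration and are NOT taken from the MLI documentation.
VCCM_BASE = 0x2000_0000
VCCM_SIZE = 64 * 1024


def in_vccm(addr, size_bytes):
    """Return True if the buffer [addr, addr + size_bytes) lies entirely in VCCM."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return VCCM_BASE <= addr and addr + size_bytes <= VCCM_BASE + VCCM_SIZE


# A tensor data buffer fully inside the region satisfies the requirement.
print(in_vccm(VCCM_BASE + 0x100, 1024))
# A buffer that starts inside the region but runs past its end does not.
print(in_vccm(VCCM_BASE + VCCM_SIZE - 16, 32))
```

Under these assumptions, the same check would be applied to tensor data buffers and ``mli_lut`` data, but not to the structures themselves, which may live in any memory region.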
38+
39+
.. _vpx_mem_allign:
40+
41+
VPX Memory Alignment
42+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
43+
44+
Addresses of all elements including data, quantization parameters and structure fields
45+
must be aligned on an element boundary. This also applies to data allocated in the
46+
vector memory (VCCM). Addresses of vectors and vector elements must be properly aligned
47+
on a vector-element boundary.
48+
49+
.. important::
50+
51+
There is one type of memory access that has 8-bit alignment: a unit-stride vector load or store
52+
with 8-bit elements (``fx8`` and ``sa8`` data). For the best performance, vector load
53+
and store accesses for such data should use even byte addresses (aligned on a 16-bit boundary).
54+
This can be achieved by using even shapes or memstrides for ``sa8`` and ``fx8`` tensors.
55+
Odd byte addresses are allowed but less efficient.
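
The effect of even memstrides on 8-bit data can be illustrated with a short Python sketch. It assumes one byte per element and a hypothetical base address; ``row_start_addresses`` and ``all_even`` are illustrative helpers, not MLI functions.

```python
def row_start_addresses(base_addr, row_memstride, rows):
    """Byte address of the first element of each row, for 1-byte elements."""
    return [base_addr + r * row_memstride for r in range(rows)]


def all_even(addresses):
    """True if every address is aligned on a 16-bit (even byte) boundary."""
    return all(addr % 2 == 0 for addr in addresses)


BASE = 0x1000  # hypothetical, even base address of an sa8 tensor buffer

# Even memstride: every row starts on an even byte address.
print(all_even(row_start_addresses(BASE, 6, 4)))
# Odd memstride: every other row starts on an odd byte address,
# which is allowed but less efficient.
print(all_even(row_start_addresses(BASE, 5, 4)))
```

With an even base address, an even row memstride keeps every row start even, whereas an odd memstride alternates between even and odd row starts.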
56+
57+
..
58+
59+
60+
61+
.. _vpx_accum:
62+
63+
Accumulator
64+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65+
66+
The accumulator width used in calculations depends on the ``Xvec_guard_bit_option``
67+
HW configuration parameter. See :ref:`quant_accum_infl` section for more info on how
68+
it influences the usage of the library. The following table summarizes the available options and
69+
how many accumulations each allows without overflow.
70+
71+
.. table:: VPX HW Accumulator width
72+
:align: center
73+
74+
+-------------------+---------------+---------------------------+---------------------------+---------------------------+
75+
| **Kernel Type**   |               | **guard bit option = 2**  | **guard bit option = 1**  | **guard bit option = 0**  |
76+
+===================+===============+===========================+===========================+===========================+
77+
| ``sa8``           | Accum width   | 24 (8 guard bits)         | 20 (4 guard bits)         | 16 (0 guard bits)         |
78+
|                   +---------------+---------------------------+---------------------------+---------------------------+
79+
|                   | MACs w/o      |                           |                           |                           |
80+
|                   | overflow      | 256                       | 16                        | 1                         |
81+
|                   | guarantee     |                           |                           |                           |
82+
+-------------------+---------------+---------------------------+---------------------------+---------------------------+
83+
| ``fx16``          | Accum width   | 40 (8 guard bits)         | 36 (4 guard bits)         | 32 (0 guard bits)         |
84+
|                   +---------------+---------------------------+---------------------------+---------------------------+
85+
|                   | MACs w/o      |                           |                           |                           |
86+
|                   | overflow      | 256                       | 16                        | 1                         |
87+
|                   | guarantee     |                           |                           |                           |
88+
+-------------------+---------------+---------------------------+---------------------------+---------------------------+
89+
| ``fx16_fx8_fx8``  | Accum width   | 40 (16 guard bits)        | 36 (12 guard bits)        | 32 (8 guard bits)         |
90+
|                   +---------------+---------------------------+---------------------------+---------------------------+
91+
|                   | MACs w/o      |                           |                           |                           |
92+
|                   | overflow      | 65536                     | 4096                      | 256                       |
93+
|                   | guarantee     |                           |                           |                           |
94+
+-------------------+---------------+---------------------------+---------------------------+---------------------------+
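
The numbers in the table above follow a simple pattern: the guaranteed number of MACs equals two to the power of the guard-bit count. The Python sketch below reproduces the table under the assumption that the guard-bit count is the accumulator width minus the sum of the two operand widths. The operand widths (8x8 for ``sa8``, 16x16 for ``fx16``, 16x8 for ``fx16_fx8_fx8``) are inferred from the guard-bit counts in the table, and the function name is hypothetical.

```python
def macs_without_overflow(accum_width, operand_a_bits, operand_b_bits):
    """Guaranteed overflow-free MAC count for a given accumulator width.

    Assumes guard bits = accumulator width - full product width.
    """
    product_bits = operand_a_bits + operand_b_bits
    guard_bits = accum_width - product_bits
    if guard_bits < 0:
        raise ValueError("accumulator is narrower than a single product")
    return 1 << guard_bits


# (kernel type, operand widths, accumulator widths for guard bit option 2, 1, 0)
configs = [
    ("sa8", (8, 8), (24, 20, 16)),
    ("fx16", (16, 16), (40, 36, 32)),
    ("fx16_fx8_fx8", (16, 8), (40, 36, 32)),
]

for name, (a_bits, b_bits), widths in configs:
    counts = [macs_without_overflow(w, a_bits, b_bits) for w in widths]
    print(name, counts)
```

This is only a way to read the table, not a description of how the library or the hardware derives these limits.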
95+
96+
97+
..
98+
99+
.. _vpx_op_limits_shift:
100+
101+
Operand Limitations and Shifting Ranges
102+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
103+

doc/documents/utility_functions/util_data_conv.rst

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
1+
.. _mli_convert:
2+
1 3
Data Conversion Group
2 4
---------------------
3 5

doc/documents/utility_functions/util_help_func.rst

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
1+
.. _mli_helpers:
2+
1 3
Helper Functions Group
2 4
----------------------
3 5
