Commit f6979d1

PointToSubtensor section is commented as it's a subject for change
1 parent 6fd0648

File tree

9 files changed: +100, -91 lines

doc/documents/mli_api_data/kernel_sp_conf_struct.rst

Lines changed: 3 additions & 2 deletions

@@ -30,8 +30,9 @@ describe fields of existing MLI configuration structures:
 - Table :ref:`t_mli_prelu_cfg_desc`

 - Table :ref:`t_mli_mov_cfg_desc`
-
-- Table :ref:`t_mli_sub_tensor_cfg_desc`
+
+..
+   - Table :ref:`t_mli_sub_tensor_cfg_desc`

doc/documents/mli_kernels/conv_2d.rst

Lines changed: 1 addition & 1 deletion

@@ -25,7 +25,7 @@ convolution parameters (such as padding or stride), inputs and weights shape.
 ..

 Optionally, saturating ReLU activation function can be applied to the result of the
-convolution during the functions execution. For more information on supported ReLU types
+convolution during the function's execution. For more information on supported ReLU types
 and calculations, see :ref:`relu_prot`.

 This is a MAC-based kernel which implies accumulation. See :ref:`quant_accum_infl` for more information on

doc/documents/mli_kernels/conv_depthwise.rst

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ filters for each channel of input. Such functionality refers to group convolutio
 and can be obtained by the corresponding kernel (see :ref:`grp_conv`).

 Optionally, a saturating ReLU activation function can be applied to the result of the
-convolution during the functions execution. For more information on supported ReLU types
+convolution during the function's execution. For more information on supported ReLU types
 and calculations, see :ref:`relu_prot`.

 This is a MAC-based kernel which implies accumulation. See :ref:`quant_accum_infl` for more information

doc/documents/mli_kernels/conv_grp.rst

Lines changed: 1 addition & 1 deletion

@@ -25,7 +25,7 @@ number of filters per each group.
 ..

 Optionally, saturating ReLU activation function can be applied to the result of
-the convolution during the functions execution. For more information on supported ReLU
+the convolution during the function's execution. For more information on supported ReLU
 types and calculations, see :ref:`relu_prot`.

 This is a MAC-based kernel which implies accumulation. See :ref:`quant_accum_infl` for more information on related quantization aspects.

doc/documents/mli_kernels/conv_transp.rst

Lines changed: 1 addition & 1 deletion

@@ -7,7 +7,7 @@ For more details on calculations, see chapter 4 of `A guide to convolution
 arithmetic for deep learning <https://arxiv.org/abs/1603.07285>`_.

 Optionally, a saturating ReLU activation function can be applied to the
-result of the convolution during the functions execution. For more info
+result of the convolution during the function's execution. For more info
 on supported ReLU types and calculations, see :ref:`relu_prot`.

 The ``dilation_height`` and ``dilation_width`` parameter of ``mli_conv2d_cfg``

doc/documents/mli_kernels/introduction.rst

Lines changed: 1 addition & 1 deletion

@@ -49,7 +49,7 @@ The slicing concept is illustrated in Figure :ref:`f_slicing_concept`.
 Slicing Concept
 ..

-If the tensors dont fit into CCM, and there is no data cache, the data move functions can
+If the tensors don't fit into CCM, and there is no data cache, the data move functions can
 be used to copy full tensors or slices of tensors. (see Chapter :ref:`data_mvmt` ). Slicing
 with some kernels requires updating the kernel parameters when passing each slice.

doc/documents/mli_kernels/rec_fully_con.rst

Lines changed: 5 additions & 5 deletions

@@ -21,18 +21,18 @@ Each value of output tensor is calculated according to the following formula:

 Where:

-:math:`x_{j}` ** :math:`j_{\text{th}}` *value in input tensor*
+:math:`x_{j}` *-* :math:`j_{\text{th}}` *value in input tensor*

-:math:`y_{i}` * output of* :math:`i_{\text{th}}` neuron
+:math:`y_{i}` *- output of* :math:`i_{\text{th}}` neuron
 (:math:`i_{\text{th}}` *value in output tensor)*

-:math:`W_{i,j}` * weight of* :math:`j_{\text{th}}\ `\ *input element
+:math:`W_{i,j}` *- weight of* :math:`j_{\text{th}}\ `\ *input element
 for* :math:`i_{\text{th}}` *neuron.*

-:math:`b_{i}` * bias for* :math:`i_{\text{th}}` *neuron*
+:math:`b_{i}` *- bias for* :math:`i_{\text{th}}` *neuron*

 Optionally, a saturating ReLU activation function can be applied to the result of the calculations
-during the functions execution. For more information on supported ReLU types, see :ref:`relu_prot`.
+during the function's execution. For more information on supported ReLU types, see :ref:`relu_prot`.

 This is a MAC-based kernel which implies accumulation. See :ref:`quant_accum_infl` for more information on related quantization aspects.
 The Number of accumulation series is equal to input size.
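The fully connected formula annotated in this hunk, :math:`y_{i} = \text{ReLU}(\sum_{j} W_{i,j} x_{j} + b_{i})`, can be sketched in plain C. This is an illustrative sketch only, not the MLI kernel: the names ``fc_relu`` and ``relu_sat_i16`` are invented for the example, and the int16/int32 widths are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating ReLU on a 32-bit accumulator, clamped to the int16 range. */
static int16_t relu_sat_i16(int32_t acc) {
    if (acc < 0) return 0;                  /* ReLU: negatives become 0 */
    if (acc > INT16_MAX) return INT16_MAX;  /* saturate on overflow     */
    return (int16_t)acc;
}

/* One fully connected layer: out_sz neurons, each accumulating a MAC
 * series over in_sz inputs (the number of accumulation series per
 * neuron equals the input size, as the text above notes). */
static void fc_relu(const int16_t *x, const int16_t *W, const int16_t *b,
                    int16_t *y, int in_sz, int out_sz) {
    for (int i = 0; i < out_sz; ++i) {
        int32_t acc = b[i];                           /* start from bias */
        for (int j = 0; j < in_sz; ++j)
            acc += (int32_t)W[i * in_sz + j] * x[j];  /* MAC series      */
        y[i] = relu_sat_i16(acc);
    }
}
```

Real MLI kernels also fold quantization parameters into the accumulation; this sketch keeps only the dot-product-plus-bias-plus-ReLU skeleton.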

doc/documents/mli_kernels/rec_rnn_dense.rst

Lines changed: 4 additions & 4 deletions

@@ -13,21 +13,21 @@ typically used in the majority of RNN architectures:

 Where:

-:math:`{xa}_{j}`, :math:`{xb}_{j}`, :math:`{xn}_{j}` **
+:math:`{xa}_{j}`, :math:`{xb}_{j}`, :math:`{xn}_{j}` *-*
 :math:`j_{\text{th}}` *value in one of the input tensors. These input
 tensors might be current input, previous output, cell state or any other
 tensor depending on RNN Cell architecture*

-:math:`{Wa}_{i,j}`, :math:`{Wb}_{i,j}`, :math:`{Wc}_{i,j}` * weight
+:math:`{Wa}_{i,j}`, :math:`{Wb}_{i,j}`, :math:`{Wc}_{i,j}` *- weight
 of* :math:`j_{th}\ `\ *input element for*
 :math:`i_{th}` *neuron in one of input weights tensors. These
 weights tensors might be input-to-a-gate weights, output-to-a-gate
 weights or any other tensor depending on RNN Cell architecture*

-:math:`y_{i}` * output of* :math:`i_{th}` neuron
+:math:`y_{i}` *- output of* :math:`i_{th}` neuron
 ( :math:`i_{th}` *value in output tensor).*

-:math:`b_{i}` * bias for* :math:`i_{th}` *neuron*
+:math:`b_{i}` *- bias for* :math:`i_{th}` *neuron*

 This is a MAC-based kernel which implies accumulation. See :ref:`quant_accum_infl` for more information on related quantization aspects.
 The number of accumulation series is equal to total number of values in all inputs.

doc/documents/utility_functions/util_help_func.rst

Lines changed: 83 additions & 75 deletions

@@ -13,9 +13,10 @@ getting information from data structures and performing various operations on th
 - :ref:`get_shift_val`

 - :ref:`get_zero_offset_val`
-
-- :ref:`point_sub_tensor`
-
+
+..
+   - :ref:`point_sub_tensor`
+
 - :ref:`num_of_accu_bits`


@@ -146,8 +147,8 @@ Get Scale Shift Value
 ~~~~~~~~~~~~~~~~~~~~~

 This function returns the shift value from the quantization parameters.
-For data formats that dont have a shift value, the value 0 is returned.
-For tensors with multiple scale values per-axis, the parameter``scale_idx``
+For data formats that don't have a shift value, the value 0 is returned.
+For tensors with multiple scale values per-axis, the parameter ``scale_idx``
 defines the particular scale shift value to be fetched.

 Function prototype
@@ -223,79 +224,86 @@ Conditions:
 - zero_idx must be less or equal to number of zero offset values in the tensor

 .. _point_sub_tensor:

 Point to Sub-Tensor
 ~~~~~~~~~~~~~~~~~~~

+.. warning::
+
+   The interface of this function is subject to change. Avoid using it.
+
+..
+
 This function points to sub tensors in the input tensor. This function can
 be considered as indexing in a multidimensional array without copying, or
 used to create a slice/fragment of the input tensor without copying the data.

 For example, given a HWC tensor, this function could be used to create a HWC
 tensor for the top half of the HW image for all channels.

 The configuration struct is defined as follows and the fields are explained in
 Table :ref:`t_mli_sub_tensor_cfg_desc`.

 .. code:: c

    typedef struct {
       uint32_t offset[MLI_MAX_RANK];
       uint32_t size[MLI_MAX_RANK];
       uint32_t sub_tensor_rank;
    } mli_sub_tensor_cfg;
 ..

 .. _t_mli_sub_tensor_cfg_desc:
 .. table:: mli_sub_tensor_cfg Structure Field Description
    :align: center
    :widths: auto

    +---------------------+----------------+---------------------------------------------------------+
    | **Field Name**      | **Type**       | Description                                             |
    +=====================+================+=========================================================+
    | ``offset``          | ``uint32_t[]`` | Start coordinate in the input tensor. Values must       |
    |                     |                | be smaller than the shape of the input tensor. Size     |
    |                     |                | of the array must be equal to the rank of the input     |
    |                     |                | tensor.                                                 |
    +---------------------+----------------+---------------------------------------------------------+
    | ``size``            | ``uint32_t[]`` | Size of the sub tensor in elements per dimension.       |
    |                     |                | Restriction: size[d] + offset[d] <= input->shape[d]     |
    +---------------------+----------------+---------------------------------------------------------+
    | ``sub_tensor_rank`` | ``uint32_t``   | Rank of the sub tensor that is produced. Must be        |
    |                     |                | smaller than or equal to the rank of the input tensor.  |
    |                     |                | If ``sub_tensor_rank`` is smaller than the input rank,  |
    |                     |                | dimensions with a size of 1 are removed from the        |
    |                     |                | output shape, starting from the first dimension, until  |
    |                     |                | the requested ``sub_tensor_rank`` value is reached.     |
    +---------------------+----------------+---------------------------------------------------------+
 ..

 This function computes the new data pointer based on the offset vector and sets
 the shape of the output tensor according to the size vector. The ``mem_stride`` fields
 are copied from the input to the output, so after this operation, the output tensor might
 not be a contiguous block of data.

 The function also reduces the rank of the output tensor if requested by the
 configuration. Only dimensions with a size of 1 can be removed. Data format and
 quantization parameters are copied from the input to the output tensor.

 The capacity field of the output is the input capacity decremented by the same
 value as that used to increment the data pointer.

 The function prototype:

 .. code:: c

    mli_status mli_hlp_subtensor(
       const mli_tensor *in,
       const mli_subtensor_cfg *cfg,
       mli_tensor *out);
 ..

 Depending on the debug level (see section :ref:`err_codes`), this function performs a parameter
 check and returns the result as an ``mli_status`` code as described in section :ref:`kernl_sp_conf`.


 .. _num_of_accu_bits:
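The sub-tensor mechanics this helper describes (data pointer advanced by the offset vector dotted with the memory strides, ``mem_stride`` copied verbatim, capacity decremented by the same amount the pointer advanced) can be sketched with a minimal stand-in struct. The ``tensor_view`` type and ``subtensor`` function below are hypothetical simplifications for illustration, not the real ``mli_tensor`` or ``mli_hlp_subtensor``; rank reduction is omitted for brevity.

```c
#include <stdint.h>

#define MAX_RANK 4

/* Hypothetical stand-in for a strided tensor descriptor. */
typedef struct {
    const int16_t *data;            /* element pointer                    */
    uint32_t capacity;              /* elements reachable from data       */
    uint32_t shape[MAX_RANK];
    int32_t  mem_stride[MAX_RANK];  /* elements to skip per dimension     */
    uint32_t rank;
} tensor_view;

/* Point `out` at a slice of `in` without copying data: advance the
 * pointer by offset . mem_stride, take the shape from `size`, and copy
 * mem_stride verbatim, so the view may be non-contiguous.  Capacity
 * shrinks by the same element count the pointer advanced. */
static void subtensor(const tensor_view *in, const uint32_t *offset,
                      const uint32_t *size, tensor_view *out) {
    uint32_t skip = 0;
    for (uint32_t d = 0; d < in->rank; ++d)
        skip += offset[d] * (uint32_t)in->mem_stride[d];
    out->data = in->data + skip;
    out->capacity = in->capacity - skip;
    out->rank = in->rank;
    for (uint32_t d = 0; d < in->rank; ++d) {
        out->shape[d] = size[d];
        out->mem_stride[d] = in->mem_stride[d];
    }
}
```

For instance, on a 4x4 row-major image (strides {4, 1}), an offset of {1, 2} with size {2, 2} yields a 2x2 view on the same buffer whose rows are still 4 elements apart.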
