// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief This header file defines the list of public transformations.
 *
 * @file ie_transformations.hpp
 */

#pragma once

#include "cpp/ie_cnn_network.h"
#include "ie_api.h"

namespace InferenceEngine {

/**
 * @deprecated Use InferenceEngine::lowLatency2 instead. This transformation will be removed in 2023.1.
 * @brief The transformation finds all TensorIterator layers in the network, processes all back
 * edges that describe a connection between Result and Parameter of the TensorIterator body,
 * and inserts a ReadValue layer between each such Parameter and the layers that consume it,
 * and an Assign layer after the layers that feed the corresponding Result layer.
 * Supported platforms: CPU, GNA.
 *
 *  The example below describes the changes to the inner part (body, back edges) of the TensorIterator layer.
 *  [] - TensorIterator body
 *  () - new layer
 *
 *  before applying the transformation:
 *  back_edge_1 -> [Parameter -> some layers ... -> Result ] -> back_edge_1
 *
 *  after applying the transformation:
 *  back_edge_1 -> [Parameter -> (ReadValue layer) -> some layers ... -> (Assign layer) ]
 *                                                              \
 *                                                               -> Result ] -> back_edge_1
 *
 *  It is recommended to use this transformation in conjunction with the Reshape feature to set the sequence
 *  dimension to 1 and with the UnrollTensorIterator transformation.
 *  For convenience, the UnrollTensorIterator transformation is already executed unconditionally
 *  when the LowLatency transformation is used with the CPU and GNA plugins, so no action is required here.
 *  After both of these transformations are applied, the resulting network can be inferred step by
 *  step, and the states are stored between inferences.
 *
 *    An illustrative example, not real API:
 *
 *    network->reshape(...) // Set sequence dimension to 1, recalculating shapes. Optional, depends on the network.
 *    LowLatency(network)   // Applying LowLatency and UnrollTensorIterator transformations.
 *    network->infer (...)  // Calculating new values for states.
 *    // All states are stored between inferences via Assign, ReadValue layers.
 *    network->infer (...)  // Using stored states, calculating new values for states.
 *
 * @param network A network to apply the LowLatency transformation to
 */
INFERENCE_ENGINE_DEPRECATED("This transformation will be removed in 2023.1. "
                            "Use InferenceEngine::lowLatency2 instead.")
INFERENCE_ENGINE_API_CPP(void) LowLatency(InferenceEngine::CNNNetwork& network);

/**
 * @brief The transformation finds all TensorIterator/Loop layers in the network,
 * processes all back edges that describe a connection between Result and Parameter
 * of the TensorIterator/Loop bodies, and inserts ReadValue and Assign layers at the
 * input and output corresponding to this back edge.
 * Supported platforms: CPU, GNA.
 *
 * The example below describes the changes made by the transformation:
 *  [] - TensorIterator body
 *  () - new layer
 *  BE - back-edge
 *
 *  before applying the transformation:
 *  -> input1[BE_1 -> Parameter -> Layers ... -> Result  -> BE_1 ]output1->
 *
 *  after applying the transformation:
 *  ->(ReadValue)-> input1[BE_1 ->Parameter->Layers ...->Result->BE_1]output1 ->(Assign)
 *                                                                      \
 *                                                                       ->...
 * After the transformation is applied, the resulting network can be inferred
 * step by step, and the states are stored between inferences.
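 *
 * The step-by-step behaviour described above can be pictured with a minimal, self-contained sketch.
 * It is a toy model only and does not use the Inference Engine API: the names body,
 * run_whole_sequence and StepwiseModel, the arithmetic in the body, and the zero initial state
 * are all assumptions made for illustration.

```cpp
#include <vector>

// Hypothetical toy model, NOT the Inference Engine API.
// The "body" of the loop: new_state = 2 * old_state + input.
static int body(int state, int input) {
    return 2 * state + input;
}

// Before the transformation: the whole sequence is processed in one call,
// and the back edge carries the state from one iteration to the next.
static int run_whole_sequence(const std::vector<int>& sequence) {
    int state = 0;  // initial value fed into the back edge (assumed to be zero here)
    for (int x : sequence) {
        state = body(state, x);  // Result -> back edge -> Parameter
    }
    return state;
}

// After the transformation: one call handles one sequence element.
// The member variable_ plays the role of the state accessed by ReadValue and Assign.
class StepwiseModel {
public:
    int infer(int x) {
        const int state = variable_;        // corresponds to ReadValue
        const int result = body(state, x);  // the unchanged body
        variable_ = result;                 // corresponds to Assign
        return result;
    }

    // Start a new, independent sequence.
    void reset() { variable_ = 0; }

private:
    int variable_ = 0;  // initial state (assumed to be zero in this sketch)
};
```

 * Calling StepwiseModel::infer once per element yields the same final value as a single
 * run_whole_sequence call over the same elements, because the state survives between calls.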
 * @param network A network to apply the LowLatency2 transformation to
 * @param use_const_initializer Changes the type of the initializing subgraph for ReadValue operations.
 *        If "true", the transformation inserts a Constant before each ReadValue operation.
 *        If "false", the transformation leaves the existing initializing subgraph for the ReadValue operation.
 */
INFERENCE_ENGINE_API_CPP(void) lowLatency2(InferenceEngine::CNNNetwork& network, bool use_const_initializer = true);
}  // namespace InferenceEngine