Commit bcea248

doc/inference api (#11332)
1 parent 4d8e8ee commit bcea248

2 files changed, +60 -2 lines changed
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# Inference High-level APIs
This document describes the high-level inference APIs one can use to easily deploy a Paddle model for an application.

The APIs are described in `paddle_inference_api.h`, just one header file, and two libraries `libpaddle_fluid.so` and `libpaddle_fluid_api.so` are needed.

## PaddleTensor
We provide the `PaddleTensor` data structure to give a general tensor interface.

The definition is

```c++
struct PaddleTensor {
  std::string name;  // variable name.
  std::vector<int> shape;
  PaddleBuf data;  // blob of data.
  PaddleDType dtype;
};
```

The data is stored in a contiguous block of memory described by `PaddleBuf`, and the tensor's data type is specified by `PaddleDType`.
The `name` field is used to specify the name of the input variable,
which is important when there are multiple inputs and one needs to distinguish which variable to set.
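
For illustration, a minimal sketch of filling a `PaddleTensor` for one float input might look like the following. The `PaddleBuf` members (`data`, `length`), the `PaddleDType::FLOAT32` value, and the input name are assumptions made here for the example; check `paddle_inference_api.h` for the authoritative definitions.

```c++
#include <vector>

#include "paddle_inference_api.h"

// Hypothetical helper: wrap a caller-owned float buffer in a PaddleTensor.
// PaddleBuf's `data`/`length` members and PaddleDType::FLOAT32 are assumptions.
PaddleTensor MakeImageTensor(std::vector<float>& input) {
  PaddleTensor tensor;
  tensor.name = "image";                              // must match the model's input variable name.
  tensor.shape = std::vector<int>({1, 3, 224, 224});  // NCHW shape matching input.size().
  tensor.data.data = input.data();                    // assumed raw-pointer member; no copy is made.
  tensor.data.length = input.size() * sizeof(float);  // blob size in bytes.
  tensor.dtype = PaddleDType::FLOAT32;
  return tensor;
}
```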

## Engine
The inference APIs have two different underlying implementations; currently there are two valid engines:

- the native engine, which consists of the native operators and framework,
- the Anakin engine, which embeds the Anakin library.

The native engine takes a native Paddle model as input and supports any model trained by Paddle,
but the Anakin engine can only take an Anakin model as input (the user needs to manually transform the format first), and currently not all Paddle models are supported.

```c++
enum class PaddleEngineKind {
  kNative = 0,  // Use the native Fluid facility.
  kAnakin,      // Use Anakin for inference.
};
```

## PaddlePredictor and how to create one
The main interface is `PaddlePredictor`, which provides the following methods:

- `bool Run(const std::vector<PaddleTensor>& inputs, std::vector<PaddleTensor>* output_data)`
  - takes the inputs and fills `output_data` with the inference results
- `Clone`, to clone a predictor from an existing one, with the model parameters shared (see the sketch below).
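
As a sketch, one plausible use of `Clone` is giving each worker thread its own predictor while sharing the loaded model parameters. This assumes `Clone` returns a `std::unique_ptr<PaddlePredictor>` and that the original predictor was created with the factory method described next; the helper below is purely illustrative.

```c++
#include <memory>
#include <vector>

#include "paddle_inference_api.h"

// Hypothetical: create one cloned predictor per worker thread; the clones are
// assumed to share the loaded model parameters, as described above.
std::vector<std::unique_ptr<PaddlePredictor>> MakeWorkerPredictors(
    PaddlePredictor& main_predictor, int num_workers) {
  std::vector<std::unique_ptr<PaddlePredictor>> workers;
  for (int i = 0; i < num_workers; ++i) {
    workers.emplace_back(main_predictor.Clone());  // each clone can Run() independently.
  }
  return workers;
}
```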

There is a factory method to help create a predictor, and the user takes ownership of the returned object.

```c++
template <typename ConfigT, PaddleEngineKind engine = PaddleEngineKind::kNative>
std::unique_ptr<PaddlePredictor> CreatePaddlePredictor(const ConfigT& config);
```

By specifying the engine kind and config, one can get a specific implementation.
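
For example, creating and running a native-engine predictor might look like the sketch below. `NativeConfig` (a config type assumed to derive from `PaddlePredictor::Config`, so it carries `model_dir`), the model path, and the prepared input tensor are placeholders for illustration; see the header and the demos for the real types.

```c++
#include <string>
#include <vector>

#include "paddle_inference_api.h"

// Hypothetical end-to-end use of the native engine.
bool RunOnce(const std::string& model_dir, const PaddleTensor& input) {
  NativeConfig config;
  config.model_dir = model_dir;  // path to a Paddle model directory.

  auto predictor =
      CreatePaddlePredictor<NativeConfig, PaddleEngineKind::kNative>(config);

  std::vector<PaddleTensor> inputs{input};  // e.g. a tensor filled as in the PaddleTensor sketch.
  std::vector<PaddleTensor> outputs;
  if (!predictor->Run(inputs, &outputs)) {
    return false;  // inference failed.
  }
  // On success, `outputs` holds one PaddleTensor per fetch target of the model.
  return true;
}
```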

## Reference

- [paddle_inference_api.h](./paddle_inference_api.h)
- [demos](./demo)

paddle/contrib/inference/paddle_inference_api.h

Lines changed: 1 addition & 2 deletions
@@ -109,8 +109,7 @@ class PaddlePredictor {
 
   // The common configs for all the predictors.
   struct Config {
-    std::string model_dir; // path to the model directory.
-    bool enable_engine{false}; // Enable to execute (part of) the model on
+    std::string model_dir; // path to the model directory.
   };
 };
 