# Inference High-level APIs
This document describes the high-level inference APIs one can use to easily deploy a Paddle model for an application.

The APIs are declared in just one header file, `paddle_inference_api.h`; two libraries, `libpaddle_fluid.so` and `libpaddle_fluid_api.so`, are needed.

## PaddleTensor
We provide the `PaddleTensor` data structure as a general tensor interface.

The definition is

```c++
struct PaddleTensor {
  std::string name;        // variable name.
  std::vector<int> shape;  // tensor shape.
  PaddleBuf data;          // blob of data.
  PaddleDType dtype;       // data type of the elements.
};
```
|
The data is stored in a contiguous memory block held by a `PaddleBuf`, and the tensor's data type is specified by a `PaddleDType`.
The `name` field specifies the name of the input variable,
which is important when there are multiple inputs and the predictor needs to distinguish which variable to set.
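
For illustration, here is a minimal sketch that builds a `PaddleTensor` holding a 1x4 float blob. It assumes `PaddleBuf` is the plain struct with a raw `data` pointer and a byte `length` declared in `paddle_inference_api.h`, and the feed variable name `x` is hypothetical.

```c++
#include <vector>

#include "paddle_inference_api.h"

// Build a 1x4 float32 input tensor; the caller must keep `input` alive,
// since the PaddleBuf only borrows the pointer here.
paddle::PaddleTensor MakeInputTensor(std::vector<float>* input) {
  paddle::PaddleTensor tensor;
  tensor.name = "x";  // hypothetical name of the model's feed variable
  tensor.shape = {1, 4};
  tensor.data.data = input->data();                    // pointer to the blob
  tensor.data.length = input->size() * sizeof(float);  // size in bytes
  tensor.dtype = paddle::PaddleDType::FLOAT32;
  return tensor;
}
```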
|
## Engine
The inference API has two different underlying implementations; currently there are two valid engines:
|
- the native engine, which consists of the native operators and framework,
- the Anakin engine, which embeds the Anakin library.
|
The native engine takes a native Paddle model as input and supports any model trained by Paddle.
The Anakin engine can only take an Anakin model as input (the user needs to manually transform the format first), and currently not all Paddle models are supported.
|
```c++
enum class PaddleEngineKind {
  kNative = 0,  // Use the native Fluid facility.
  kAnakin,      // Use Anakin for inference.
};
```

## PaddlePredictor and how to create one
The main interface is `PaddlePredictor`, which has the following methods:

- `bool Run(const std::vector<PaddleTensor>& inputs, std::vector<PaddleTensor>* output_data)`
  - takes the inputs and writes the inference results to `output_data`
- `Clone` to clone a predictor from an existing one, with the model parameters shared.

There is a factory method to help create a predictor; the user takes ownership of the returned object.

```c++
template <typename ConfigT, PaddleEngineKind engine = PaddleEngineKind::kNative>
std::unique_ptr<PaddlePredictor> CreatePaddlePredictor(const ConfigT& config);
```
|
By specifying the engine kind and the config, one can get a specific implementation.
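
Putting the pieces together, a minimal end-to-end sketch with the native engine might look like the following. The `NativeConfig` fields used here (`model_dir`, `use_gpu`) come from `paddle_inference_api.h`; the model path and the input name `x` are placeholders.

```c++
#include <cassert>
#include <vector>

#include "paddle_inference_api.h"

int main() {
  // Configure the native engine; "./mobilenet" is a placeholder model path.
  paddle::NativeConfig config;
  config.model_dir = "./mobilenet";
  config.use_gpu = false;

  // The factory returns a std::unique_ptr, so the caller owns the predictor.
  auto predictor =
      paddle::CreatePaddlePredictor<paddle::NativeConfig>(config);

  // Build the input as shown in the PaddleTensor section above.
  std::vector<float> input{1.f, 2.f, 3.f, 4.f};
  paddle::PaddleTensor tensor;
  tensor.name = "x";  // must match the model's feed variable
  tensor.shape = {1, 4};
  tensor.data.data = input.data();
  tensor.data.length = input.size() * sizeof(float);
  tensor.dtype = paddle::PaddleDType::FLOAT32;

  // Run inference; the predictor fills `outputs`.
  std::vector<paddle::PaddleTensor> outputs;
  assert(predictor->Run({tensor}, &outputs));

  // Clone shares the model parameters with the original predictor,
  // which is useful for running one predictor per thread.
  auto another = predictor->Clone();
  return 0;
}
```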
|
## Reference
|
- [paddle_inference_api.h](./paddle_inference_api.h)
- [demos](./demo)