
Commit 6d371e4

init Inference top APIs (#10549)
1 parent 13457ef commit 6d371e4

2 files changed: 96 additions & 0 deletions

contrib/inference/README.md

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Embed Paddle Inference in Your Application

Paddle inference offers APIs in the `C` and `C++` languages.

One can easily deploy a model trained by Paddle by following the steps below:

1. Optimize the native model;
2. Write some code for deployment.

Let's explain the steps in detail.

## Optimize the native Fluid Model

The native model obtained from the training phase needs to be optimized for inference:

- Clean out noise such as cost operators that are not needed for inference;
- Prune unnecessary computation forks that have nothing to do with the output;
- Remove extraneous variables;
- Reuse memory in the native Fluid executor;
- Translate the model storage format to a third-party engine's, so that the inference API can utilize the engine for acceleration.

We have an official tool to do the optimization; call `paddle_inference_optimize --help` for more information.

## Write some code

Read `paddle_inference_api.h` for more information.
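
To make the deployment step concrete, here is a minimal sketch of calling the `Predictor` API declared in `paddle_inference_api.h` (shown below). The model path, the variable names `image` and `prob`, and the shapes are hypothetical placeholders, not part of this commit:

```c++
#include <vector>

#include "paddle_inference_api.h"  // declares paddle::Predictor

int main() {
  paddle::Predictor predictor;

  paddle::Predictor::Attr attr;
  attr.model_dir = "my/model/dir";  // hypothetical path to the optimized model
  // Run everything on the native Fluid facility; no third-party engine.
  attr.engine_kind = paddle::Predictor::Attr::EngineKind::kNone;

  // Build the network before inference.
  if (!predictor.Init(attr)) return 1;

  // One input variable "image" of shape {1, 784}; one output "prob" of
  // shape {1, 10}. Names and shapes are made up for this sketch.
  std::vector<std::vector<float>> input_data{std::vector<float>(784, 0.f)};
  std::vector<std::vector<float>> output_data;
  if (!predictor.Run({"image"}, {"prob"}, {{1, 784}}, {{1, 10}},
                     input_data, &output_data)) {
    return 1;
  }
  // output_data[0] now holds the model's output for the single record.
  return 0;
}
```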
contrib/inference/paddle_inference_api.h

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License");
4+
you may not use this file except in compliance with the License.
5+
You may obtain a copy of the License at
6+
7+
http://www.apache.org/licenses/LICENSE-2.0
8+
9+
Unless required by applicable law or agreed to in writing, software
10+
distributed under the License is distributed on an "AS IS" BASIS,
11+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
See the License for the specific language governing permissions and
13+
limitations under the License. */
14+
15+
#pragma once
16+
17+
#include <string>
18+
#include <vector>
19+
20+
namespace paddle {
21+
22+
class Predictor {
23+
public:
24+
struct Attr;
25+
Predictor() = default;
26+
27+
// Build the network before inference.
28+
bool Init(const Attr& attr);
29+
30+
// Predict an record.
31+
// Arguments:
32+
// inputs: the name of the input variables.
33+
// outputs: the name of the output varaibles.
34+
// input_shapes: the shape of the input variables.
35+
// output_shapes: the shape of the output variables.
36+
// input_data: the data of the input variables.
37+
// output_data: the data of the output variables.
38+
bool Run(const std::vector<std::string>& inputs,
39+
const std::vector<std::string>& outputs,
40+
const std::vector<std::vector<int>>& input_shapes,
41+
const std::vector<std::vector<int>>& output_shapes,
42+
const std::vector<std::vector<float>>& input_data,
43+
std::vector<std::vector<float>>* output_data);
44+
45+
// Clone a predictor that share the model weights.
46+
Predictor* Clone();
47+
48+
// Destroy the Predictor.
49+
~Predictor();
50+
51+
struct Attr {
52+
enum class EngineKind;
53+
54+
std::string model_dir; // path to the model directory.
55+
bool enable_engine{false}; // Enable to execute (part of) the model on
56+
// third-party engines.
57+
EngineKind engine_kind{Attr::EngineKind::kNone};
58+
59+
enum class EngineKind {
60+
kNone = -1, // Use the native Fluid facility.
61+
kAnakin, // Use Anakin for inference.
62+
kTensorRT, // Use TensorRT for inference.
63+
kAutoMixedAnakin, // Automatically mix Fluid with Anakin.
64+
kAutoMixedTensorRT, // Automatically mix Fluid with TensorRT.
65+
};
66+
};
67+
};
68+
69+
} // namespace paddle
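
Since `Clone` produces a predictor that shares the model weights, one plausible serving pattern is a clone per worker thread. The sketch below assumes, beyond what the header states, that clones can be created and run concurrently and that the caller owns the returned raw pointer; the variable names and shapes are the same placeholders as in the earlier example:

```c++
#include <memory>
#include <thread>
#include <vector>

#include "paddle_inference_api.h"

// Hypothetical sketch: each worker runs on its own clone so that Run calls
// do not contend on one Predictor, while the model weights stay shared.
void ServeConcurrently(paddle::Predictor* main_predictor, int num_workers) {
  std::vector<std::thread> workers;
  for (int i = 0; i < num_workers; ++i) {
    workers.emplace_back([main_predictor] {
      // Assumed ownership convention: the caller deletes the clone.
      std::unique_ptr<paddle::Predictor> local(main_predictor->Clone());
      std::vector<std::vector<float>> out;
      // Placeholder variable names and shapes, as in the README example.
      local->Run({"image"}, {"prob"}, {{1, 784}}, {{1, 10}},
                 {std::vector<float>(784, 0.f)}, &out);
    });
  }
  for (auto& t : workers) t.join();
}
```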
