
Commit 6cf6e25

Add inference documentation
1 parent 35e5563 commit 6cf6e25

File tree

3 files changed: 362 additions, 1 deletion


doc/fluid/howto/index_cn.rst

Lines changed: 2 additions & 1 deletion
@@ -3,5 +3,6 @@
 .. toctree::
    :maxdepth: 1
-
+
    optimization/index_cn.rst
+   inference/inference_support_in_fluid.md

doc/fluid/howto/index_en.rst

Lines changed: 1 addition & 0 deletions
@@ -5,3 +5,4 @@ HOW TO
    :maxdepth: 1
 
    optimization/index_en.rst
+   inference/inference_support_in_fluid.md
doc/fluid/howto/inference/inference_support_in_fluid.md

Lines changed: 359 additions & 0 deletions

@@ -0,0 +1,359 @@
# Guide to Using Fluid Inference

- Python Inference API
- Build the Fluid Inference Library
- Link Against the Fluid Inference Library
- C++ Inference API
- Inference Examples
- Inference Computation Optimization
- Memory Usage Optimization

## Python Inference API **[work in progress]**

- [Save an inference model](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L295); a usage sketch follows the storage-format examples below.

  ```python
  def save_inference_model(dirname,
                           feeded_var_names,
                           target_vars,
                           executor,
                           main_program=None,
                           model_filename=None,
                           params_filename=None):
  ```

  The inference model and parameters are saved under the directory `dirname`:
  - The serialized model
    - If `model_filename` is `None`, the model is saved to `dirname/__model__`
    - Otherwise, the model is saved to `dirname/model_filename`
  - The parameters
    - If `params_filename` is `None`, each parameter is saved to its own file, named after the parameter variable
    - Otherwise, all parameters are saved to the single file `dirname/params_filename`

- Two storage formats
  - Parameters saved to separate files
    - e.g., set `model_filename` to `None` and `params_filename` to `None`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    $ __model__ batch_norm_1.w_0 batch_norm_1.w_2 conv2d_2.w_0 conv2d_3.w_0 fc_1.w_0 batch_norm_1.b_0 batch_norm_1.w_1 conv2d_2.b_0 conv2d_3.b_0 fc_1.b_0
    ```
  - Parameters saved to a single file
    - e.g., set `model_filename` to `None` and `params_filename` to `__params__`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    $ __model__ __params__
    ```
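A minimal saving sketch for the single-file format above. The network itself is hypothetical: the input variable name `image` and the output variable `prediction` stand in for your own program:

```python
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
# ... build and train the network, obtaining the output variable `prediction` ...
fluid.io.save_inference_model(
    dirname="recognize_digits_conv.inference.model",
    feeded_var_names=["image"],    # names of the variables fed at inference time
    target_vars=[prediction],      # variables to fetch at inference time
    executor=exe,
    model_filename=None,           # serialize the program to `__model__`
    params_filename="__params__")  # save all parameters to a single file
```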
- [Load an inference model](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L380)

  ```python
  def load_inference_model(dirname,
                           executor,
                           model_filename=None,
                           params_filename=None):
    ...
    return [program, feed_target_names, fetch_targets]
  ```
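A minimal loading-and-running sketch, assuming the model saved in the sketch above; the input shape `[1, 1, 28, 28]` is an assumption for a MNIST-style network:

```python
import numpy
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Returns the inference program, the names of the variables to feed,
# and the variables to fetch.
[inference_program, feed_target_names, fetch_targets] = (
    fluid.io.load_inference_model("recognize_digits_conv.inference.model",
                                  exe,
                                  model_filename=None,
                                  params_filename="__params__"))

tensor_img = numpy.random.rand(1, 1, 28, 28).astype("float32")
results = exe.run(inference_program,
                  feed={feed_target_names[0]: tensor_img},
                  fetch_list=fetch_targets)
print(results[0])
```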

## Build the Fluid Inference Library

- **No extra CMake options are required**
- 1. Configure the CMake command; for more configuration options, see [Build PaddlePaddle from source](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/build_from_source_cn.html)

  ```bash
  $ git clone https://github.com/PaddlePaddle/Paddle.git
  $ cd Paddle
  $ mkdir build
  $ cd build
  $ cmake -DCMAKE_INSTALL_PREFIX=your/path/to/paddle_inference_lib \
          -DCMAKE_BUILD_TYPE=Release \
          -DWITH_PYTHON=ON \
          -DWITH_MKL=OFF \
          -DWITH_GPU=OFF \
          ..
  ```

- 2. Build PaddlePaddle

  ```bash
  $ make
  ```

- 3. Deploy. Run the following command to deploy the PaddlePaddle Fluid inference library to the directory `your/path/to/paddle_inference_lib`:

  ```bash
  $ make inference_lib_dist
  ```

- Directory structure

  ```bash
  $ cd your/path/to/paddle_inference_lib
  $ tree
  .
  |-- paddle
  |   `-- fluid
  |       |-- framework
  |       |-- inference
  |       |   |-- io.h
  |       |   `-- libpaddle_fluid.so
  |       |-- memory
  |       |-- platform
  |       `-- string
  |-- third_party
  |   |-- eigen3
  |   `-- install
  |       |-- gflags
  |       |-- glog
  |       `-- protobuf
  `-- ...
  ```

  In the sections below, assume `PADDLE_ROOT=your/path/to/paddle_inference_lib`.
## Link Against the Fluid Inference Library

- [Example project](https://github.com/luotao1/fluid_inference_example.git)

- GCC configuration

  ```bash
  $ g++ -o a.out -std=c++11 main.cc \
        -I${PADDLE_ROOT}/ \
        -I${PADDLE_ROOT}/third_party/install/gflags/include \
        -I${PADDLE_ROOT}/third_party/install/glog/include \
        -I${PADDLE_ROOT}/third_party/install/protobuf/include \
        -I${PADDLE_ROOT}/third_party/eigen3 \
        -L${PADDLE_ROOT}/paddle/fluid/inference -lpaddle_fluid \
        -lrt -ldl -lpthread
  ```

- CMake configuration

  ```cmake
  include_directories(${PADDLE_ROOT}/)
  include_directories(${PADDLE_ROOT}/third_party/install/gflags/include)
  include_directories(${PADDLE_ROOT}/third_party/install/glog/include)
  include_directories(${PADDLE_ROOT}/third_party/install/protobuf/include)
  include_directories(${PADDLE_ROOT}/third_party/eigen3)
  target_link_libraries(${TARGET_NAME}
                        ${PADDLE_ROOT}/paddle/fluid/inference/libpaddle_fluid.so
                        -lrt -ldl -lpthread)
  ```

- Set the environment variable:
  `export LD_LIBRARY_PATH=${PADDLE_ROOT}/paddle/fluid/inference:$LD_LIBRARY_PATH`
## C++ Inference API

- [Inference workflow](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_helper.h#L91)

  - 1. Initialize the devices

    ```cpp
    #include "paddle/fluid/framework/init.h"
    paddle::framework::InitDevices(false);
    ```

  - 2. Define the place, executor, and scope

    ```cpp
    auto place = paddle::platform::CPUPlace();
    auto executor = paddle::framework::Executor(place);
    auto* scope = new paddle::framework::Scope();
    ```

  - 3. Load the model

    ```cpp
    #include "paddle/fluid/inference/io.h"
    auto inference_program = paddle::inference::Load(executor, *scope, dirname);
    // or
    auto inference_program = paddle::inference::Load(executor,
                                                     *scope,
                                                     dirname + "/" + model_filename,
                                                     dirname + "/" + params_filename);
    ```

  - 4. Get the `feed_target_names` and `fetch_target_names`

    ```cpp
    const std::vector<std::string>& feed_target_names = inference_program->GetFeedTargetNames();
    const std::vector<std::string>& fetch_target_names = inference_program->GetFetchTargetNames();
    ```

  - 5. Prepare the `feed` data

    ```cpp
    #include "paddle/fluid/framework/lod_tensor.h"
    std::vector<paddle::framework::LoDTensor*> cpu_feeds;
    ...
    std::map<std::string, const paddle::framework::LoDTensor*> feed_targets;
    for (size_t i = 0; i < feed_target_names.size(); ++i) {
      // Please make sure that cpu_feeds[i] is right for feed_target_names[i]
      feed_targets[feed_target_names[i]] = cpu_feeds[i];
    }
    ```

  - 6. Define `Tensor`s to `fetch` the results

    ```cpp
    std::vector<paddle::framework::LoDTensor*> cpu_fetchs;
    std::map<std::string, paddle::framework::LoDTensor*> fetch_targets;
    for (size_t i = 0; i < fetch_target_names.size(); ++i) {
      fetch_targets[fetch_target_names[i]] = cpu_fetchs[i];
    }
    ```

  - 7. Run the `inference_program`

    ```cpp
    executor.Run(*inference_program, scope, feed_targets, fetch_targets);
    ```

  - 8. Use the `fetch` data

    ```cpp
    for (size_t i = 0; i < cpu_fetchs.size(); ++i) {
      std::cout << "lod_i: " << cpu_fetchs[i]->lod();
      std::cout << "dims_i: " << cpu_fetchs[i]->dims();
      std::cout << "result:";
      float* output_ptr = cpu_fetchs[i]->data<float>();
      for (int j = 0; j < cpu_fetchs[i]->numel(); ++j) {
        std::cout << " " << output_ptr[j];
      }
      std::cout << std::endl;
    }
    ```

    Steps 4 through 8 can be executed multiple times for different input data.

  - 9. Release the memory

    ```cpp
    delete scope;
    ```
- Interface notes

  ```cpp
  void Run(const ProgramDesc& program, Scope* scope,
           std::map<std::string, const LoDTensor*>& feed_targets,
           std::map<std::string, LoDTensor*>& fetch_targets,
           bool create_vars = true,
           const std::string& feed_holder_name = "feed",
           const std::string& fetch_holder_name = "fetch");
  ```

  - A `program` saved with the Python API `save_inference_model` contains `feed_op`s and `fetch_op`s; the user-provided `feed_targets` and `fetch_targets` must be consistent with the `feed_op`s and `fetch_op`s in the `inference_program`.
  - The user-provided `feed_holder_name` and `fetch_holder_name` must also be consistent with the `feed_op`s and `fetch_op`s of the `inference_program`; they can be reset on the `inference_program` via the `SetFeedHolderName` and `SetFetchHolderName` interfaces.
  - By default, except for `Variable`s whose `persistable` attribute is `True`, each call to `executor.Run` creates a local `Scope`, and all other `Variable`s are created and destroyed inside this local `Scope`, minimizing the memory footprint while idle.
  - `Variable`s whose `persistable` attribute is `True` are:
    - Operator parameters such as `w` and `b`
    - The input variables of `feed_op`
    - The output variables of `fetch_op`
- **Do not create and destroy variables on every run [PR](https://github.com/PaddlePaddle/Paddle/pull/9301)**
  - Run the `inference_program`:

    ```cpp
    // Call once
    executor.CreateVariables(*inference_program, scope, 0);
    // Call as many times as you like
    executor.Run(
        *inference_program, scope, feed_targets, fetch_targets, false);
    ```

  - **Pros**
    - Saves the time of repeatedly creating and destroying variables (about 1% ~ 12% of the total time of each `Run`)
    - The computation results of all operators remain accessible after execution finishes
  - **Cons**
    - A large amount of memory stays occupied even while idle
    - Within the same `Scope`, identical variable names share the same memory, which can easily cause unexpected errors
- **Do not create Ops on every run [PR](https://github.com/PaddlePaddle/Paddle/pull/9630)**
  - Run the `inference_program`:

    ```cpp
    // Call once
    auto ctx = executor.Prepare(*inference_program, 0);
    // Call as many times as you like if you have no need to change the inference_program
    executor.RunPreparedContext(ctx.get(), scope, feed_targets, fetch_targets);
    ```

  - **Pros**
    - Saves the time of repeatedly creating and destroying Ops
  - **Cons**
    - Once the `inference_program` is modified, `ctx` must be recreated
- **[Share parameters across multiple threads](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_multi_thread_helper.h)**
  - Main thread
    - 1. Initialize the devices
    - 2. Define the `place`, `executor`, and `scope`
    - 3. Load the model to get the `inference_program`
  - Worker threads
    - **Copy the `inference_program` to get a `copy_program`, and modify the `feed_holder_name` and `fetch_holder_name` of the `copy_program`:**

      ```cpp
      auto copy_program = std::unique_ptr<paddle::framework::ProgramDesc>(
          new paddle::framework::ProgramDesc(*inference_program));
      std::string feed_holder_name = "feed_" + paddle::string::to_string(thread_id);
      std::string fetch_holder_name = "fetch_" + paddle::string::to_string(thread_id);
      copy_program->SetFeedHolderName(feed_holder_name);
      copy_program->SetFetchHolderName(fetch_holder_name);
      ```

    - 4. Get the `feed_target_names` and `fetch_target_names` of the `copy_program`
    - 5. Prepare the feed data and define Tensors to fetch the results
    - 6. Run the `copy_program`:

      ```cpp
      executor->Run(*copy_program, scope, feed_targets, fetch_targets, true, feed_holder_name, fetch_holder_name);
      ```

    - 7. Use the fetched data
  - Main thread
    - 8. Release the resources
- Basic concepts
  - Data-related:
    - [Tensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/tensor.md): an N-dimensional array whose elements can be of any type (int, float, double, etc.)
    - [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/lod_tensor.md): a Tensor that carries LoD (Level-of-Detail) information, i.e., sequence information
    - [Scope](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md): keeps track of `Variable`s
  - Execution-related:
    - [Executor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/executor.md): a stateless executor, related only to the device
    - Place
      - CPUPlace: a CPU device
      - CUDAPlace: a CUDA GPU device
  - Neural network representation:
    - [Program](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/program.md)

  For a detailed introduction, see the [**Developer's Guide to Paddle Fluid**](https://github.com/lcy-seso/learning_notes/blob/master/Fluid/developer's_guid_for_Fluid/Developer's_Guide_to_Paddle_Fluid.md).
## Inference Examples

1. fit a line: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_fit_a_line.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_fit_a_line.cc)
1. image classification: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_image_classification.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_image_classification.cc)
1. label semantic roles: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_label_semantic_roles.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_label_semantic_roles.cc)
1. recognize digits: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recognize_digits.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recognize_digits.cc)
1. recommender system: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recommender_system.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recommender_system.cc)
1. understand sentiment: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_understand_sentiment.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_understand_sentiment.cc)
1. word2vec: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_word2vec.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_word2vec.cc)
## Inference Computation Optimization

- Use the Python inference optimization tool [inference_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/inference_transpiler.py)

  ```python
  class InferenceTranspiler:
      def transpile(self, program, place, scope=None):
          ...
          if scope is None:
              scope = global_scope()
          ...
  ```

- `InferenceTranspiler` modifies the `program` in place.
- `InferenceTranspiler` modifies the values of the parameters, so make sure the parameters of the `program` are in the `scope`.
- Supported optimizations
  - Fusing the computation of the batch_norm op
- [Usage example](https://github.com/Xreki/Xreki.github.io/blob/master/fluid/inference/inference_transpiler.py)

  ```python
  import paddle.fluid as fluid
  # NOTE: Applying the inference transpiler will change the inference_program.
  t = fluid.InferenceTranspiler()
  t.transpile(inference_program, place, inference_scope)
  ```
## Memory Usage Optimization

- Use the Python memory optimization tool [memory_optimization_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/memory_optimization_transpiler.py)

  ```python
  fluid.memory_optimize(inference_program)
  ```
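A hedged sketch of where this call fits, reusing the hypothetical model directory from the Python Inference API section above:

```python
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Load a model saved with save_inference_model (directory name is illustrative).
[inference_program, feed_target_names, fetch_targets] = (
    fluid.io.load_inference_model("recognize_digits_conv.inference.model", exe))

# Rewrite inference_program in place so that variables whose lifetimes do
# not overlap can reuse the same memory.
fluid.memory_optimize(inference_program)

# Then feed and fetch exactly as in the Python Inference API section above.
```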
