Commit 728621e

Merge pull request #10858 from weixing02/inference

Add Inference doc for fluid

2 parents a675289 + 68b2d09
File tree: 3 files changed (+364 −1 lines changed)

doc/fluid/howto/index_cn.rst (2 additions & 1 deletion)

```diff
@@ -3,5 +3,6 @@
 
 ..  toctree::
   :maxdepth: 1
-
+
   optimization/index_cn.rst
+  inference/inference_support_in_fluid.md
```

doc/fluid/howto/index_en.rst (1 addition & 0 deletions)

```diff
@@ -5,3 +5,4 @@ HOW TO
   :maxdepth: 1
 
   optimization/index_en.rst
+  inference/inference_support_in_fluid.md
```
doc/fluid/howto/inference/inference_support_in_fluid.md (new file, 361 additions)
# Fluid Inference User Guide

## Contents:

- Python Inference API
- Building the Fluid Inference Library
- Linking the Fluid Inference Library
- C++ Inference API
- Inference Examples
- Inference Computation Optimization
- Memory Usage Optimization
## Python Inference API **[work in progress]**

- Saving the inference model ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L295)); a usage sketch is given after the storage-format examples below.

  ```python
  def save_inference_model(dirname,
                           feeded_var_names,
                           target_vars,
                           executor,
                           main_program=None,
                           model_filename=None,
                           params_filename=None):
  ```
  The inference model and parameters are saved under the directory `dirname`:

  - The serialized model
    - If `model_filename` is `None`, the model is saved to `dirname/__model__`
    - Otherwise, it is saved to `dirname/model_filename`
  - The parameters
    - If `params_filename` is `None`, the parameters are saved to separate files, each named after its parameter variable
    - Otherwise, all parameters are saved to `dirname/params_filename`

- Two storage formats

  - Parameters saved to separate files
    - For example, set `model_filename` to `None` and `params_filename` to `None`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    $ __model__ batch_norm_1.w_0 batch_norm_1.w_2 conv2d_2.w_0 conv2d_3.w_0 fc_1.w_0 batch_norm_1.b_0 batch_norm_1.w_1 conv2d_2.b_0 conv2d_3.b_0 fc_1.b_0
    ```

  - Parameters saved to a single file
    - For example, set `model_filename` to `None` and `params_filename` to `__params__`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    $ __model__ __params__
    ```
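
As a minimal, hypothetical sketch of how `save_inference_model` is typically called (assuming a toy fully-connected regression network and a CPU executor; the directory name `fit_a_line.inference.model` and the variable names are illustrative):

```python
import paddle.fluid as fluid

# A toy network: one fully-connected layer producing a single output.
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

# ... training would normally happen here ...

# With model_filename=None and params_filename=None, the program is
# serialized to __model__ and each parameter goes to its own file.
fluid.io.save_inference_model(dirname='fit_a_line.inference.model',
                              feeded_var_names=['x'],
                              target_vars=[y_predict],
                              executor=exe)
```
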
- Loading the inference model ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L380))

  ```python
  def load_inference_model(dirname,
                           executor,
                           model_filename=None,
                           params_filename=None):
    ...
    return [program, feed_target_names, fetch_targets]
  ```
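
A minimal usage sketch (assuming the model saved above; the random input is purely illustrative):

```python
import numpy
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)

# Returns the inference program together with the names of the feed
# variables and the fetch targets recorded at save time.
[inference_program, feed_target_names, fetch_targets] = (
    fluid.io.load_inference_model('fit_a_line.inference.model', exe))

# Feed one random sample and fetch the prediction.
tensor_x = numpy.random.rand(1, 13).astype('float32')
results = exe.run(inference_program,
                  feed={feed_target_names[0]: tensor_x},
                  fetch_list=fetch_targets)
print(results[0])
```
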
## Building the Fluid Inference Library

- **No extra CMake options are needed**
  - 1. Configure the CMake command. For more configuration options, see [Build PaddlePaddle from source](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/build_from_source_cn.html)

    ```bash
    $ git clone https://github.com/PaddlePaddle/Paddle.git
    $ cd Paddle
    $ mkdir build
    $ cd build
    $ cmake -DCMAKE_INSTALL_PREFIX=your/path/to/paddle_inference_lib \
            -DCMAKE_BUILD_TYPE=Release \
            -DWITH_PYTHON=ON \
            -DWITH_MKL=OFF \
            -DWITH_GPU=OFF \
            ..
    ```

  - 2. Build PaddlePaddle

    ```bash
    $ make
    ```

  - 3. Deploy. Run the following command to deploy the PaddlePaddle Fluid Inference library to the `your/path/to/paddle_inference_lib` directory.

    ```bash
    $ make inference_lib_dist
    ```

- Directory structure

  ```bash
  $ cd your/path/to/paddle_inference_lib
  $ tree
  .
  |-- paddle
  |   `-- fluid
  |       |-- framework
  |       |-- inference
  |       |   |-- io.h
  |       |   `-- libpaddle_fluid.so
  |       |-- memory
  |       |-- platform
  |       `-- string
  |-- third_party
  |   |-- eigen3
  |   `-- install
  |       |-- gflags
  |       |-- glog
  |       `-- protobuf
  `-- ...
  ```

  In the following, assume `PADDLE_ROOT=your/path/to/paddle_inference_lib`.
## Linking the Fluid Inference Library

- Example project ([link](https://github.com/luotao1/fluid_inference_example.git))

- GCC configuration

  ```bash
  $ g++ -o a.out -std=c++11 main.cc \
        -I${PADDLE_ROOT}/ \
        -I${PADDLE_ROOT}/third_party/install/gflags/include \
        -I${PADDLE_ROOT}/third_party/install/glog/include \
        -I${PADDLE_ROOT}/third_party/install/protobuf/include \
        -I${PADDLE_ROOT}/third_party/eigen3 \
        -L${PADDLE_ROOT}/paddle/fluid/inference -lpaddle_fluid \
        -lrt -ldl -lpthread
  ```

- CMake configuration

  ```cmake
  include_directories(${PADDLE_ROOT}/)
  include_directories(${PADDLE_ROOT}/third_party/install/gflags/include)
  include_directories(${PADDLE_ROOT}/third_party/install/glog/include)
  include_directories(${PADDLE_ROOT}/third_party/install/protobuf/include)
  include_directories(${PADDLE_ROOT}/third_party/eigen3)
  target_link_libraries(${TARGET_NAME}
                        ${PADDLE_ROOT}/paddle/fluid/inference/libpaddle_fluid.so
                        -lrt -ldl -lpthread)
  ```

- Set the environment variable:
  `export LD_LIBRARY_PATH=${PADDLE_ROOT}/paddle/fluid/inference:$LD_LIBRARY_PATH`
## C++ Inference API

- Inference flow ([link](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_helper.h#L91))

  - 1. Initialize the devices

    ```cpp
    #include "paddle/fluid/framework/init.h"
    paddle::framework::InitDevices(false);
    ```

  - 2. Define the place, executor, and scope

    ```cpp
    auto place = paddle::platform::CPUPlace();
    auto executor = paddle::framework::Executor(place);
    auto* scope = new paddle::framework::Scope();
    ```

  - 3. Load the model

    ```cpp
    #include "paddle/fluid/inference/io.h"
    auto inference_program = paddle::inference::Load(executor, *scope, dirname);
    // or
    auto inference_program = paddle::inference::Load(executor,
                                                     *scope,
                                                     dirname + "/" + model_filename,
                                                     dirname + "/" + params_filename);
    ```

  - 4. Get the `feed_target_names` and `fetch_target_names`

    ```cpp
    const std::vector<std::string>& feed_target_names = inference_program->GetFeedTargetNames();
    const std::vector<std::string>& fetch_target_names = inference_program->GetFetchTargetNames();
    ```

  - 5. Prepare the `feed` data

    ```cpp
    #include "paddle/fluid/framework/lod_tensor.h"
    std::vector<paddle::framework::LoDTensor*> cpu_feeds;
    ...
    std::map<std::string, const paddle::framework::LoDTensor*> feed_targets;
    for (size_t i = 0; i < feed_target_names.size(); ++i) {
      // Please make sure that cpu_feeds[i] is right for feed_target_names[i]
      feed_targets[feed_target_names[i]] = cpu_feeds[i];
    }
    ```

  - 6. Define `Tensor`s to `fetch` the results

    ```cpp
    std::vector<paddle::framework::LoDTensor*> cpu_fetchs;
    std::map<std::string, paddle::framework::LoDTensor*> fetch_targets;
    for (size_t i = 0; i < fetch_target_names.size(); ++i) {
      fetch_targets[fetch_target_names[i]] = cpu_fetchs[i];
    }
    ```

  - 7. Run the `inference_program`

    ```cpp
    executor.Run(*inference_program, scope, feed_targets, fetch_targets);
    ```

  - 8. Use the `fetch` data

    ```cpp
    for (size_t i = 0; i < cpu_fetchs.size(); ++i) {
      std::cout << "lod_i: " << cpu_fetchs[i]->lod();
      std::cout << "dims_i: " << cpu_fetchs[i]->dims();
      std::cout << "result:";
      float* output_ptr = cpu_fetchs[i]->data<float>();
      for (int j = 0; j < cpu_fetchs[i]->numel(); ++j) {
        std::cout << " " << output_ptr[j];
      }
      std::cout << std::endl;
    }
    ```

    For different input data, steps 4 - 8 can be executed repeatedly.

  - 9. Release the memory

    ```cpp
    delete scope;
    ```
- Interface description

  ```cpp
  void Run(const ProgramDesc& program, Scope* scope,
           std::map<std::string, const LoDTensor*>& feed_targets,
           std::map<std::string, LoDTensor*>& fetch_targets,
           bool create_vars = true,
           const std::string& feed_holder_name = "feed",
           const std::string& fetch_holder_name = "fetch");
  ```

  - A `program` saved with the Python API `save_inference_model` contains `feed_op`s and `fetch_op`s. The `feed_targets` and `fetch_targets` provided by the user must be consistent with the `feed_op`s and `fetch_op`s in the `inference_program`.
  - The `feed_holder_name` and `fetch_holder_name` provided by the user must also be consistent with the `feed_op`s and `fetch_op`s in the `inference_program`; the `SetFeedHolderName` and `SetFetchHolderName` interfaces can be used to reset them in the `inference_program`.
  - By default, each call to `executor.Run` creates a local `Scope`, and all `Variable`s except those whose `persistable` attribute is set to `True` are created and destroyed in this local `Scope`, minimizing the memory footprint when idle.
  - The `Variable`s whose `persistable` attribute is `True` are:
    - the parameters of Operators, such as `w` and `b`
    - the input variables of `feed_op`
    - the output variables of `fetch_op`
- **Do not create and destroy variables on every run ([PR](https://github.com/PaddlePaddle/Paddle/pull/9301))**
  - Run the `inference_program`:

    ```cpp
    // Call once
    executor.CreateVariables(*inference_program, scope, 0);
    // Call as many times as you like
    executor.Run(
        *inference_program, scope, feed_targets, fetch_targets, false);
    ```

  - **Pros**
    - Saves the time spent repeatedly creating and destroying variables (about 1% ~ 12% of the total time of each `Run`)
    - The computation results of all Operators can be retrieved after execution finishes
  - **Cons**
    - A large amount of memory is occupied even when idle
    - Within the same `Scope`, identical variable names share the same block of memory, which can easily cause unexpected errors
- **Do not create the Ops on every run ([PR](https://github.com/PaddlePaddle/Paddle/pull/9630))**
  - Run the `inference_program`:

    ```cpp
    // Call once
    auto ctx = executor.Prepare(*inference_program, 0);
    // Call as many times as you like if you have no need to change the inference_program
    executor.RunPreparedContext(ctx.get(), scope, feed_targets, fetch_targets);
    ```

  - **Pros**
    - Saves the time spent repeatedly creating and destroying Ops
  - **Cons**
    - Once the `inference_program` is modified, `ctx` needs to be recreated
278+
- **多线程共享Parameters([链接](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_multi_thread_helper.h))**
279+
- 主线程
280+
- 1、 初始化设备
281+
- 2、 定义`place``executor``scope`
282+
- 3、 加载模型,得到`inference_program`
283+
- 从线程
284+
- **复制`inference_program`得到`copy_program`,修改`copy_program``feed_holder_name``fetch_holder_name`**
285+
```cpp
286+
auto copy_program = std::unique_ptr<paddle::framework::ProgramDesc>(
287+
new paddle::framework::ProgramDesc(*inference_program));
288+
std::string feed_holder_name = "feed_" + paddle::string::to_string(thread_id);
289+
std::string fetch_holder_name = "fetch_" + paddle::string::to_string(thread_id);
290+
copy_program->SetFeedHolderName(feed_holder_name);
291+
copy_program->SetFetchHolderName(fetch_holder_name);
292+
```
293+
- 4、 获取`copy_program``feed_target_names``fetch_target_names`
294+
- 5、 准备feed数据,定义Tensor来fetch结果
295+
- 6、 执行`copy_program`
296+
```cpp
297+
executor->Run(*copy_program, scope, feed_targets, fetch_targets, true, feed_holder_name, fetch_holder_name);
298+
```
299+
- 7、 使用fetch数据
300+
- 主线程
301+
- 8、 释放资源
302+
303+
- Basic concepts
  - Data related:
    - [Tensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/tensor.md), an N-dimensional array whose data can be of any type (int, float, double, etc.)
    - [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/lod_tensor.md), a Tensor carrying LoD (Level-of-Detail) information, i.e. sequence information
    - [Scope](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md), which keeps track of the `Variable`s
  - Execution related:
    - [Executor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/executor.md), a stateless executor that only depends on the device
    - Place
      - CPUPlace, the CPU device
      - CUDAPlace, the CUDA GPU device
  - Neural network representation:
    - [Program](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/program.md).

  For a detailed introduction, please refer to the [**Developer's Guide to Paddle Fluid**](https://github.com/lcy-seso/learning_notes/blob/master/Fluid/developer's_guid_for_Fluid/Developer's_Guide_to_Paddle_Fluid.md).
## Inference Examples

1. fit a line: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_fit_a_line.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_fit_a_line.cc)
1. image classification: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_image_classification.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_image_classification.cc)
1. label semantic roles: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_label_semantic_roles.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_label_semantic_roles.cc)
1. recognize digits: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recognize_digits.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recognize_digits.cc)
1. recommender system: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recommender_system.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recommender_system.cc)
1. understand sentiment: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_understand_sentiment.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_understand_sentiment.cc)
1. word2vec: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_word2vec.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_word2vec.cc)

## Inference Computation Optimization

- Use the Python inference optimization tool ([inference_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/inference_transpiler.py))

  ```python
  class InferenceTranspiler:
      def transpile(self, program, place, scope=None):
          ...
          if scope is None:
              scope = global_scope()
          ...
  ```

  - `InferenceTranspiler` modifies the `program` in place.
  - `InferenceTranspiler` modifies the values of the parameters, so please make sure that the parameters of the `program` are in the `scope`.
- Supported optimizations
  - Fusing the computation of batch_norm ops
- Usage example ([link](https://github.com/Xreki/Xreki.github.io/blob/master/fluid/inference/inference_transpiler.py)); a fuller standalone sketch follows after this example.

  ```python
  import paddle.fluid as fluid
  # NOTE: Applying the inference transpiler will change the inference_program.
  t = fluid.InferenceTranspiler()
  t.transpile(inference_program, place, inference_scope)
  ```
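
A fuller, hypothetical sketch of the same flow (assuming a model directory `image_classification.inference.model` saved beforehand; the directory name and scope handling are illustrative):

```python
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()

with fluid.scope_guard(inference_scope):
    # Load the model so that its parameters live in inference_scope.
    [inference_program, feed_target_names, fetch_targets] = (
        fluid.io.load_inference_model('image_classification.inference.model',
                                      exe))

    # Fuse the batch_norm computation into the preceding layers; this
    # rewrites inference_program and the parameter values in place.
    t = fluid.InferenceTranspiler()
    t.transpile(inference_program, place, inference_scope)
```
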
## Memory Usage Optimization

- Use the Python memory optimization tool ([memory_optimization_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/memory_optimization_transpiler.py))

  ```python
  fluid.memory_optimize(inference_program)
  ```
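
A minimal sketch of where this call fits (assuming a previously saved model, as in the earlier examples; illustrative, not the canonical usage):

```python
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)

[inference_program, feed_target_names, fetch_targets] = (
    fluid.io.load_inference_model('fit_a_line.inference.model', exe))

# Rewrite the program so that non-persistable variables reuse memory
# where their lifetimes do not overlap, reducing peak memory usage.
fluid.memory_optimize(inference_program)
```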
