# Guide to Using Fluid Inference

- Python Inference API
- Building the Fluid Inference Library
- Linking the Fluid Inference Library
- C++ Inference API
- Inference Examples
- Inference Computation Optimization
- Memory Usage Optimization

## Python Inference API **[under improvement]**
- [Saving an inference model](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L295)

  ```python
  def save_inference_model(dirname,
                           feeded_var_names,
                           target_vars,
                           executor,
                           main_program=None,
                           model_filename=None,
                           params_filename=None):
  ```
  The inference model and its parameters are saved under the `dirname` directory:
  - the serialized model
    - if `model_filename` is `None`, it is saved to `dirname/__model__`
    - if `model_filename` is not `None`, it is saved to `dirname/model_filename`
  - the parameters
    - if `params_filename` is `None`, each parameter is saved to a separate file named after its parameter variable
    - if `params_filename` is not `None`, all parameters are saved to `dirname/params_filename`

- Two storage formats (a saving sketch covering both follows below)
  - Parameters saved to separate files
    - e.g., set `model_filename` to `None` and `params_filename` to `None`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    __model__ batch_norm_1.w_0 batch_norm_1.w_2 conv2d_2.w_0 conv2d_3.w_0 fc_1.w_0 batch_norm_1.b_0 batch_norm_1.w_1 conv2d_2.b_0 conv2d_3.b_0 fc_1.b_0
    ```
  - Parameters saved to a single file
    - e.g., set `model_filename` to `None` and `params_filename` to `__params__`

    ```bash
    $ cd recognize_digits_conv.inference.model
    $ ls
    __model__ __params__
    ```
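- Saving sketch. A minimal, hedged example of producing both formats with `save_inference_model`; the tiny network, the executor setup, and the directory name are assumptions made here so the sketch is self-contained, not code from this guide.

  ```python
  import paddle.fluid as fluid

  # A trivial stand-in network, assumed only to make the sketch runnable.
  image = fluid.layers.data(name="image", shape=[1, 28, 28], dtype="float32")
  predict = fluid.layers.fc(input=image, size=10, act="softmax")

  exe = fluid.Executor(fluid.CPUPlace())
  exe.run(fluid.default_startup_program())  # initialize the parameters

  # Format 1: both filenames left as None
  # -> dirname/__model__ plus one file per parameter.
  fluid.io.save_inference_model("recognize_digits_conv.inference.model",
                                feeded_var_names=["image"],
                                target_vars=[predict],
                                executor=exe)

  # Format 2: params_filename="__params__"
  # -> dirname/__model__ plus a single dirname/__params__ file.
  fluid.io.save_inference_model("recognize_digits_conv.inference.model",
                                feeded_var_names=["image"],
                                target_vars=[predict],
                                executor=exe,
                                params_filename="__params__")
  ```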
- [Loading an inference model](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/io.py#L380)
  ```python
  def load_inference_model(dirname,
                           executor,
                           model_filename=None,
                           params_filename=None):
    ...
    return [program, feed_target_names, fetch_targets]
  ```
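- Loading and running sketch. A hedged end-to-end example of loading the model saved above and running one forward pass; the directory name and input shape are assumptions carried over from the saving sketch.

  ```python
  import numpy as np
  import paddle.fluid as fluid

  exe = fluid.Executor(fluid.CPUPlace())

  # Returns the deserialized program, the names of its feed targets, and
  # the variables to fetch, as described above.
  [inference_program, feed_target_names, fetch_targets] = (
      fluid.io.load_inference_model("recognize_digits_conv.inference.model", exe))

  # Feed one random image and fetch the prediction.
  img = np.random.rand(1, 1, 28, 28).astype("float32")
  results = exe.run(inference_program,
                    feed={feed_target_names[0]: img},
                    fetch_list=fetch_targets)
  print(results[0])  # numpy array holding the softmax output
  ```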


## Building the Fluid Inference Library

- **No extra CMake options are required**
- 1. Configure the CMake command. For more configuration options, see [Build PaddlePaddle from source](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/build_from_source_cn.html)
  ```bash
  $ git clone https://github.com/PaddlePaddle/Paddle.git
  $ cd Paddle
  $ mkdir build
  $ cd build
  $ cmake -DCMAKE_INSTALL_PREFIX=your/path/to/paddle_inference_lib \
          -DCMAKE_BUILD_TYPE=Release \
          -DWITH_PYTHON=ON \
          -DWITH_MKL=OFF \
          -DWITH_GPU=OFF \
          ..
  ```

- 2. Build PaddlePaddle
  ```bash
  $ make
  ```

- 3. Deploy. Run the following command to install the PaddlePaddle Fluid Inference library into the `your/path/to/paddle_inference_lib` directory.
  ```bash
  $ make inference_lib_dist
  ```

- Directory layout

  ```bash
  $ cd your/path/to/paddle_inference_lib
  $ tree
  .
  |-- paddle
  |   `-- fluid
  |       |-- framework
  |       |-- inference
  |       |   |-- io.h
  |       |   `-- libpaddle_fluid.so
  |       |-- memory
  |       |-- platform
  |       `-- string
  |-- third_party
  |   |-- eigen3
  |   `-- install
  |       |-- gflags
  |       |-- glog
  |       `-- protobuf
  `-- ...
  ```

  In what follows, assume `PADDLE_ROOT=your/path/to/paddle_inference_lib`.


## Linking the Fluid Inference Library
- [Example project](https://github.com/luotao1/fluid_inference_example.git)

  - GCC configuration
    ```bash
    $ g++ -o a.out -std=c++11 main.cc \
          -I${PADDLE_ROOT}/ \
          -I${PADDLE_ROOT}/third_party/install/gflags/include \
          -I${PADDLE_ROOT}/third_party/install/glog/include \
          -I${PADDLE_ROOT}/third_party/install/protobuf/include \
          -I${PADDLE_ROOT}/third_party/eigen3 \
          -L${PADDLE_ROOT}/paddle/fluid/inference -lpaddle_fluid \
          -lrt -ldl -lpthread
    ```

  - CMake configuration
    ```cmake
    include_directories(${PADDLE_ROOT}/)
    include_directories(${PADDLE_ROOT}/third_party/install/gflags/include)
    include_directories(${PADDLE_ROOT}/third_party/install/glog/include)
    include_directories(${PADDLE_ROOT}/third_party/install/protobuf/include)
    include_directories(${PADDLE_ROOT}/third_party/eigen3)
    target_link_libraries(${TARGET_NAME}
                          ${PADDLE_ROOT}/paddle/fluid/inference/libpaddle_fluid.so
                          -lrt -ldl -lpthread)
    ```

  - Set the environment variable:
    `export LD_LIBRARY_PATH=${PADDLE_ROOT}/paddle/fluid/inference:$LD_LIBRARY_PATH`


## C++ Inference API

- [Inference workflow](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_helper.h#L91)

  - 1. Initialize the device
    ```cpp
    #include "paddle/fluid/framework/init.h"
    paddle::framework::InitDevices(false);
    ```

  - 2. Define the place, executor, and scope
    ```cpp
    auto place = paddle::platform::CPUPlace();
    auto executor = paddle::framework::Executor(place);
    auto* scope = new paddle::framework::Scope();
    ```

  - 3. Load the model
    ```cpp
    #include "paddle/fluid/inference/io.h"
    auto inference_program = paddle::inference::Load(executor, *scope, dirname);
    // or
    auto inference_program = paddle::inference::Load(executor,
                                                     *scope,
                                                     dirname + "/" + model_filename,
                                                     dirname + "/" + params_filename);
    ```

  - 4. Get the `feed_target_names` and `fetch_target_names`
    ```cpp
    const std::vector<std::string>& feed_target_names = inference_program->GetFeedTargetNames();
    const std::vector<std::string>& fetch_target_names = inference_program->GetFetchTargetNames();
    ```

  - 5. Prepare the `feed` data
    ```cpp
    #include "paddle/fluid/framework/lod_tensor.h"
    std::vector<paddle::framework::LoDTensor*> cpu_feeds;
    ...
    std::map<std::string, const paddle::framework::LoDTensor*> feed_targets;
    for (size_t i = 0; i < feed_target_names.size(); ++i) {
      // Please make sure that cpu_feeds[i] is right for feed_target_names[i]
      feed_targets[feed_target_names[i]] = cpu_feeds[i];
    }
    ```

  - 6. Define `Tensor`s to `fetch` the results
    ```cpp
    std::vector<paddle::framework::LoDTensor*> cpu_fetchs;
    std::map<std::string, paddle::framework::LoDTensor*> fetch_targets;
    for (size_t i = 0; i < fetch_target_names.size(); ++i) {
      fetch_targets[fetch_target_names[i]] = cpu_fetchs[i];
    }
    ```

  - 7. Run the `inference_program`
    ```cpp
    executor.Run(*inference_program, scope, feed_targets, fetch_targets);
    ```

  - 8. Use the `fetch`ed data
    ```cpp
    for (size_t i = 0; i < cpu_fetchs.size(); ++i) {
      std::cout << "lod_i: " << cpu_fetchs[i]->lod();
      std::cout << "dims_i: " << cpu_fetchs[i]->dims();
      std::cout << "result:";
      float* output_ptr = cpu_fetchs[i]->data<float>();
      for (int j = 0; j < cpu_fetchs[i]->numel(); ++j) {
        std::cout << " " << output_ptr[j];
      }
      std::cout << std::endl;
    }
    ```
  For different input data, steps 4-8 can be executed repeatedly.

  - 9. Release the memory
    ```cpp
    delete scope;
    ```


- Interface description

  ```cpp
  void Run(const ProgramDesc& program, Scope* scope,
           std::map<std::string, const LoDTensor*>& feed_targets,
           std::map<std::string, LoDTensor*>& fetch_targets,
           bool create_vars = true,
           const std::string& feed_holder_name = "feed",
           const std::string& fetch_holder_name = "fetch");
  ```
  - A `program` saved with the Python API `save_inference_model` contains `feed_op`s and `fetch_op`s; the `feed_targets` and `fetch_targets` supplied by the user must be consistent with the `feed_op`s and `fetch_op`s in the `inference_program`.
  - The `feed_holder_name` and `fetch_holder_name` supplied by the user must likewise be consistent with the `feed_op`s and `fetch_op`s in the `inference_program`; the `SetFeedHolderName` and `SetFetchHolderName` interfaces can be used to reset them on the `inference_program`.
  - By default, each call to `executor.Run` creates a local `Scope`, and every `Variable` except those whose `persistable` attribute is `True` is created and destroyed inside this local `Scope`, to minimize the memory footprint when idle.
  - The `Variable`s whose `persistable` attribute is `True` are (see the sketch after this list):
    - the operators' parameters, such as `w` and `b`
    - the input variables of the `feed_op`s
    - the output variables of the `fetch_op`s
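  - Persistable-variable sketch. A hedged Python illustration (not from the original guide) of listing these variables in a loaded inference program; the model directory is the assumption carried over from the Python API section.

    ```python
    import paddle.fluid as fluid

    exe = fluid.Executor(fluid.CPUPlace())
    [inference_program, feed_target_names, fetch_targets] = (
        fluid.io.load_inference_model("recognize_digits_conv.inference.model", exe))

    # The persistable variables are the operator parameters (w, b, ...)
    # plus the feed_op inputs and fetch_op outputs.
    for var in inference_program.list_vars():
        if var.persistable:
            print(var.name)
    ```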


- **Do not create and destroy the variables on every run
  [PR](https://github.com/PaddlePaddle/Paddle/pull/9301)**
  - Run the `inference_program`
    ```cpp
    // Call once
    executor.CreateVariables(*inference_program, scope, 0);
    // Call as many times as you like
    executor.Run(
        *inference_program, scope, feed_targets, fetch_targets, false);
    ```
  - **Pros**
    - Saves the time spent repeatedly creating and destroying variables (roughly 1% ~ 12% of the total time of each `Run`)
    - The computation results of all operators are still available after execution finishes
  - **Cons**
    - Occupies a large amount of memory even when idle
    - Within the same `Scope`, identical variable names share the same memory, which can easily cause unexpected errors


- **Do not create the ops on every run [PR](https://github.com/PaddlePaddle/Paddle/pull/9630)**
  - Run the `inference_program`
    ```cpp
    // Call once
    auto ctx = executor.Prepare(*inference_program, 0);
    // Call as many times as you like if you have no need to change the inference_program
    executor.RunPreparedContext(ctx.get(), scope, feed_targets, fetch_targets);
    ```
  - **Pros**
    - Saves the time spent repeatedly creating and destroying ops
  - **Cons**
    - Whenever the `inference_program` is modified, `ctx` has to be recreated


- **[Sharing parameters across multiple threads](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/test_multi_thread_helper.h)**
  - Main thread
    - 1. Initialize the device
    - 2. Define the `place`, `executor`, and `scope`
    - 3. Load the model to get the `inference_program`
  - Worker threads
    - **Copy the `inference_program` to get a `copy_program`, and modify the `feed_holder_name` and `fetch_holder_name` of the `copy_program`**
      ```cpp
      auto copy_program = std::unique_ptr<paddle::framework::ProgramDesc>(
          new paddle::framework::ProgramDesc(*inference_program));
      std::string feed_holder_name = "feed_" + paddle::string::to_string(thread_id);
      std::string fetch_holder_name = "fetch_" + paddle::string::to_string(thread_id);
      copy_program->SetFeedHolderName(feed_holder_name);
      copy_program->SetFetchHolderName(fetch_holder_name);
      ```
    - 4. Get the `feed_target_names` and `fetch_target_names` of the `copy_program`
    - 5. Prepare the feed data, and define `Tensor`s to fetch the results
    - 6. Run the `copy_program`
      ```cpp
      executor->Run(*copy_program, scope, feed_targets, fetch_targets, true, feed_holder_name, fetch_holder_name);
      ```
    - 7. Use the fetched data
  - Main thread
    - 8. Release the resources


- Basic concepts
  - Data related:
    - [Tensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/tensor.md), an N-dimensional array whose data can be of any type (int, float, double, etc.)
    - [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/lod_tensor.md), a Tensor carrying LoD (Level-of-Detail) information, i.e. sequence information
    - [Scope](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md), keeps track of the Variables
  - Execution related:
    - [Executor](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/executor.md), a stateless executor that only depends on the device
    - Place
      - CPUPlace, the CPU device
      - CUDAPlace, the CUDA GPU device
  - Neural network representation:
    - [Program](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/concepts/program.md)

  For a detailed introduction, see the [**Developer's Guide to Paddle Fluid**](https://github.com/lcy-seso/learning_notes/blob/master/Fluid/developer's_guid_for_Fluid/Developer's_Guide_to_Paddle_Fluid.md)


## Inference Examples

  1. fit a line: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_fit_a_line.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_fit_a_line.cc)
  1. image classification: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_image_classification.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_image_classification.cc)
  1. label semantic roles: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_label_semantic_roles.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_label_semantic_roles.cc)
  1. recognize digits: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recognize_digits.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recognize_digits.cc)
  1. recommender system: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recommender_system.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_recommender_system.cc)
  1. understand sentiment: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_understand_sentiment.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_understand_sentiment.cc)
  1. word2vec: [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_word2vec.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/book/test_inference_word2vec.cc)


## Inference Computation Optimization
- Use the Python inference optimization tool [inference_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/inference_transpiler.py)
  ```python
  class InferenceTranspiler:
      def transpile(self, program, place, scope=None):
          ...
          if scope is None:
              scope = global_scope()
          ...
  ```
  - `InferenceTranspiler` modifies the `program` in place.
  - `InferenceTranspiler` modifies the values of the parameters, so make sure that the parameters of the `program` are in the `scope`.
- Supported optimizations
  - Fusing the computation of the batch_norm op
- [Usage example](https://github.com/Xreki/Xreki.github.io/blob/master/fluid/inference/inference_transpiler.py)
  ```python
  import paddle.fluid as fluid
  # NOTE: Applying the inference transpiler will change the inference_program.
  t = fluid.InferenceTranspiler()
  t.transpile(inference_program, place, inference_scope)
  ```


## Memory Usage Optimization
- Use the Python memory optimization tool [memory_optimization_transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/memory_optimization_transpiler.py)
  ```python
  fluid.memory_optimize(inference_program)
  ```