# Use C-API for Model Inference

There are several examples in this directory that show how to use the Paddle C-API for model inference.

## Convert the configuration file to a protobuf binary

First, the user should convert Paddle's model configuration file into a protobuf binary file. In each example directory there is a script named `convert_protobin.sh`, which converts `trainer_config.conf` into `trainer_config.bin`.

The `convert_protobin.sh` script is very simple: it just invokes the `dump_config` Python module to dump the binary file. The command line usage is:

```bash
python -m paddle.utils.dump_config YOUR_CONFIG_FILE 'CONFIG_EXTRA_ARGS' --binary > YOUR_CONFIG_FILE.bin
```

## Initialize Paddle

```c++
char* argv[] = {"--use_gpu=False"};
paddle_init(1, (char**)argv);
```

We must initialize the global context before invoking any other Paddle interfaces. The initialization arguments are the same as the `paddle_trainer` command line arguments; `paddle train --help` shows the full list. The most important one is whether or not to set `use_gpu`.
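
Each C-API call returns a status code, so a full program will normally check it. Below is a minimal sketch, assuming the header is installed as `paddle/capi.h` and that a zero return value means success:

```c
#include <stdio.h>
#include <stdlib.h>
#include <paddle/capi.h>  /* assumed install path of the C-API header */

int main() {
  /* Same flags as the paddle_trainer command line; CPU-only here. */
  char* init_argv[] = {"--use_gpu=False"};
  if (paddle_init(1, (char**)init_argv) != 0) { /* non-zero treated as an error */
    fprintf(stderr, "failed to initialize PaddlePaddle\n");
    exit(1);
  }
  /* ... create a gradient machine and run inference ... */
  return 0;
}
```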

## Load network and parameters

```c
paddle_gradient_machine machine;
paddle_gradient_machine_create_for_inference(&machine, config_file_content, content_size);
paddle_gradient_machine_load_parameter_from_disk(machine, "./some_where_to_params");
```

The gradient machine is a Paddle concept that represents a neural network which can be run forward and backward. We can create a gradient machine for model inference and load its parameter files from disk.
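
In the snippet above, `config_file_content` and `content_size` are the raw bytes and length of the protobuf binary produced by `convert_protobin.sh`. A minimal sketch of reading that file from disk, assuming it is named `trainer_config.bin`:

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole protobuf binary into a malloc'ed buffer.
 * Returns the file size, or -1 on failure. The file name is an
 * assumption; use whatever convert_protobin.sh actually produced. */
long read_config(const char* path, void** out_buf) {
  FILE* fp = fopen(path, "rb");
  if (fp == NULL) return -1;
  fseek(fp, 0, SEEK_END);
  long size = ftell(fp);
  fseek(fp, 0, SEEK_SET);
  void* buf = malloc(size);
  if (buf == NULL || fread(buf, 1, size, fp) != (size_t)size) {
    free(buf);
    fclose(fp);
    return -1;
  }
  fclose(fp);
  *out_buf = buf;
  return size;
}

/* Usage:
 *   void* config_file_content;
 *   long content_size = read_config("./trainer_config.bin", &config_file_content);
 */
```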

Moreover, if we want to run inference in multiple threads, we can create thread-local gradient machines that share the same parameters by using the `paddle_gradient_machine_create_shared_param` API. Please refer to `multi_thread` for a complete example; a rough per-thread sketch follows.
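
The sketch below assumes `machine`, `config_file_content`, and `content_size` from above are visible to the worker, and that the shared-param call takes the original machine, the config buffer and its size, and an output handle; check `gradient_machine.h` in the C-API and the `multi_thread` example for the exact usage:

```c
#include <stddef.h>

/* Worker body: build a lightweight machine that shares parameters with
 * the original `machine`, run inference with it, then destroy it. */
void* worker_main(void* arg) {
  (void)arg;
  paddle_gradient_machine local_machine;
  paddle_gradient_machine_create_shared_param(
      machine, config_file_content, content_size, &local_machine);

  /* ... build input arguments and call paddle_gradient_machine_forward
   *     on local_machine, as in the sections below ... */

  paddle_gradient_machine_destroy(local_machine);
  return NULL;
}
```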

## Create input

The input of a neural network is an `arguments` object. The examples in this directory show how to construct different types of input for prediction; a dense example is sketched below. Please look at `dense`, `sparse_binary`, and `sequence` for details.
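
Roughly, a single dense sample might be constructed as follows. This is a sketch: the input width of 784 is an assumed layer size, and the calls follow the pattern used in the `dense` example:

```c
/* One input slot holding one dense sample of assumed width 784. */
paddle_arguments in_args = paddle_arguments_create_none();
paddle_arguments_resize(in_args, 1); /* this network has one input layer */

paddle_matrix mat = paddle_matrix_create(/* height: samples */ 1,
                                         /* width: layer size */ 784,
                                         /* useGpu */ false);
paddle_real* row;
paddle_matrix_get_row(mat, 0, &row);
for (int i = 0; i < 784; ++i) {
  row[i] = 0.0f; /* fill in real feature values here */
}
paddle_arguments_set_value(in_args, 0, mat);
```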

## Get inference

After invoking `paddle_gradient_machine_forward`, we can get the output of the neural network. The `value` matrix of the output arguments stores the network's output values. If the output layer uses `SoftmaxActivation`, the `value` matrix holds the probability of each category for each input sample: the height of the output matrix is the number of samples and the width is the number of categories.
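
Putting it together, here is a rough sketch of the forward pass and of reading the probabilities back out (the single-output-layer assumption is ours; error checking is omitted for brevity):

```c
/* Run the forward pass; no backward pass is needed for inference. */
paddle_arguments out_args = paddle_arguments_create_none();
paddle_gradient_machine_forward(machine, in_args, out_args,
                                /* isTrain */ false);

/* Fetch the value matrix of the first (here, only) output. */
paddle_matrix prob = paddle_matrix_create_none();
paddle_arguments_get_value(out_args, 0, prob);

uint64_t height, width; /* number of samples x number of categories */
paddle_matrix_get_shape(prob, &height, &width);

paddle_real* values;
paddle_matrix_get_row(prob, 0, &values); /* probabilities of sample 0 */
```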