
Commit f9231b7

add a demo of triton inference serving (#202)
* add a demo of triton inference serving
* fix format and remove some comments
1 parent d29fa74 commit f9231b7

10 files changed: +611 −2 lines changed


.gitignore

Lines changed: 2 additions & 0 deletions
@@ -117,5 +117,7 @@ pretrained/*
 dist_train.sh
 openvino/build/*
 openvino/output*
+*.onnx
+tis/cpp_client/build/*

 tvm/

README.md

Lines changed: 3 additions & 0 deletions
@@ -37,6 +37,9 @@ You can go to [ncnn](./ncnn) for details.
 3. openvino
 You can go to [openvino](./openvino) for details.

+4. tis
+Triton Inference Server (TIS) provides a serving solution for deployment. You can go to [tis](./tis) for details.
+

 ## platform


openvino/README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ My cpu is Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz.
 1.Train the model and export it to onnx
 ```
 $ cd BiSeNet/
-$ python tools/export_onnx.py --aux-mode eval --config configs/bisenetv2_city.py --weight-path /path/to/your/model.pth --outpath ./model_v2.onnx
+$ python tools/export_onnx.py --config configs/bisenetv2_city.py --weight-path /path/to/your/model.pth --outpath ./model_v2.onnx
 ```
 (Optional) 2.Install 'onnx-simplifier' to simplify the generated onnx model:
 ```
tis/README.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@

## A simple demo of using triton-inference-serving

### Platform

* ubuntu 18.04
* cmake-3.22.0
* 8 Tesla T4 GPUs


### Serving Model

#### 1. prepare model repository

We need to export our model to onnx and copy it to the model repository:
```
$ cd BiSeNet
$ python tools/export_onnx.py --config configs/bisenetv1_city.py --weight-path /path/to/your/model.pth --outpath ./model.onnx
$ cp -riv ./model.onnx tis/models/bisenetv1/1

$ python tools/export_onnx.py --config configs/bisenetv2_city.py --weight-path /path/to/your/model.pth --outpath ./model.onnx
$ cp -riv ./model.onnx tis/models/bisenetv2/1
```
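
For orientation, Triton loads every model from its own sub-directory of the repository, with numbered version folders. Assuming the `config.pbtxt` files mentioned below sit next to the version folders, the commands above should leave a layout like this:
```
tis/models/
├── bisenetv1/
│   ├── config.pbtxt
│   └── 1/
│       └── model.onnx
└── bisenetv2/
    ├── config.pbtxt
    └── 1/
        └── model.onnx
```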

#### 2. start service

We start serving with docker:
```
$ docker pull nvcr.io/nvidia/tritonserver:21.10-py3
$ docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /path/to/BiSeNet/tis/models:/models nvcr.io/nvidia/tritonserver:21.10-py3 tritonserver --model-repository=/models
```

The service should now be running. You can check whether it has started with:
```
$ curl -v localhost:8000/v2/health/ready
```
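
You can also check that a particular model has loaded, using the standard v2 endpoint with the `bisenetv2` model name from the repository above:
```
$ curl -v localhost:8000/v2/models/bisenetv2/ready
```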

By default, we use gpu 0 and gpu 1; you can change this configuration in the `config.pbtxt` file.
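
As an illustration only (not the file shipped in this commit), a minimal `config.pbtxt` for the bisenetv2 model, assuming the tensor names, dtypes and shapes that `tis/client.py` below uses, could look roughly like this:
```
# tis/models/bisenetv2/config.pbtxt (hypothetical sketch; the real file may differ)
name: "bisenetv2"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "input_image"
    data_type: TYPE_FP32
    dims: [ 1, 3, 1024, 2048 ]
  }
]
output [
  {
    name: "preds"
    data_type: TYPE_INT64
    dims: [ 1024, 2048 ]
  }
]
instance_group [
  { count: 1, kind: KIND_GPU, gpus: [ 0, 1 ] }
]
```
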
### Client

We can call the model service with either the python client or the c++ client.


#### 1. python method

First, we need to install the dependency package:
```
$ python -m pip install tritonclient[all]==2.15.0
```

Then we can run the script:
```
$ cd BiSeNet/tis
$ python client.py
```

This would generate a result file named `res.png` in the `BiSeNet/tis` directory.


#### 2. c++ method

We need to compile the c++ client library from source:
```
$ apt install rapidjson-dev
$ mkdir -p /data/ && cd /data/
$ git clone https://github.com/triton-inference-server/client.git
$ cd client && git reset --hard da04158bc094925a56b
$ mkdir -p build && cd build
$ cmake -DCMAKE_INSTALL_PREFIX=/opt/triton_client -DTRITON_ENABLE_CC_HTTP=ON -DTRITON_ENABLE_CC_GRPC=ON -DTRITON_ENABLE_PERF_ANALYZER=OFF -DTRITON_ENABLE_PYTHON_HTTP=OFF -DTRITON_ENABLE_PYTHON_GRPC=OFF -DTRITON_ENABLE_JAVA_HTTP=OFF -DTRITON_ENABLE_GPU=ON -DTRITON_ENABLE_EXAMPLES=OFF -DTRITON_ENABLE_TESTS=ON ..
$ make cc-clients
```
The above commands are exactly what I used to compile the library; I learned them from the official documentation.

Also, we need `cmake` version `3.22` installed.

Optionally, I compiled opencv from source and installed it to `/opt/opencv`. You can skip this at first and see whether you run into problems; if you hit opencv problems in the following steps, compile opencv the same way.

After installing the dependencies, we can compile our c++ client:
```
$ cd BiSeNet/tis/cpp_client
$ mkdir -p build && cd build
$ cmake .. && make
```

Finally, we run the client and see a result file named `res.jpg` generated:
```
$ ./client
```
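
If `./client` fails to start because the loader cannot find the Triton client shared libraries under the non-standard prefix `/opt/triton_client`, exporting the library path first may help (an extra step I am assuming here, not one the original demo lists):
```
$ export LD_LIBRARY_PATH=/opt/triton_client/lib:$LD_LIBRARY_PATH
$ ./client
```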

### In the end

This is a simple demo with only basic functions. There are many other useful features, such as shared memory and model pipelines. If you are interested, you can learn more from the official documentation.

tis/client.py

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@

import numpy as np
import cv2

import grpc

from tritonclient.grpc import service_pb2, service_pb2_grpc
import tritonclient.grpc.model_config_pb2 as mc


np.random.seed(123)
palette = np.random.randint(0, 256, (100, 3))


# url = '10.128.61.7:8001'
url = '127.0.0.1:8001'
model_name = 'bisenetv2'
model_version = '1'
inp_name = 'input_image'
outp_name = 'preds'
inp_dtype = 'FP32'
outp_dtype = np.int64
inp_shape = [1, 3, 1024, 2048]
outp_shape = [1024, 2048]
impth = '../example.png'
mean = [0.3257, 0.3690, 0.3223] # city, rgb
std = [0.2112, 0.2148, 0.2115]


# open a grpc channel to the triton server and create the inference stub
option = [
    ('grpc.max_receive_message_length', 1073741824),
    ('grpc.max_send_message_length', 1073741824),
]
channel = grpc.insecure_channel(url, options=option)
grpc_stub = service_pb2_grpc.GRPCInferenceServiceStub(channel)

# query and print model metadata and configuration
metadata_request = service_pb2.ModelMetadataRequest(
    name=model_name, version=model_version)
metadata_response = grpc_stub.ModelMetadata(metadata_request)
print(metadata_response)

config_request = service_pb2.ModelConfigRequest(
    name=model_name,
    version=model_version)
config_response = grpc_stub.ModelConfig(config_request)
print(config_response)

# build the inference request
request = service_pb2.ModelInferRequest()
request.model_name = model_name
request.model_version = model_version

inp = service_pb2.ModelInferRequest().InferInputTensor()
inp.name = inp_name
inp.datatype = inp_dtype
inp.shape.extend(inp_shape)

# preprocess: BGR -> RGB, resize, normalize, NHWC -> NCHW, serialize to bytes
mean = np.array(mean).reshape(1, 1, 3)
std = np.array(std).reshape(1, 1, 3)
im = cv2.imread(impth)[:, :, ::-1]
im = cv2.resize(im, dsize=tuple(inp_shape[-1:-3:-1]))
im = ((im / 255.) - mean) / std
im = im[None, ...].transpose(0, 3, 1, 2)
inp_bytes = im.astype(np.float32).tobytes()

request.ClearField("inputs")
request.ClearField("raw_input_contents")
request.inputs.extend([inp,])
request.raw_input_contents.extend([inp_bytes,])

outp = service_pb2.ModelInferRequest().InferRequestedOutputTensor()
outp.name = outp_name
request.outputs.extend([outp,])

# sync
# resp = grpc_stub.ModelInfer(request).raw_output_contents[0]
# async
resp = grpc_stub.ModelInfer.future(request)
resp = resp.result().raw_output_contents[0]

out = np.frombuffer(resp, dtype=outp_dtype).reshape(*outp_shape)

# colorize the predicted label map and save it
out = palette[out].astype(np.uint8)  # cast to uint8 so cv2.imwrite accepts the array
cv2.imwrite('res.png', out)
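
The script above drives the server through the raw gRPC protobuf stubs. As a sketch only (not part of this commit), the higher-level `tritonclient.http` wrapper provided by the same `tritonclient[all]` install can express an equivalent request more compactly, assuming the same model name, tensor names and preprocessing:
```
import numpy as np
import cv2
import tritonclient.http as httpclient

# preprocess exactly as in client.py: BGR -> RGB, resize, normalize, NHWC -> NCHW
mean = np.array([0.3257, 0.3690, 0.3223]).reshape(1, 1, 3)
std = np.array([0.2112, 0.2148, 0.2115]).reshape(1, 1, 3)
im = cv2.imread('../example.png')[:, :, ::-1]
im = cv2.resize(im, dsize=(2048, 1024))
im = (((im / 255.) - mean) / std)[None].transpose(0, 3, 1, 2).astype(np.float32)

client = httpclient.InferenceServerClient(url='127.0.0.1:8000')  # HTTP port
inp = httpclient.InferInput('input_image', list(im.shape), 'FP32')
inp.set_data_from_numpy(im)
outp = httpclient.InferRequestedOutput('preds')
res = client.infer('bisenetv2', inputs=[inp], outputs=[outp], model_version='1')
preds = res.as_numpy('preds').reshape(1024, 2048)

# colorize with a random palette and save, mirroring client.py
palette = np.random.RandomState(123).randint(0, 256, (100, 3)).astype(np.uint8)
cv2.imwrite('res_http.png', palette[preds])
```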

tis/cpp_client/CMakeLists.txt

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
cmake_minimum_required (VERSION 3.18)

project(Samples)

set(CMAKE_CXX_FLAGS "-std=c++14 -O1")
set(CMAKE_BUILD_TYPE Release)

set(CMAKE_PREFIX_PATH
    /opt/triton_client/
    /opt/opencv/lib/cmake/opencv4)
find_package(OpenCV REQUIRED)

include_directories(
    ${CMAKE_CURRENT_SOURCE_DIR}
    ${CMAKE_CURRENT_BINARY_DIR}
    ${OpenCV_INCLUDE_DIRS}
    /opt/triton_client/include
)
link_directories(
    /opt/triton_client/lib
)


add_executable(client main.cpp)
target_link_libraries(client PRIVATE
    grpcclient
    ${OpenCV_LIBS}
    -lpthread
)
