This repository was archived by the owner on Feb 3, 2025. It is now read-only.

Commit bd74e79

Added dynamic shapes and deployment on triton examples (#293)
1 parent c431aed commit bd74e79

5 files changed, +915 -0 lines changed

Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
# TensorFlow with TensorRT (TF-TRT) to Triton

This README showcases how to deploy a simple ResNet model, accelerated with TensorFlow-TensorRT, on Triton Inference Server.

## Step 1: Optimize your model with TensorFlow-TensorRT

If you are unfamiliar with TensorFlow-TensorRT, please refer to this [video](https://www.youtube.com/watch?v=w7871kMiAs8&ab_channel=NVIDIADeveloper). The first step in this pipeline is to accelerate your model. If you use TensorFlow as your framework of choice for training, you can use either TensorRT or TensorFlow-TensorRT, depending on your model's operations.

To use TensorFlow-TensorRT, first pull the NGC TensorFlow Docker container, which comes with both TensorRT and TensorFlow-TensorRT installed. You may need to create an account and get the API key from [here](https://ngc.nvidia.com/setup/). Sign up and log in with your key (follow the instructions [here](https://ngc.nvidia.com/setup/api-key) after signing up).

```
# <xx.xx> is the yy.mm release tag of NVIDIA's TensorFlow
# container; e.g. 22.04

docker run -it --gpus all -v /path/to/this/folder:/resnet50_eg nvcr.io/nvidia/tensorflow:<xx.xx>-tf2-py3
```

We have already made a sample that uses TensorFlow-TensorRT: `tf_trt_resnet50.py`. This sample downloads a ResNet model from Keras and then optimizes it with TensorFlow-TensorRT. For more examples, visit the TF-TRT [GitHub repository](https://github.com/tensorflow/tensorrt).

```
python tf_trt_resnet50.py

# you can exit out of this container now
exit
```

## Step 2: Set Up Triton Inference Server

If you are new to the Triton Inference Server and want to learn more, we highly recommend checking out our [GitHub repository](https://github.com/triton-inference-server).

To use Triton, we need to make a model repository. The structure of the repository should look something like this:
```
model_repository
|
+-- resnet50
    |
    +-- config.pbtxt
    +-- 1
        |
        +-- model.savedmodel
            |
            +-- saved_model.pb
            +-- variables
                |
                +-- variables.data-00000-of-00001
                +-- variables.index
```
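
For reference, one way to build this layout out of the Step 1 output might look like the following (a minimal sketch; it assumes you are working in this example's folder, and it reuses the `resnet50_saved_model_TFTRT_FP32` directory written by `tf_trt_resnet50.py` and the `config.pbtxt` shipped with this demo, described below):

```
# create the version directory and copy the config and the TF-TRT SavedModel into place
mkdir -p model_repository/resnet50/1
cp config.pbtxt model_repository/resnet50/
cp -r resnet50_saved_model_TFTRT_FP32 model_repository/resnet50/1/model.savedmodel
```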

A sample configuration for the model is included with this demo as `config.pbtxt`. If you are new to Triton, we highly encourage you to check out this [section of our documentation](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md) for more details. Once you have the model repository set up, it is time to launch the Triton server! You can do that with the docker command below.
```
docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /full/path/to/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:<xx.yy>-py3 tritonserver --model-repository=/models --backend-config=tensorflow,version=2
```
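
Before moving on, you can optionally confirm that the server came up. Triton exposes HTTP health endpoints on the port mapped to 8000 above; for example, run the following from the host (assuming the default port mapping):

```
# returns HTTP 200 once the server is ready to serve inference requests
curl -v localhost:8000/v2/health/ready
```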

## Step 3: Using a Triton Client to Query the Server

Download an example image to test inference.

```
wget -O img1.jpg "https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg"
```

Install dependencies.
```
pip install --upgrade tensorflow
pip install pillow
pip install nvidia-pyindex
pip install tritonclient[all]
```

Run the client.
```
python3 triton_client.py
```

The output should look something like this:
```
[b'0.301167:90' b'0.169790:14' b'0.161309:92' b'0.093105:94'
b'0.058743:136' b'0.050185:11' b'0.033802:91' b'0.011760:88'
b'0.008309:989' b'0.004927:95' b'0.004905:13' b'0.004095:317'
b'0.004006:96' b'0.003694:12' b'0.003526:42' b'0.003390:313'
...
b'0.000001:751' b'0.000001:685' b'0.000001:408' b'0.000001:116'
b'0.000001:627' b'0.000001:933' b'0.000000:661' b'0.000000:148']
```
The output format here is `<confidence_score>:<classification_index>`. To learn how to map these to the label names and more, refer to our [documentation](https://github.com/triton-inference-server/server/blob/main/docs/protocol/extension_classification.md).
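
Each printed entry pairs a confidence score with a class index, so if you would rather work with the values programmatically than as raw bytes, they can simply be split on the colon. A minimal sketch of what could be appended to `triton_client.py` (the `entry`, `confidence`, and `class_index` names are illustrative, not part of the script):

```
# each element of test_output_fin looks like b'<confidence>:<class_index>'
for entry in test_output_fin.flatten():
    confidence, class_index = entry.decode().split(':')
    print(float(confidence), int(class_index))
```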
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# Copyright 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: "resnet50"
platform: "tensorflow_savedmodel"
max_batch_size: 0
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    dims: [ -1, 224, 224, 3 ]
  }
]
output [
  {
    name: "predictions"
    data_type: TYPE_FP32
    dims: [ -1, 1000 ]
  }
]
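
The `input_1` and `predictions` names and shapes above must match what the client sends and requests. If you want to double-check what Triton actually loaded, the HTTP client can report the model's metadata; a small sketch, assuming the server from Step 2 is running on localhost:8000:

```
import tritonclient.http as httpclient

# print the loaded model's input/output names, datatypes, and shapes
client = httpclient.InferenceServerClient(url="localhost:8000")
print(client.get_model_metadata(model_name="resnet50"))
```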
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
# Copyright 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt
from tensorflow.keras.applications.resnet50 import ResNet50

# Load the pretrained ResNet-50 from Keras and export it as a SavedModel
model = ResNet50(weights='imagenet')
model.save('resnet50_saved_model')

# Optimize the SavedModel with TF-TRT (FP32 precision)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='resnet50_saved_model',
    precision_mode=trt.TrtPrecisionMode.FP32,
    max_workspace_size_bytes=8000000000)
converter.convert()

# Save the optimized model
converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')
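
The script above only calls `convert()`, so TensorRT engines are built lazily the first time each input shape is seen at inference. If you prefer to pre-build engines for a representative shape, the same converter object also exposes `build()`, which takes a generator of sample inputs. A hedged sketch of what could be added between `convert()` and `save()` (not part of the committed script; the `input_fn` name and the random sample batch are illustrative):

```
def input_fn():
    # one representative batch matching the model's 224x224x3 input
    yield [tf.random.normal((1, 224, 224, 3), dtype=tf.float32)]

converter.build(input_fn=input_fn)
```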
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# Copyright 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input
import tensorflow as tf

import numpy as np
import tritonclient.http as httpclient
from tritonclient.utils import triton_to_np_dtype

# Load the test image and apply the same preprocessing ResNet-50 expects
def process_image(image_path="img1.jpg"):
    img = image.load_img(image_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    return preprocess_input(x)

transformed_img = process_image()

# Setting up client
triton_client = httpclient.InferenceServerClient(url="localhost:8000")

# Describe the model input and attach the preprocessed image
test_input = httpclient.InferInput("input_1", transformed_img.shape, datatype="FP32")
test_input.set_data_from_numpy(transformed_img, binary_data=True)

# Request the output as <confidence>:<class_index> pairs for the top 1000 classes
test_output = httpclient.InferRequestedOutput("predictions", binary_data=True, class_count=1000)

# Querying the server
results = triton_client.infer(model_name="resnet50", inputs=[test_input], outputs=[test_output])

test_output_fin = results.as_numpy('predictions')
print(test_output_fin)
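
One optional addition, not in the committed client: the HTTP client can confirm that the server and model are ready before sending the request, which gives a clearer failure mode than an error from `infer()`. A minimal sketch reusing the `triton_client` object from above:

```
# fail fast if the server or the resnet50 model has not finished loading
if not triton_client.is_server_ready() or not triton_client.is_model_ready("resnet50"):
    raise RuntimeError("Triton server or resnet50 model is not ready")
```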
