
Commit 430811a

Adding scripts for TensorRT (#3808)

* Adding scripts for TensorRT
* Fixing README link
* Responding to CR

1 parent e52a22c commit 430811a

File tree

4 files changed: +767 -0 lines changed


research/tensorrt/README.md

Lines changed: 146 additions & 0 deletions
# Running the TensorFlow Official ResNet with TensorRT

[TensorRT](https://developer.nvidia.com/tensorrt) is NVIDIA's inference
optimizer for deep learning. Briefly, TensorRT rewrites parts of the
execution graph to allow for faster prediction times.

Here we provide a sample script that can:

1. Convert a TensorFlow SavedModel to a Frozen Graph.
2. Load a Frozen Graph for inference.
3. Time inference loops using the native TensorFlow graph.
4. Time inference loops using FP32, FP16, or INT8<sup>1</sup> precision modes from TensorRT.

We provide some results below, as well as instructions for running this script.

<sup>1</sup> INT8 mode is a work in progress; please see [INT8 Mode is the Bleeding Edge](#int8-mode-is-the-bleeding-edge) below.

## How to Run This Script

### Step 1: Install Prerequisites

1. [Install TensorFlow.](https://www.tensorflow.org/install/)
2. [Install TensorRT.](http://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html)
3. We use the image processing functions from the
[Official version of ResNet](/official/resnet/imagenet_preprocessing.py).
Please check out the Models repository if you haven't already, and add the
Official Models to your Python path:

```
git clone https://github.com/tensorflow/models.git
export PYTHONPATH="$PYTHONPATH:/path/to/models"
```
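If you want, you can confirm the setup at this point by checking that both the
contrib TensorRT integration and the Official Models are importable. This is a
quick sanity check, not part of the original instructions, and it assumes a
GPU build of TensorFlow that ships `tensorflow.contrib.tensorrt`:

```
# Both commands should exit silently if the prerequisites are in place.
python -c "import tensorflow.contrib.tensorrt"
python -c "from official.resnet import imagenet_preprocessing"
```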

### Step 2: Get a model to test

The provided script runs with the [Official version of ResNet trained with
ImageNet data](/official/resnet), but can be used for other models as well,
as long as you have a SavedModel or a Frozen Graph.

You can download the ResNetv2-ImageNet [SavedModel](http://download.tensorflow.org/models/official/resnetv2_imagenet_savedmodel.tar.gz)
or [Frozen Graph](http://download.tensorflow.org/models/official/resnetv2_imagenet_frozen_graph.pb),
or, if you want to train the model yourself,
pass `--export_dir` to the Official ResNet [imagenet_main.py](/official/resnet/imagenet_main.py).
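For example, assuming `wget` is available, you can fetch the pretrained graphs
directly from the links above:

```
wget http://download.tensorflow.org/models/official/resnetv2_imagenet_frozen_graph.pb
wget http://download.tensorflow.org/models/official/resnetv2_imagenet_savedmodel.tar.gz
```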

When running this script, you can pass in a SavedModel directory containing the
Protobuf MetaGraphDef and variables directory to `savedmodel_dir`, or pass in
a Protobuf frozen graph file directly to `frozen_graph`. If you downloaded the
SavedModel linked above, note that you should untar it before passing it to the
script.
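A minimal sketch of the SavedModel path, with the extracted directory name
left as a placeholder since it depends on how the archive unpacks:

```
# Unpack the downloaded archive, then point the script at the resulting
# SavedModel directory (replace the placeholder with the actual name).
tar -xzf resnetv2_imagenet_savedmodel.tar.gz
python tensorrt.py --savedmodel_dir=<extracted_savedmodel_dir> \
  --native --fp32 --output_dir=/my/output
```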

### Step 3: Get an image to test

The script can accept a JPEG image file to use for predictions. If none is
provided, random data will be generated. We provide a sample `image.jpg` here,
which can be passed in with the `--image_file` flag.

### Step 4: Run the model

You have TensorFlow, TensorRT, a graph def, and a picture.
Now it's time to time.

For the full set of possible parameters, you can run
`python tensorrt.py --help`. Assuming you used the files provided above,
you would run:

```
python tensorrt.py --frozen_graph=resnetv2_imagenet_frozen_graph.pb \
  --image_file=image.jpg --native --fp32 --fp16 --output_dir=/my/output
```

This will print the predictions for each of the precision modes that were run
(native, which is the precision of the model as passed in, as well as the
TensorRT versions of the graph at FP32 and FP16 precision):

```
INFO:tensorflow:Starting timing.
INFO:tensorflow:Timing loop done!
Predictions:
Precision: native [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus']
Precision: FP32 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'lakeside, lakeshore', u'sandbar, sand bar', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty']
```

The script will generate or append to a file in the output_dir, `log.txt`,
which includes the timing information for each of the models:

```
==========================
network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1041.4, mean: 1056.6, uncertainty: 2.8, jitter: 6.1
latency median: 0.12292, mean: 0.12123, 99th_p: 0.13151, 99th_uncertainty: 0.00024

==========================
network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1253.0, mean: 1250.8, uncertainty: 3.4, jitter: 17.3
latency median: 0.10215, mean: 0.10241, 99th_p: 0.11482, 99th_uncertainty: 0.01109

==========================
network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 2280.2, mean: 2312.8, uncertainty: 10.3, jitter: 100.1
latency median: 0.05614, mean: 0.05546, 99th_p: 0.06103, 99th_uncertainty: 0.00781

```

The script will also output the GraphDefs used for each of the modes run,
for future use and inspection:

```
ls /my/output
log.txt
tftrt_fp16_imagenet_frozen_graph.pb
tftrt_fp32_imagenet_frozen_graph.pb
```
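One possible future use, assuming the converted ops remain loadable in your
TensorFlow and TensorRT installation, is to pass a previously converted graph
straight back into the script for a later timing run without reconverting
(illustrative only):

```
# Illustrative: re-time the previously converted FP16 graph as-is.
python tensorrt.py --frozen_graph=/my/output/tftrt_fp16_imagenet_frozen_graph.pb \
  --image_file=image.jpg --native --output_dir=/my/output
```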

## Troubleshooting and Notes

### INT8 Mode is the Bleeding Edge

Note that currently, INT8 mode results in a segfault when using the models
provided. We are working on it.

```
E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Parameter check failed at: Network.cpp::addScale::118, condition: shift.count == 0 || shift.count == weightCount
Segmentation fault (core dumped)
```

### GPU/Precision Compatibility

Not all GPUs support the ops required for all precisions. For example, P100s
cannot currently run INT8 precision.
129+
130+
### Label Offsets
131+
132+
Some ResNet models represent 1000 categories, and some represent all 1001, with
133+
the 0th category being "background". The models provided are of the latter type.
134+
If you are using a different model and find that your predictions seem slightly
135+
off, try passing in the `--ids_are_one_indexed` arg, which adjusts the label
136+
alignment for models with only 1000 categories.
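For example, with a hypothetical 1000-category frozen graph of your own, the
invocation might look like:

```
# my_1000_class_resnet.pb is a placeholder for your own 1000-category model.
python tensorrt.py --frozen_graph=my_1000_class_resnet.pb \
  --image_file=image.jpg --native --fp32 --ids_are_one_indexed
```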

## Model Links

[ResNet-v2-ImageNet Frozen Graph](http://download.tensorflow.org/models/official/resnetv2_imagenet_frozen_graph.pb)

[ResNet-v2-ImageNet SavedModel](http://download.tensorflow.org/models/official/resnetv2_imagenet_savedmodel.tar.gz)

[ResNet-v1-ImageNet Frozen Graph](http://download.tensorflow.org/models/official/resnetv1_imagenet_frozen_graph.pb)

[ResNet-v1-ImageNet SavedModel](http://download.tensorflow.org/models/official/resnetv1_imagenet_savedmodel.tar.gz)

research/tensorrt/image.jpg

41.8 KB

research/tensorrt/labellist.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.
