Commit 6ba95da: add openvino benchmark (parent 24f732f)

5 files changed, +552 -69 lines
README.md

Lines changed: 31 additions & 4 deletions
@@ -1,6 +1,7 @@

# [How to train an object detection model easy for free](https://www.dlology.com/blog/how-to-train-an-object-detection-model-easy-for-free/) | DLology Blog

## How to Run

Easy way: run [this Colab Notebook](https://colab.research.google.com/github/Tony607/object_detection_demo/blob/master/tensorflow_object_detection_training_colab.ipynb).
@@ -25,7 +26,7 @@ python resize_images.py --raw-dir ./data/raw --save-dir ./data/images --ext jpg

Resized images are located in `./data/images/`

- Train/test split those files into two directories, `./data/images/train` and `./data/images/test` (one way to script the split is sketched below)
- Annotate the resized images with [labelImg](https://tzutalin.github.io/labelImg/), generating `xml` files inside the `./data/images/train` and `./data/images/test` folders.

*Tips: use shortcuts (`w`: draw box, `d`: next file, `a`: previous file, etc.) to accelerate the annotation.*
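
One way to do the train/test split, as a minimal sketch (the 80/20 ratio, the fixed seed, and the `.jpg` extension are illustrative assumptions, not part of this repo):

```
import os
import random
import shutil

random.seed(1)
src = "./data/images"
files = [f for f in os.listdir(src) if f.endswith(".jpg")]
random.shuffle(files)
n_train = int(0.8 * len(files))  # 80% train, 20% test
for subdir, subset in (("train", files[:n_train]), ("test", files[n_train:])):
    os.makedirs(os.path.join(src, subdir), exist_ok=True)
    for fname in subset:
        shutil.move(os.path.join(src, fname), os.path.join(src, subdir, fname))
```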

@@ -39,21 +40,47 @@ Resized images are located in `./data/images/`

## How to run inference on frozen TensorFlow graph

Requirements:
- `frozen_inference_graph.pb`: the frozen TensorFlow object detection model, downloaded from Colab after training.
- `label_map.pbtxt`: the file used to map a predicted class index to its correct class name, also downloaded from Colab after training.

You can also opt to download my [copy](https://github.com/Tony607/REPO/releases/download/V0.1/checkpoint.zip) of those files from the GitHub Release page.

Run the following Jupyter notebook locally.
```
local_inference_test.ipynb
```

# [How to run TensorFlow object detection model faster with Intel Graphics](https://www.dlology.com/blog/how-to-run-tensorflow-object-detection-model-faster-with-intel-graphics/) | DLology Blog

## How to deploy the trained custom object detection model with OpenVINO

Requirements:
- The frozen TensorFlow object detection model, i.e. `frozen_inference_graph.pb`, downloaded from Colab after training.
- The modified pipeline config file used for training, also downloaded from Colab after training.

You can also opt to download my [copy](https://github.com/Tony607/REPO/releases/download/V0.1/checkpoint.zip) of those files from the GitHub Release page.

Run the following Jupyter notebook locally and follow the instructions inside.
```
deploy/openvino_convert_tf_object_detection.ipynb
```
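
For orientation, the conversion notebook drives OpenVINO's Model Optimizer. A typical `mo_tf.py` invocation for an SSD frozen graph in the `computer_vision_sdk`-era toolkit looked roughly like the sketch below; the support-config path and output directory are assumptions to check against your own install, not commands from this commit.

```
python mo_tf.py \
    --input_model frozen_inference_graph.pb \
    --tensorflow_object_detection_api_pipeline_config pipeline.config \
    --tensorflow_use_custom_operations_config <sdk>/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
    --data_type FP16 \
    --output_dir ./models/ssd_mobilenet_v2_custom_trained/FP16
```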
## Run the benchmark

Examples:

Benchmark SSD MobileNet V2 on GPU with FP16 quantized weights.
```
cd ./deploy
python openvino_inference_benchmark.py \
    --model-dir ./models/ssd_mobilenet_v2_custom_trained/FP16 \
    --device GPU \
    --data-type FP16 \
    --img ../test/15.jpg
```
TensorFlow benchmark on CPU.
```
python local_inference_test.py \
    --model ./models/frozen_inference_graph.pb \
    --img ./test/15.jpg \
    --cpu
```
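
Both benchmark scripts time 20 inference runs and report the mean latency and derived throughput, printing a line of the form `average(sec):0.025,fps:40.00` (the numbers here are illustrative, not measured results).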

deploy/openvino_convert_tf_object_detection.ipynb

Lines changed: 75 additions & 58 deletions
Large diffs are not rendered by default.
deploy/openvino_inference_benchmark.py

Lines changed: 160 additions & 0 deletions
"""
## Example to benchmark SSD MobileNet V2 on the Neural Compute Stick.
```
python openvino_inference_benchmark.py\
    --model-dir ./models/ssd_mobilenet_v2_custom_trained/FP16\
    --device MYRIAD\
    --data-type FP16\
    --img ../test/15.jpg
```
"""
import os
import sys
import time
import glob
import platform

import numpy as np
from PIL import Image

# Check that a path like C:\Intel\computer_vision_sdk\python\python3.5 or
# ~/intel/computer_vision_sdk/python/python3.5 exists in PYTHONPATH.
is_win = "windows" in platform.platform().lower()
if is_win:
    message = "Please run `C:\\Intel\\computer_vision_sdk\\bin\\setupvars.bat` before running this."
else:
    message = "Add the following line to ~/.bashrc and re-run.\nsource ~/intel/computer_vision_sdk/bin/setupvars.sh"

assert "computer_vision_sdk" in os.environ["PYTHONPATH"], message

try:
    from openvino import inference_engine as ie
    from openvino.inference_engine import IENetwork, IEPlugin
except Exception as e:
    exception_type = type(e).__name__
    print(
        "The following error happened while importing Python API module:\n[ {} ] {}".format(
            exception_type, e
        )
    )
    sys.exit(1)


def pre_process_image(imagePath, img_shape):
    """Pre-process an image from an image path.

    Arguments:
        imagePath {str} -- Input image file path.
        img_shape {tuple} -- Target height and width as a tuple.

    Returns:
        np.array -- Preprocessed image in NCHW layout, plus the original image as an array.
    """
    # Model input format.
    assert isinstance(img_shape, tuple) and len(img_shape) == 2

    n, c, h, w = [1, 3, img_shape[0], img_shape[1]]
    image = Image.open(imagePath)
    # PIL's resize takes (width, height); h == w here, but keep the correct order.
    processed_img = image.resize((w, h), resample=Image.BILINEAR)

    processed_img = np.array(processed_img).astype(np.uint8)

    # Change data layout from HWC to CHW, then add the batch dimension.
    processed_img = processed_img.transpose((2, 0, 1))
    processed_img = processed_img.reshape((n, c, h, w))

    return processed_img, np.array(image)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="OpenVINO inference speed benchmark.")
    parser.add_argument(
        "--model-dir",
        help="Directory where the OpenVINO IR .xml and .bin files exist.",
        type=str,
    )
    parser.add_argument(
        "--device", help="Device to run inference: GPU, CPU or MYRIAD.", type=str
    )
    parser.add_argument(
        "--data-type", help="Model data type: FP16 or FP32.", type=str, default=None
    )
    parser.add_argument("--img", help="Path to a sample image to inference.", type=str)
    args = parser.parse_args()

    # Directory with the model .xml and .bin files.
    output_dir = args.model_dir
    assert os.path.isdir(output_dir), "`{}` does not exist".format(output_dir)

    # Devices: GPU (Intel), CPU or MYRIAD.
    plugin_device = args.device
    data_type = args.data_type
    # The converted model takes a fixed-size image as input,
    # so we simply use the same size for image width and height.
    img_height = 300

    DATA_TYPE_MAP = {"GPU": "FP16", "CPU": "FP32", "MYRIAD": "FP16"}
    assert (
        plugin_device in DATA_TYPE_MAP
    ), "Unsupported device: `{}`, not found in `{}`".format(
        plugin_device, list(DATA_TYPE_MAP.keys())
    )

    if data_type is None:
        data_type = DATA_TYPE_MAP.get(plugin_device)

    # Path to a sample image to inference.
    img_fname = args.img
    assert os.path.isfile(img_fname)

    # Plugin initialization for the specified device; load extensions library if specified.
    plugin_dir = None
    model_xml = glob.glob(os.path.join(output_dir, "*.xml"))[-1]
    model_bin = glob.glob(os.path.join(output_dir, "*.bin"))[-1]
    plugin = IEPlugin(plugin_device, plugin_dirs=plugin_dir)
    # Read the IR.
    net = IENetwork(model=model_xml, weights=model_bin)
    assert len(net.inputs.keys()) == 1
    assert len(net.outputs) == 1
    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))
    # Load the network to the plugin.
    exec_net = plugin.load(network=net)
    del net

    # Run one inference to sanity-check the output.
    img_shape = (img_height, img_height)
    processed_img, image = pre_process_image(img_fname, img_shape)
    res = exec_net.infer(inputs={input_blob: processed_img})

    print(res["DetectionOutput"].shape)

    probability_threshold = 0.5
    preds = [
        pred for pred in res["DetectionOutput"][0][0] if pred[2] > probability_threshold
    ]

    for pred in preds:
        class_label = pred[1]
        probability = pred[2]
        print(
            "Predict class label:{}, with probability: {}".format(
                class_label, probability
            )
        )

    # Benchmark: average the latency of 20 timed runs.
    times = []
    for i in range(20):
        start_time = time.time()
        res = exec_net.infer(inputs={input_blob: processed_img})
        delta = time.time() - start_time
        times.append(delta)
    mean_delta = np.array(times).mean()
    fps = 1 / mean_delta
    print("average(sec):{:.3f},fps:{:.2f}".format(mean_delta, fps))
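
For reference, the `pred` rows filtered above follow OpenVINO's SSD `DetectionOutput` layout: a `[1, 1, N, 7]` tensor whose rows are `[image_id, class_label, confidence, xmin, ymin, xmax, ymax]`, with coordinates normalized to `[0, 1]`. A hypothetical helper (not part of this commit) to turn one row into pixel coordinates:

```
def decode_prediction(pred, img_width, img_height):
    # pred is one row of res["DetectionOutput"][0][0].
    _, label, conf, xmin, ymin, xmax, ymax = pred
    box = (
        int(xmin * img_width),
        int(ymin * img_height),
        int(xmax * img_width),
        int(ymax * img_height),
    )
    return int(label), float(conf), box
```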

local_inference_test.ipynb

Lines changed: 130 additions & 7 deletions
Large diffs are not rendered by default.

local_inference_test.py

Lines changed: 156 additions & 0 deletions
#!/usr/bin/env python
# coding: utf-8

import os
import time

import numpy as np
import tensorflow as tf
from PIL import Image
from object_detection.utils import ops as utils_ops


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(
        description="TensorFlow inference speed benchmark for object detection model."
    )
    parser.add_argument(
        "--model",
        help="Path to the frozen graph .pb file.",
        type=str,
        default="./models/frozen_inference_graph.pb",
    )
    parser.add_argument(
        "--cpu", help="Force to use CPU during inference.", action="store_true"
    )
    parser.add_argument("--img", help="Path to a sample image to inference.", type=str)
    args = parser.parse_args()

    # Path to the frozen detection graph. This is the actual model that is used for the object detection.
    PATH_TO_CKPT = args.model

    image_path = args.img

    assert os.path.isfile(PATH_TO_CKPT)
    assert os.path.isfile(image_path)

    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_CKPT, "rb") as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name="")

    def load_image_into_numpy_array(image):
        (im_width, im_height) = image.size
        return (
            np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
        )

    def run_inference_benchmark(image, graph, trial=20, gpu=True):
        """Run TensorFlow inference benchmark.

        Arguments:
            image {np.array} -- Input image as a NumPy array.
            graph {tf.Graph} -- TensorFlow graph object.

        Keyword Arguments:
            trial {int} -- Number of inferences to run for averaging. (default: {20})
            gpu {bool} -- Use Nvidia GPU when available. (default: {True})

        Returns:
            float -- Frames per second benchmark result.
        """
        with graph.as_default():
            if gpu:
                config = tf.ConfigProto()
            else:
                # Hide the GPU so TensorFlow falls back to the CPU.
                config = tf.ConfigProto(device_count={"GPU": 0})
            with tf.Session(config=config) as sess:
                # Get handles to input and output tensors.
                ops = tf.get_default_graph().get_operations()
                all_tensor_names = {output.name for op in ops for output in op.outputs}
                tensor_dict = {}
                for key in [
                    "num_detections",
                    "detection_boxes",
                    "detection_scores",
                    "detection_classes",
                    "detection_masks",
                ]:
                    tensor_name = key + ":0"
                    if tensor_name in all_tensor_names:
                        tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                            tensor_name
                        )
                if "detection_masks" in tensor_dict:
                    # The following processing is only for a single image.
                    detection_boxes = tf.squeeze(tensor_dict["detection_boxes"], [0])
                    detection_masks = tf.squeeze(tensor_dict["detection_masks"], [0])
                    # Reframe is required to translate the mask from box coordinates to image coordinates and fit the image size.
                    real_num_detection = tf.cast(
                        tensor_dict["num_detections"][0], tf.int32
                    )
                    detection_boxes = tf.slice(
                        detection_boxes, [0, 0], [real_num_detection, -1]
                    )
                    detection_masks = tf.slice(
                        detection_masks, [0, 0, 0], [real_num_detection, -1, -1]
                    )
                    detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                        detection_masks, detection_boxes, image.shape[0], image.shape[1]
                    )
                    detection_masks_reframed = tf.cast(
                        tf.greater(detection_masks_reframed, 0.5), tf.uint8
                    )
                    # Follow the convention by adding back the batch dimension.
                    tensor_dict["detection_masks"] = tf.expand_dims(
                        detection_masks_reframed, 0
                    )
                image_tensor = tf.get_default_graph().get_tensor_by_name(
                    "image_tensor:0"
                )

                # Run inference.
                times = []
                # Kick-start with one untimed warm-up inference, which takes longer than the following ones.
                output_dict = sess.run(
                    tensor_dict, feed_dict={image_tensor: np.expand_dims(image, 0)}
                )
                for i in range(trial):
                    start_time = time.time()
                    output_dict = sess.run(
                        tensor_dict, feed_dict={image_tensor: np.expand_dims(image, 0)}
                    )
                    delta = time.time() - start_time
                    times.append(delta)
                mean_delta = np.array(times).mean()
                fps = 1 / mean_delta
                print("average(sec):{:.3f},fps:{:.2f}".format(mean_delta, fps))

                return fps

    image = Image.open(image_path)
    # The array-based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Actual detection benchmark.
    fps = run_inference_benchmark(image_np, detection_graph, trial=20, gpu=not args.cpu)
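
Note that `run_inference_benchmark` deliberately runs one untimed warm-up inference before the timing loop: the first `sess.run` call includes one-time graph optimization and memory allocation overhead, so including it would skew the 20-run average that the script reports.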
