
Commit 7c1677f

Merge pull request #1744 from eizamaliev/model_runner
Unify Python's object detection demos
2 parents: 2390944 + bf61611

30 files changed: +1313 additions, -2200 deletions

demos/README.md

Lines changed: 3 additions & 4 deletions

@@ -27,9 +27,8 @@ The Open Model Zoo includes the following demos:
 - [Monodepth Python\* Demo](./python_demos/monodepth_demo/README.md) - The demo demonstrates how to run monocular depth estimation models.
 - [Multi-Camera Multi-Target Tracking Python\* Demo](./python_demos/multi_camera_multi_target_tracking/README.md) - Demo application for tracking multiple targets (persons or vehicles) on multiple cameras.
 - [Multi-Channel C++ Demos](./multi_channel/README.md) - Several demo applications for multi-channel scenarios.
-- [Object Detection for CenterNet Python\* Demo](./python_demos/object_detection_demo_centernet/README.md) - Demo application for the CenterNet object detection network.
+- [Object Detection Python\* Demo](./python_demos/object_detection_demo/README.md) - Demo application for several object detection model types (like SSD, YOLO, etc.).
 - [Object Detection for Faster R-CNN C++ Demo](./object_detection_demo_faster_rcnn/README.md) - Inference of object detection networks like Faster R-CNN (the demo supports only images as inputs).
-- [Object Detection for RetinaFace Python\* Demo](./python_demos/object_detection_demo_retinaface/README.md) - Demo application for the RetinaFace face detection model.
 - [Object Detection C++ Demo](./object_detection_demo/README.md) - Demo application for Object Detection networks (different model architectures are supported), async API showcase, simple OpenCV interoperability (supports video and camera inputs).
 - [Pedestrian Tracker C++ Demo](./pedestrian_tracker_demo/README.md) - Demo application for the pedestrian tracking scenario.
 - [Security Barrier Camera C++ Demo](./security_barrier_camera_demo/README.md) - Vehicle Detection followed by the Vehicle Attributes and License-Plate Recognition, supports images/video and camera inputs.
@@ -71,7 +70,7 @@ The table below shows the correlation between models, demos, and supported plugins:
 | person-reidentification-retail-0079 | [Crossroad Camera Demo](./crossroad_camera_demo/README.md)<br>[Multi-Camera Multi-Target Tracking Demo](./python_demos/multi_camera_multi_target_tracking/README.md) | Supported | Supported | Supported | Supported |
 | person-vehicle-bike-detection-crossroad-0078 | [Crossroad Camera Demo](./crossroad_camera_demo/README.md) | Supported | Supported | Supported | Supported |
 | person-vehicle-bike-detection-crossroad-1016 | [Crossroad Camera Demo](./crossroad_camera_demo/README.md) | Supported | | | |
-| person-vehicle-bike-detection-crossroad-yolov3-1020 | [Object Detection for YOLO V3 Python\* Demo](./python_demos/object_detection_demo_yolov3_async/README.md) | Supported | | | |
+| person-vehicle-bike-detection-crossroad-yolov3-1020 | [Object Detection Python\* Demo](./python_demos/object_detection_demo/README.md) | Supported | | | |
 | human-pose-estimation-0001 | [Human Pose Estimation Demo](./human_pose_estimation_demo/README.md)<br>[Human Pose Estimation Python\* Demo](./python_demos/human_pose_estimation_demo/README.md) | Supported | Supported | Supported | Supported |
 | human-pose-estimation-0002 | [Human Pose Estimation Python\* Demo](./python_demos/human_pose_estimation_demo/README.md) | Supported | Supported | | |
 | human-pose-estimation-0003 | [Human Pose Estimation Python\* Demo](./python_demos/human_pose_estimation_demo/README.md) | Supported | Supported | | |
@@ -116,7 +115,7 @@ The table below shows the correlation between models, demos, and supported plugins:
 | road-segmentation-adas-0001 | any demo that supports SSD\*-based models, above | Supported | Supported | Supported | Supported |
 | vehicle-detection-adas-binary-0001 | any demo that supports SSD\*-based models, above | Supported | Supported | | |
 | vehicle-detection-adas-0002 | any demo that supports SSD\*-based models, above | Supported | Supported | Supported | Supported |
-| yolo-v2-tiny-vehicle-detection-0001 | [Object Detection for YOLO V3 Python\* Demo](./python_demos/object_detection_demo_yolov3_async/README.md) | Supported | | | |
+| yolo-v2-tiny-vehicle-detection-0001 | [Object Detection Python\* Demo](./python_demos/object_detection_demo/README.md) | Supported | | | |
 
 Notice that the FPGA support comes through a [heterogeneous execution](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_supported_plugins_HETERO.html), for example, when the post-processing is happening on the CPU.

demos/python_demos/common/models/__init__.py (new file)

Lines changed: 22 additions & 0 deletions

@@ -0,0 +1,22 @@
+"""
+ Copyright (C) 2020 Intel Corporation
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+"""
+
+
+from .ssd import SSD
+from .yolo import YOLO
+from .faceboxes import FaceBoxes
+from .centernet import CenterNet
+from .retinaface import RetinaFace
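
This new package is what the unified demo drives: every wrapper exposes the same `preprocess`/`postprocess` pair around a shared `Model` base class (not shown in this diff). The sketch below shows how one wrapper might be used synchronously; the model and image paths, the explicit `load_network` call, and the `Detection` field names are assumptions inferred from the CenterNet code further down, not part of this commit (the real demo uses an async pipeline).

```python
# Minimal usage sketch (not part of the commit), assuming
# demos/python_demos/common is on sys.path.
import cv2
from openvino.inference_engine import IECore

from models import CenterNet  # the package added above

ie = IECore()
detector = CenterNet(ie, 'centernet.xml', threshold=0.3)  # hypothetical model path
exec_net = ie.load_network(detector.net, 'CPU')           # loading happens outside the wrapper

frame = cv2.imread('input.jpg')                           # hypothetical input image
inputs, meta = detector.preprocess(frame)                 # affine warp to (w, h) + CHW layout
raw_outputs = exec_net.infer(inputs=inputs)               # {output_name: ndarray}; the (3, h, w)
                                                          # array is assumed to fill the (1, 3, h, w) blob
detections = detector.postprocess(raw_outputs, meta)      # list of Detection objects
for det in detections:
    print(det.score, det.id)                              # field names assumed from the diff
```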

demos/python_demos/object_detection_demo_centernet/detector.py renamed to demos/python_demos/common/models/centernet.py

Lines changed: 73 additions & 78 deletions
@@ -1,5 +1,5 @@
 """
- Copyright (c) 2019 Intel Corporation
+ Copyright (c) 2019-2020 Intel Corporation
 
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -14,28 +14,80 @@
 limitations under the License.
 """
 
-import os
 import cv2
 import numpy as np
 from numpy.lib.stride_tricks import as_strided
 
+from .model import Model
+from .utils import Detection, load_labels
 
-class Detector(object):
-    def __init__(self, ie, model_path, threshold=0.3, device='CPU'):
-        model = ie.read_network(model_path, os.path.splitext(model_path)[0] + '.bin')
-
-        assert len(model.input_info) == 1, "Expected 1 input blob"
-        assert len(model.outputs) == 3, "Expected 3 output blobs"
+class CenterNet(Model):
+    def __init__(self, ie, model_path, labels=None, threshold=0.3):
+        super().__init__(ie, model_path)
 
-        self._input_layer_name = next(iter(model.input_info))
-        self._output_layer_names = sorted(model.outputs)
+        assert len(self.net.input_info) == 1, "Expected 1 input blob"
+        assert len(self.net.outputs) == 3, "Expected 3 output blobs"
+
+        if isinstance(labels, (list, tuple)):
+            self.labels = labels
+        else:
+            self.labels = load_labels(labels) if labels else None
+
+        self.image_blob_name = next(iter(self.net.input_info))
+        self._output_layer_names = sorted(self.net.outputs)
 
-        self._ie = ie
-        self._exec_model = self._ie.load_network(model, device)
         self._threshold = threshold
-        self.infer_time = -1
-        _, channels, self.input_height, self.input_width = model.input_info[self._input_layer_name].input_data.shape
-        assert channels == 3, "Expected 3-channel input"
+
+        self.n, self.c, self.h, self.w = self.net.input_info[self.image_blob_name].input_data.shape
+        assert self.c == 3, "Expected 3-channel input"
+
+    def preprocess(self, inputs):
+        image = inputs
+        meta = {'original_shape': image.shape}
+
+        height, width = image.shape[0:2]
+        center = np.array([width / 2., height / 2.], dtype=np.float32)
+        scale = max(height, width)
+        trans_input = self.get_affine_transform(center, scale, 0, [self.w, self.h])
+        resized_image = cv2.warpAffine(image, trans_input, (self.w, self.h), flags=cv2.INTER_LINEAR)
+        resized_image = np.transpose(resized_image, (2, 0, 1))
+
+        dict_inputs = {self.image_blob_name: resized_image}
+        return dict_inputs, meta
+
+    def postprocess(self, outputs, meta):
+        heat = outputs[self._output_layer_names[0]][0]
+        reg = outputs[self._output_layer_names[1]][0]
+        wh = outputs[self._output_layer_names[2]][0]
+        heat = np.exp(heat)/(1 + np.exp(heat))
+        height, width = heat.shape[1:3]
+        num_predictions = 100
+
+        heat = self._nms(heat)
+        scores, inds, clses, ys, xs = self._topk(heat, K=num_predictions)
+        reg = self._tranpose_and_gather_feat(reg, inds)
+
+        reg = reg.reshape((num_predictions, 2))
+        xs = xs.reshape((num_predictions, 1)) + reg[:, 0:1]
+        ys = ys.reshape((num_predictions, 1)) + reg[:, 1:2]
+
+        wh = self._tranpose_and_gather_feat(wh, inds)
+        wh = wh.reshape((num_predictions, 2))
+        clses = clses.reshape((num_predictions, 1))
+        scores = scores.reshape((num_predictions, 1))
+        bboxes = np.concatenate((xs - wh[..., 0:1] / 2,
+                                 ys - wh[..., 1:2] / 2,
+                                 xs + wh[..., 0:1] / 2,
+                                 ys + wh[..., 1:2] / 2), axis=1)
+        detections = np.concatenate((bboxes, scores, clses), axis=1)
+        mask = detections[..., 4] >= self._threshold
+        filtered_detections = detections[mask]
+        scale = max(meta['original_shape'])
+        center = np.array(meta['original_shape'][:2])/2.0
+        dets = self._transform(filtered_detections, np.flip(center, 0), scale, height, width)
+        dets = [Detection(x[0], x[1], x[2], x[3], score=x[4], id=x[5]) for x in dets]
+        return dets
 
     @staticmethod
     def get_affine_transform(center, scale, rot, output_size, inv=False):
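
The box decode in `postprocess` above turns each heatmap peak into a box: the peak's cell coordinates `(xs, ys)` are refined by the regressed offset `reg`, and the regressed size `wh` is split evenly around that center. A tiny worked example with assumed values:

```python
# One peak, following the bbox math above (all values assumed):
xs, ys = 10 + 0.3, 7 + 0.6       # peak cell (10, 7) refined by offset reg = (0.3, 0.6)
w, h = 4.0, 6.0                  # regressed size wh
box = (xs - w / 2, ys - h / 2,   # top-left  -> (8.3, 4.6)
       xs + w / 2, ys + h / 2)   # bottom-right -> (12.3, 10.6)
```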
@@ -89,7 +141,7 @@ def _gather_feat(feat, ind):
     def _tranpose_and_gather_feat(feat, ind):
         feat = np.transpose(feat, (1, 2, 0))
         feat = feat.reshape((-1, feat.shape[2]))
-        feat = Detector._gather_feat(feat, ind)
+        feat = CenterNet._gather_feat(feat, ind)
         return feat
 
     @staticmethod
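
`_tranpose_and_gather_feat` moves the channel axis last, flattens the spatial grid, and picks the rows at the top-K flat indices. A small illustration on toy data, assuming `_gather_feat` reduces to a plain row gather here:

```python
import numpy as np

feat = np.arange(24).reshape(2, 3, 4)                # (C=2, H=3, W=4) feature map
ind = np.array([0, 5, 11])                           # flat spatial indices of 3 peaks
rows = np.transpose(feat, (1, 2, 0)).reshape(-1, 2)  # (H*W, C)
print(rows[ind])                                     # per-peak channel vectors:
# [[ 0 12]
#  [ 5 17]
#  [11 23]]
```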
@@ -107,10 +159,10 @@ def _topk(scores, K=40):
         topk_ind = np.argpartition(topk_scores, -K)[-K:]
         topk_score = topk_scores[topk_ind]
         topk_clses = topk_ind / K
-        topk_inds = Detector._gather_feat(
+        topk_inds = CenterNet._gather_feat(
             topk_inds.reshape((-1, 1)), topk_ind).reshape((K))
-        topk_ys = Detector._gather_feat(topk_ys.reshape((-1, 1)), topk_ind).reshape((K))
-        topk_xs = Detector._gather_feat(topk_xs.reshape((-1, 1)), topk_ind).reshape((K))
+        topk_ys = CenterNet._gather_feat(topk_ys.reshape((-1, 1)), topk_ind).reshape((K))
+        topk_xs = CenterNet._gather_feat(topk_xs.reshape((-1, 1)), topk_ind).reshape((K))
 
         return topk_score, topk_inds, topk_clses, topk_ys, topk_xs
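
`_topk` relies on `np.argpartition`, which selects the K largest entries in O(n) without fully sorting; the order within the selected K is unspecified. For example:

```python
import numpy as np

scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2])
K = 2
idx = np.argpartition(scores, -K)[-K:]  # indices of the 2 largest scores
print(idx, scores[idx])                 # e.g. [3 1] [0.7 0.9] (order not guaranteed)
```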

@@ -142,72 +194,15 @@ def affine_transform(pt, t):
             return new_pt[:2]
 
         target_coords = np.zeros(coords.shape)
-        trans = Detector.get_affine_transform(center, scale, 0, output_size, inv=True)
+        trans = CenterNet.get_affine_transform(center, scale, 0, output_size, inv=True)
         for p in range(coords.shape[0]):
             target_coords[p, 0:2] = affine_transform(coords[p, 0:2], trans)
         return target_coords
 
     @staticmethod
     def _transform(dets, center, scale, height, width):
-        dets[:, :2] = Detector._transform_preds(
+        dets[:, :2] = CenterNet._transform_preds(
             dets[:, 0:2], center, scale, (width, height))
-        dets[:, 2:4] = Detector._transform_preds(
+        dets[:, 2:4] = CenterNet._transform_preds(
             dets[:, 2:4], center, scale, (width, height))
         return dets
-
-    def preprocess(self, image):
-        height, width = image.shape[0:2]
-        center = np.array([width / 2., height / 2.], dtype=np.float32)
-        scale = max(height, width)
-
-        trans_input = self.get_affine_transform(center, scale, 0, [self.input_width, self.input_height])
-        resized_image = cv2.resize(image, (width, height))
-        inp_image = cv2.warpAffine(
-            resized_image, trans_input, (self.input_width, self.input_height),
-            flags=cv2.INTER_LINEAR)
-
-        return inp_image
-
-    def postprocess(self, raw_output, image_sizes):
-        heat, reg, wh = raw_output
-        heat = heat = np.exp(heat)/(1 + np.exp(heat))
-        height, width = heat.shape[1:3]
-        num_predictions = 100
-
-        heat = self._nms(heat)
-        scores, inds, clses, ys, xs = self._topk(heat, K=num_predictions)
-        reg = self._tranpose_and_gather_feat(reg, inds)
-
-        reg = reg.reshape((num_predictions, 2))
-        xs = xs.reshape((num_predictions, 1)) + reg[:, 0:1]
-        ys = ys.reshape((num_predictions, 1)) + reg[:, 1:2]
-
-        wh = self._tranpose_and_gather_feat(wh, inds)
-        wh = wh.reshape((num_predictions, 2))
-        clses = clses.reshape((num_predictions, 1))
-        scores = scores.reshape((num_predictions, 1))
-        bboxes = np.concatenate((xs - wh[..., 0:1] / 2,
-                                 ys - wh[..., 1:2] / 2,
-                                 xs + wh[..., 0:1] / 2,
-                                 ys + wh[..., 1:2] / 2), axis=1)
-        detections = np.concatenate((bboxes, scores, clses), axis=1)
-        mask = detections[..., 4] >= self._threshold
-        filtered_detections = detections[mask]
-        scale = max(image_sizes)
-        center = np.array(image_sizes[:2])/2.0
-        dets = self._transform(filtered_detections, np.flip(center, 0), scale, height, width)
-        return dets
-
-    def infer(self, image):
-        t0 = cv2.getTickCount()
-        output = self._exec_model.infer(inputs={self._input_layer_name: image})
-        self.infer_time = (cv2.getTickCount() - t0) / cv2.getTickFrequency()
-        return output
-
-    def detect(self, image):
-        image_sizes = image.shape[:2]
-        image = self.preprocess(image)
-        image = np.transpose(image, (2, 0, 1))
-        output = self.infer(image)
-        detections = self.postprocess([output[name][0] for name in self._output_layer_names], image_sizes)
-        return detections
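
One thing the rename did not change: `postprocess` still applies the sigmoid as `np.exp(heat)/(1 + np.exp(heat))`, which overflows to `inf/inf = nan` for large positive logits. A numerically stable variant (a sketch, not part of this commit) would be:

```python
import numpy as np

def stable_sigmoid(x):
    # Split by sign so np.exp only ever sees non-positive arguments
    out = np.empty_like(x, dtype=np.float64)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out
```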
