Commit abec388

Merge pull request #252367 from IngridAtMicrosoft/objects
Object detection
2 parents 2a1c3d2 + 42dcc98 commit abec388

6 files changed: +217 -1 lines changed

Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
---
title: Azure AI Video Indexer object detection overview
description: An introduction to object detection in Azure AI Video Indexer.
ms.service: azure-video-indexer
ms.date: 09/26/2023
ms.topic: article
ms.author: inhenkel
author: IngridAtMicrosoft
---

# Azure AI Video Indexer object detection

Azure AI Video Indexer can detect objects in videos. The insight is part of all standard and advanced presets.

## Prerequisites

Review the [transparency note overview](/legal/azure-video-indexer/transparency-note?context=/azure/azure-video-indexer/context/context).

## JSON keys and definitions

| **Key** | **Definition** |
| --- | --- |
| id | Incremental ID number of the detected object in the media file |
| type | Type of object, for example, Car |
| thumbnailId | GUID representing a single detection of the object |
| displayName | Name to be displayed in the VI portal experience |
| wikiDataId | A unique identifier in the WikiData structure |
| instances | List of all instances that were tracked |
| confidence | A score between 0 and 1 indicating the object detection confidence |
| adjustedStart | Adjusted start time of the video when using the editor |
| adjustedEnd | Adjusted end time of the video when using the editor |
| start | The time that the object appears in the frame |
| end | The time that the object no longer appears in the frame |
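
The time-related keys hold strings rather than numbers. As a minimal sketch, assuming every timestamp follows the `h:mm:ss` or `h:mm:ss.ff` pattern (an assumption, not a documented guarantee), the hypothetical helpers `parse_timestamp` and `instance_duration_seconds` convert those strings into values you can do arithmetic on:

```python
from datetime import timedelta


def parse_timestamp(value: str) -> timedelta:
    """Convert an 'h:mm:ss' or 'h:mm:ss.ff' string into a timedelta.

    Assumption: the string always has exactly three colon-separated parts.
    """
    hours, minutes, seconds = value.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def instance_duration_seconds(instance: dict) -> float:
    """Return how long one tracked instance stays in the frame, in seconds."""
    start = parse_timestamp(instance["start"])
    end = parse_timestamp(instance["end"])
    return (end - start).total_seconds()
```

Using `timedelta` rather than manual arithmetic keeps fractional seconds and hour rollover handling in the standard library.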

## JSON response

Object detection is included in the insights that are the result of an [Upload](https://api-portal.videoindexer.ai/api-details#api=Operations&operation=Upload-Video) request.

### Detected and tracked objects

Detected and tracked objects appear under `detectedObjects` in the downloaded *insights.json* file. Every time a unique object is detected, it's given an ID. That object is also tracked, meaning that the model watches for the detected object to return to the frame. If it does, another instance is added to the instances for the object with different start and end times.

In this example, the first car was detected and given an ID of 1 because it was also the first object detected. Then, a different car was detected and given the ID of 23 because it was the 23rd object detected. Later, the first car appeared again and another instance was added to the JSON. Here is the resulting JSON:

```json
{
  "detectedObjects": [
    {
      "id": 1,
      "type": "Car",
      "thumbnailId": "1c0b9fbb-6e05-42e3-96c1-abe2cd48t33",
      "displayName": "car",
      "wikiDataId": "Q1420",
      "instances": [
        {
          "confidence": 0.468,
          "adjustedStart": "0:00:00",
          "adjustedEnd": "0:00:02.44",
          "start": "0:00:00",
          "end": "0:00:02.44"
        },
        {
          "confidence": 0.53,
          "adjustedStart": "0:03:00",
          "adjustedEnd": "0:00:03.55",
          "start": "0:03:00",
          "end": "0:00:03.55"
        }
      ]
    },
    {
      "id": 23,
      "type": "Car",
      "thumbnailId": "1c0b9fbb-6e05-42e3-96c1-abe2cd48t34",
      "displayName": "car",
      "wikiDataId": "Q1420",
      "instances": [
        {
          "confidence": 0.427,
          "adjustedStart": "0:00:00",
          "adjustedEnd": "0:00:14.24",
          "start": "0:00:00",
          "end": "0:00:14.24"
        }
      ]
    }
  ]
}
```
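
To work with this structure in code, you can count how many tracked instances each object has. The following is a minimal sketch, not an official client: `SAMPLE` is an abbreviated, hypothetical payload in the same shape as the example above, and `summarize_objects` and its `min_confidence` parameter are illustrative names.

```python
import json

# Abbreviated, hypothetical sample in the same shape as the example above.
SAMPLE = """
{
  "detectedObjects": [
    {"id": 1, "type": "Car", "displayName": "car",
     "instances": [
       {"confidence": 0.468, "start": "0:00:00", "end": "0:00:02.44"},
       {"confidence": 0.53, "start": "0:00:03", "end": "0:00:03.55"}
     ]},
    {"id": 23, "type": "Car", "displayName": "car",
     "instances": [
       {"confidence": 0.427, "start": "0:00:00", "end": "0:00:14.24"}
     ]}
  ]
}
"""


def summarize_objects(insights: dict, min_confidence: float = 0.0) -> list[dict]:
    """Return one summary row per detected object.

    Each row holds the object ID, its type, and the number of tracked
    instances whose confidence is at least min_confidence.
    """
    rows = []
    for obj in insights.get("detectedObjects", []):
        kept = [i for i in obj.get("instances", []) if i["confidence"] >= min_confidence]
        rows.append({"id": obj["id"], "type": obj["type"], "appearances": len(kept)})
    # Objects with the most appearances come first.
    rows.sort(key=lambda row: row["appearances"], reverse=True)
    return rows


if __name__ == "__main__":
    for row in summarize_objects(json.loads(SAMPLE)):
        print(row)
```

Raising `min_confidence` drops low-confidence instances before counting, which can be useful when an object is only briefly or partially visible.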

## Try object detection

You can try out object detection with the web portal or with the API.

## [Web Portal](#tab/webportal)

After you upload a video, you can view its insights. The **Insights** tab lists the detected objects and their main instances.

### Insights
Select the **Insights** tab. The objects are listed in descending order of the number of appearances in the video.

:::image type="content" source="media/object-detection/insights-tab.png" alt-text="Screenshot of the interface of the insights tab":::

### Timeline
Select the **Timeline** tab.

:::image type="content" source="media/object-detection/timeline-tab.png" alt-text="Screenshot of the interface of the timeline tab":::

On the **Timeline** tab, all detected objects are displayed according to their time of appearance. When you hover over a specific detection, its detection confidence is shown as a percentage.

### Player

The player automatically marks each detected object with a bounding box. The object selected in the insights pane is highlighted in blue, with the object's type and serial number also displayed.

To filter the bounding boxes around objects, select the bounding box icon on the player.

:::image type="content" source="media/object-detection/object-filtering-icon.png" alt-text="Screenshot of object filtering icon player interface":::

Then, select or deselect the checkboxes for the detected objects.

:::image type="content" source="media/object-detection/object-filtering.png" alt-text="Screenshot of object filtering detected objects in the player interface":::

Download the insights by selecting **Download** and then **Insights (JSON)**.

## [API](#tab/api)

When you use the [Upload](https://api-portal.videoindexer.ai/api-details#api=Operations&operation=Upload-Video) request with the standard or advanced video presets, object detection is included in the indexing.

To examine object detection more thoroughly, use [Get Video Index](https://api-portal.videoindexer.ai/api-details#api=Operations&operation=Get-Video-Index).
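
As a rough sketch of what a Get Video Index call might look like from Python, the following uses only the standard library. The base URL, the path layout, the `accessToken` query parameter, and the `videos[0].insights.detectedObjects` location are assumptions here; confirm them against the API portal before relying on them. `build_index_url` and `fetch_detected_objects` are hypothetical helper names.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed base URL; confirm it in the API portal.
API_ROOT = "https://api.videoindexer.ai"


def build_index_url(location: str, account_id: str, video_id: str, access_token: str) -> str:
    """Build a Get Video Index request URL from caller-supplied identifiers."""
    parts = (location, "Accounts", account_id, "Videos", video_id, "Index")
    path = "/".join(quote(part, safe="") for part in parts)
    query = urlencode({"accessToken": access_token})
    return f"{API_ROOT}/{path}?{query}"


def fetch_detected_objects(url: str) -> list:
    """Download the index and return the detected objects of the first video.

    Assumption: the objects are under videos[0].insights.detectedObjects.
    """
    with urlopen(url) as response:
        index = json.load(response)
    return index["videos"][0]["insights"].get("detectedObjects", [])
```

Keeping URL construction separate from the network call makes the request easy to inspect or log before it's sent.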

---

## Supported objects

:::row:::
:::column:::
- airplane
- apple
- backpack
- banana
- baseball bat
- baseball glove
- bed
- bicycle
- bottle
- bowl
- broccoli
- bus
- cake
:::column-end:::
:::column:::
- car
- carrot
- cell phone
- chair
- clock
- computer mouse
- couch
- cup
- dining table
- donut
- fire hydrant
- fork
- frisbee
:::column-end:::
:::column:::
- handbag
- hot dog
- kite
- knife
- laptop
- microwave
- motorcycle
- necktie
- orange
- oven
- parking meter
- pizza
- potted plant
:::column-end:::
:::column:::
- refrigerator
- remote
- sandwich
- scissors
- skateboard
- skis
- snowboard
- spoon
- sports ball
- suitcase
- surfboard
- teddy bear
- television
:::column-end:::
:::column:::
- tennis racket
- toaster
- toilet
- toothbrush
- traffic light
- train
- umbrella
- vase
- wine glass
:::column-end:::
:::row-end:::

## Limitations

- Up to 20 detections per frame for standard and advanced processing, and up to 35 tracks per class.
- The video area shouldn't exceed 1920 x 1080 pixels.
- Object size shouldn't be greater than 90 percent of the frame.
- A high frame rate (> 30 FPS) may result in slower indexing, with little added value to the quality of the detection and tracking.
- Other factors that may affect the accuracy of the object detection include low light conditions, camera motion, and occlusion.
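
Some of these limits can be checked before you upload a video. The following is a minimal sketch under two assumptions: that "video area" means width multiplied by height, and that object size is measured as the bounding box area relative to the frame area. `check_video` and `object_too_large` are hypothetical helpers, not part of any Video Indexer SDK.

```python
MAX_FRAME_AREA = 1920 * 1080   # Video area limit from the list above.
MAX_USEFUL_FPS = 30            # Higher frame rates may slow indexing.
MAX_OBJECT_FRACTION = 0.90     # Object size limit relative to the frame.


def check_video(width: int, height: int, fps: float) -> list[str]:
    """Return human-readable warnings for a video's basic properties."""
    warnings = []
    if width * height > MAX_FRAME_AREA:
        warnings.append(f"Frame area {width}x{height} exceeds 1920x1080.")
    if fps > MAX_USEFUL_FPS:
        warnings.append(f"Frame rate {fps} FPS is above 30; indexing may be slower.")
    return warnings


def object_too_large(box_width: int, box_height: int,
                     frame_width: int, frame_height: int) -> bool:
    """Return True if a bounding box covers more than 90 percent of the frame area."""
    return box_width * box_height > MAX_OBJECT_FRACTION * frame_width * frame_height
```

An empty list from `check_video` means that neither limit was exceeded; it says nothing about lighting, camera motion, or occlusion, which can't be checked from metadata alone.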

articles/azure-video-indexer/toc.yml

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
items:
12
- name: Azure AI Video Indexer documentation
23
href: ./index.yml
34
- name: Overview
@@ -60,7 +61,7 @@
6061
- name: Using at scale
6162
href: considerations-when-use-at-scale.md
6263
displayName: 30-GB upload size limitation, six best practices, world wide web, serverless event-driven platform, Azure AI Video Indexer, videoUrl optional parameter, upload video API, HTTP 429 response code, byte array file, callBack URL definition, Media Reserved Units, right indexing parameters, file size limitations, Azure description, Azure Functions, upload speed, videoUrl field, HTTP response, code samples, right parameters, upload request, efficient way, high dependency, service reliability, First consideration, Operations&operation, reliable way, Automatic Scaling, price optimization, price reduction, many cases, next retry, POST notification, use case, different parameters, media file, next request, technological constraints, excess money, business needs, many requests, status change, indexing process, storage account, title, Things, scale, topic, videos, archive, article, questions, smart, choice, information, considerations, 2 GB, issues, performance, files, multi-part, network, connectivity, packets, image, first-consideration, path, location, care, rest, api-portal, videoindexer, details, TIP, example, fast, content, August, media-services, concept, MRUs, AMS, result, Respect, system, capabilities, integration, batch, movies, fact, minute, header, throttling, github, master, flow, decisions, uploading
63-
- name: Insights
64+
- name: Insights overview
6465
displayName: optical character recognition elements, Azure AI Video Indexer account, Azure AI Video Indexer OCR, Azure AI Video Indexer insights, specific predefined text, Detect textual logo, timestamped transcript lines, Textual logo detection, Audio effects detection, insights output file, detailed JSON output, 30+ AI models, Azure portal, account types, audio content, audio cues, video insights, entire video, rich insights, aggregated view, time ranges, common insights, frame(s, representative frames, aesthetic properties, easier consumption, Faces detection, different appearances, Microsoft" logo, sentimentType` field, single audio-file, other insights, Insights tab, right-top corner, response status, response content, Operations&operation, JSON content, insight type, one insight, brief overview, Named entities, Sixteen speakers, Topics inference, title, description, article, data, transcripts, OCRs, emotions, details, example, instances, information, audio-effects-detection-transparency-note, Scenes, shots, keyframes, contrast, stableness, navigation, speech, azure-video-indexer, face-detection-transparency, user, word, Labels, entities-transparency, People, ocr-transparency-note, Sentiments, Transcription, translation, language, features, website, videoindexer, Screenshot, indexer-output, Press, artifacts, API
6566
href: insights-overview.md
6667
items:
@@ -84,6 +85,8 @@
8485
href: topics-inference.md
8586
- name: Text-based emotion detection
8687
href: emotions-detection.md
88+
- name: Object detection
89+
href: object-detection.md
8790
- name: Customizing content models
8891
displayName: customize, customizing
8992
items:
