articles/ai-services/computer-vision/faq.yml (2 additions & 2 deletions)
@@ -9,7 +9,7 @@ metadata:
 
   ms.service: azure-ai-vision
   ms.topic: faq
-  ms.date: 05/09/2022
+  ms.date: 02/27/2024
   ms.author: pafarley
   ms.custom: cogserv-non-critical-vision
   title: Azure AI Vision API Frequently Asked Questions
@@ -52,4 +52,4 @@ sections:
   - question: |
       Can I train Azure AI Vision API to use custom tags? For example, I would like to feed in pictures of cat breeds to 'train' the AI, then receive the breed value on an AI request.
     answer: |
-      This function is currently not available. You can use [Custom Vision](../custom-vision-service/overview.md) to train a model to detect user-defined visual features.
+      Yes. See [Model customization](/azure/ai-services/computer-vision/concept-model-customization), a feature of Image Analysis 4.0.
-This article demonstrates how to perform near real-time analysis on frames that are taken from a live video stream by using the Azure AI Vision API. The basic elements of such an analysis are:
+This article demonstrates how to use the Azure AI Vision API to perform near real-time analysis on frames that are taken from a live video stream. The basic elements of such an analysis are:
 
-- Acquiring frames from a video source.
-- Selecting which frames to analyze.
-- Submitting these frames to the API.
-- Consuming each analysis result that's returned from the API call.
+- Acquiring frames from a video source
+- Selecting which frames to analyze
+- Submitting these frames to the API
+- Consuming each analysis result that's returned from the API call
 
-The samples in this article are written in C#. To access the code, go to the [Video frame analysis sample](https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis/) page on GitHub.
+> [!TIP]
+> The samples in this article are written in C#. To access the code, go to the [Video frame analysis sample](https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis/) page on GitHub.
## Approaches to running near real-time analysis
-You can solve the problem of running near real-time analysis on video streams by using a variety of approaches. This article outlines three of them, in increasing levels of sophistication.
+You can solve the problem of running near real-time analysis on video streams using a variety of approaches. This article outlines three of them, in increasing levels of sophistication.
-### Design an infinite loop
+### Method 1: Design an infinite loop
-The simplest design for near real-time analysis is an infinite loop. In each iteration of this loop, you grab a frame, analyze it, and then consume the result:
+The simplest design for near real-time analysis is an infinite loop. In each iteration of this loop, the application retrieves a frame, analyzes it, and then processes the result:
 ```csharp
 while (true)
@@ -46,7 +47,7 @@ while (true)
If your analysis were to consist of a lightweight, client-side algorithm, this approach would be suitable. However, when the analysis occurs in the cloud, the resulting latency means that an API call might take several seconds. During this time, you're not capturing images, and your thread is essentially doing nothing. Your maximum frame rate is limited by the latency of the API calls.
-### Allow the API calls to run in parallel
+### Method 2: Allow the API calls to run in parallel
Although a simple, single-threaded loop makes sense for a lightweight, client-side algorithm, it doesn't fit well with the latency of a cloud API call. The solution to this problem is to allow the long-running API call to run in parallel with the frame-grabbing. In C#, you could do this by using task-based parallelism. For example, you can run the following code:
@@ -70,9 +71,9 @@ With this approach, you launch each analysis in a separate task. The task can ru
 * Finally, this simple code doesn't keep track of the tasks that get created, so exceptions silently disappear. Thus, you need to add a "consumer" thread that tracks the analysis tasks, raises exceptions, kills long-running tasks, and ensures that the results get consumed in the correct order, one at a time.
-### Design a producer-consumer system
+### Method 3: Design a producer-consumer system
-For your final approach, designing a "producer-consumer" system, you build a producer thread that looks similar to your previously mentioned infinite loop. However, instead of consuming the analysis results as soon as they're available, the producer simply places the tasks in a queue to keep track of them.
+To design a "producer-consumer" system, you build a producer thread that looks similar to the previous section's infinite loop. Then, instead of consuming the analysis results as soon as they're available, the producer simply places the tasks in a queue to keep track of them.
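A minimal sketch of this producer-consumer pattern follows. It is an illustration only, not the sample repo's actual code: `AnalyzeAsync` is a stub standing in for the cloud analysis call, and the type and method names are placeholders. The key idea is that the producer enqueues in-flight tasks without awaiting them, while the consumer awaits each dequeued task in order.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // Stub standing in for the Azure AI Vision call (an assumption for this
    // sketch): simulates cloud latency, then returns a result string.
    static async Task<string> AnalyzeAsync(int frameId)
    {
        await Task.Delay(50);
        return $"result for frame {frameId}";
    }

    public static async Task<List<string>> RunPipeline(int frameCount)
    {
        // The queue holds in-flight analysis tasks, in the order frames were grabbed.
        var queue = new BlockingCollection<Task<string>>(boundedCapacity: 4);

        // Producer: enqueue each analysis task without awaiting it, so frame
        // grabbing is never blocked by cloud latency.
        var producer = Task.Run(() =>
        {
            for (int frameId = 0; frameId < frameCount; frameId++)
            {
                queue.Add(AnalyzeAsync(frameId));
            }
            queue.CompleteAdding();
        });

        // Consumer: dequeue tasks in order and await each one, so results are
        // consumed one at a time, in the order the frames were produced.
        var results = new List<string>();
        foreach (var analysisTask in queue.GetConsumingEnumerable())
        {
            results.Add(await analysisTask);
        }
        await producer;
        return results;
    }

    public static async Task Main()
    {
        foreach (var r in await RunPipeline(8))
        {
            Console.WriteLine(r);
        }
    }
}
```

The bounded capacity on the queue also gives you backpressure: if analysis falls too far behind, the producer blocks instead of accumulating unbounded in-flight calls.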
-To help get your app up and running as quickly as possible, we've implemented the system that's described in the preceding section. It's intended to be flexible enough to accommodate many scenarios, while being easy to use. To access the code, go to the [Video frame analysis sample](https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis/) page on GitHub.
+To help get your app up and running as quickly as possible, we've implemented the system that's described in the previous section. It's intended to be flexible enough to accommodate many scenarios, while being easy to use. To access the code, go to the [Video frame analysis sample](https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis/) repo on GitHub.
-In most modes, there's a visible delay between the live video on the left and the visualized analysis on the right. This delay is the time that it takes to make the API call. An exception is in the "EmotionsWithClientFaceDetect" mode, which performs face detection locally on the client computer by using OpenCV before it submits any images to Azure AI services.
+In most modes, there's a visible delay between the live video on the left and the visualized analysis on the right. This delay is the time that it takes to make the API call. An exception is in the `EmotionsWithClientFaceDetect` mode, which performs face detection locally on the client computer by using OpenCV before it submits any images to Azure AI services.
articles/ai-services/computer-vision/how-to/call-read-api.md (6 additions & 6 deletions)
@@ -8,7 +8,7 @@ manager: nitinme
 
 ms.service: azure-ai-vision
 ms.topic: how-to
-ms.date: 11/03/2022
+ms.date: 02/27/2024
 ms.author: pafarley
 ---
14
14
@@ -20,7 +20,7 @@ In this guide, you'll learn how to call the v3.2 GA Read API to extract text fro
20
20
21
21
## Input requirements
22
22
23
-
The **Read** call takes images and documents as its input. They have the following requirements:
23
+
The **Read**API call takes images and documents as its input. They have the following requirements:
24
24
25
25
* Supported file formats: JPEG, PNG, BMP, PDF, and TIFF
26
26
* For PDF and TIFF files, up to 2000 pages (only the first two pages for the free tier) are processed.
@@ -57,7 +57,7 @@ By default, the service outputs the text lines in the left to right order. Optio
 
 :::image type="content" source="../Images/ocr-reading-order-example.png" alt-text="OCR Reading order example" border="true" :::
 
-### Select page(s) or page ranges for text extraction
+### Select page(s) or page range(s) for text extraction
 
 By default, the service extracts text from all pages in the documents. Optionally, use the `pages` request parameter to specify page numbers or page ranges to extract text from only those pages. The following example shows a document with 10 pages, with text extracted for both cases - all pages (1-10) and selected pages (3-6).
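As a rough sketch of how the `pages` parameter is passed, the following C# snippet submits a document to the v3.2 Read endpoint and restricts extraction to pages 3 through 6. The endpoint, key, and document URL are placeholders you must replace; the snippet is illustrative, not the docs' official sample.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ReadPagesExample
{
    public static async Task Main()
    {
        // Placeholder values: substitute your own resource endpoint, key,
        // and a publicly reachable document URL.
        string endpoint = "https://<your-resource-name>.cognitiveservices.azure.com";
        string key = "<your-key>";

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", key);

        // The pages query parameter restricts extraction to pages 3 through 6.
        string url = $"{endpoint}/vision/v3.2/read/analyze?pages=3-6";

        var body = new StringContent("{\"url\":\"https://example.com/sample.pdf\"}",
                                     Encoding.UTF8, "application/json");
        HttpResponseMessage response = await client.PostAsync(url, body);

        // A successful submission returns 202 Accepted with an Operation-Location
        // header containing the URL to poll for the result.
        Console.WriteLine(string.Join("", response.Headers.GetValues("Operation-Location")));
    }
}
```

Single pages (`pages=2`), ranges (`pages=3-6`), and comma-separated combinations can be supplied in the same parameter.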
@@ -106,7 +106,7 @@ You call this operation iteratively until it returns with the **succeeded** valu
 
 When the **status** field has the `succeeded` value, the JSON response contains the extracted text content from your image or document. The JSON response maintains the original line groupings of recognized words. It includes the extracted text lines and their bounding box coordinates. Each text line includes all extracted words with their coordinates and confidence scores.
 
 > [!NOTE]
-> The data submitted to the `Read` operation are temporarily encrypted and stored at rest for a short duration, and then deleted. This lets your applications retrieve the extracted text as part of the service response.
+> The data submitted to the **Read** operation are temporarily encrypted and stored at rest for a short duration, and then deleted. This lets your applications retrieve the extracted text as part of the service response.
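The iterative polling described above can be sketched as a small helper. The status values (`notStarted`, `running`, `succeeded`, `failed`) are the ones the v3.2 Read API documents, but `PollForResultAsync` itself is an illustrative helper written for this sketch, not part of any SDK.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class ReadPollingExample
{
    // Polls the Operation-Location URL returned by the submit step until the
    // operation leaves the "notStarted"/"running" states, then returns the JSON.
    public static async Task<string> PollForResultAsync(
        HttpClient client, string operationLocation)
    {
        while (true)
        {
            string json = await client.GetStringAsync(operationLocation);
            using JsonDocument doc = JsonDocument.Parse(json);
            string status = doc.RootElement.GetProperty("status").GetString();

            if (status != "notStarted" && status != "running")
            {
                // "succeeded" responses carry the analyzeResult payload;
                // "failed" responses carry error details.
                return json;
            }
            await Task.Delay(1000);   // wait between polls to avoid hammering the service
        }
    }
}
```

The `HttpClient` passed in is assumed to already carry the `Ocp-Apim-Subscription-Key` header from the submit step.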
 
 ### Sample JSON output
@@ -185,11 +185,11 @@ See the following example of a successful JSON response:
 
 ### Handwritten classification for text lines (Latin languages only)
 
-The response includes classifying whether each text line is of handwriting style or not, along with a confidence score. This feature is only supported for Latin languages. The following example shows the handwritten classification for the text in the image.
+The response includes a classification of whether each line of text is in handwritten style or not, along with a confidence score. This feature is only available for Latin languages. The following example shows the handwritten classification for the text in the image.
 - Get started with the [OCR (Read) REST API or client library quickstarts](../quickstarts-sdk/client-library.md).
-- Learn about the [Read 3.2 REST API](https://westus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-2/operations/5d986960601faab4bf452005).
+- [Read 3.2 REST API reference](https://westus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-2/operations/5d986960601faab4bf452005)