@@ -76,7 +75,7 @@ Available visual features are contained in the `VisualFeatures` enumeration:
76
75
- VisualFeatures.People: Returns the bounding box for detected people
77
76
- VisualFeatures.SmartCrops: Returns the bounding box of the specified aspect ratio for the area of interest
78
77
- VisualFeatures.Read: Extracts readable text
79
-
-
78
+
80
79
::: zone-end
81
80
82
81
Specifying the visual features you want analyzed in the image determines what information the response will contain. Most responses will contain a bounding box (if a location in the image is reasonable) or a confidence score (for features such as tags or captions).
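As a rough illustration of how the selected features shape a request, the following Python sketch builds an analyze URL from a list of feature names. The REST path, the `api-version` value, the lowercase feature names, and the `build_analyze_url` helper are assumptions made for this example and aren't taken from the module text, so check them against the current service reference before relying on them.

```python
from urllib.parse import urlencode

# Assumed mapping from VisualFeatures member names to REST feature names.
FEATURE_NAMES = {
    "Caption": "caption",
    "Tags": "tags",
    "People": "people",
    "SmartCrops": "smartCrops",
    "Read": "read",
}


def build_analyze_url(endpoint, features, api_version="2023-10-01"):
    """Build an analyze URL that requests only the listed features.

    Hypothetical helper: the path and api-version default are assumptions.
    """
    unknown = [name for name in features if name not in FEATURE_NAMES]
    if unknown:
        raise ValueError(f"Unknown visual features: {unknown}")
    query = urlencode(
        {
            "api-version": api_version,
            "features": ",".join(FEATURE_NAMES[name] for name in features),
        },
        safe=",",  # keep the comma-separated feature list readable
    )
    return f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"


# Placeholder endpoint; a real resource endpoint would replace it.
url = build_analyze_url("https://example.cognitiveservices.azure.com/", ["Read", "People"])
print(url)
```

Only the features named in the request should appear in the response, so asking for `Read` alone keeps the result limited to extracted text and its location.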
- content: "Which API would be best for this scenario? You need to read a large number of files with high accuracy. The text is short sections of handwritten text, some in English and some of it is in multiple languages."
21
-
choices:
22
-
- content: "A custom Language API"
23
-
isCorrect: false
24
-
explanation: "Incorrect: Azure AI Language custom models aren't able to perform OCR."
25
-
- content: "Document Intelligence API"
26
-
isCorrect: false
27
-
explanation: "Incorrect: Document Intelligence is the best choice for large amounts of structured text and multiple languages, however isn't the best choice for shorter, unstructured handwritten text."
28
-
- content: "Image Analysis API"
29
-
isCorrect: true
30
-
explanation: "Correct: The Image Analysis service OCR feature is best suited for short sections of handwritten text."
31
-
- content: "What levels of division are the OCR results returned?"
32
-
choices:
33
-
- content: "Only total content and pages of text."
34
-
isCorrect: false
35
-
explanation: "Incorrect: Results contain blocks, words and lines, as well as bounding boxes for each word and line."
36
-
- content: "Blocks, words and lines of text."
37
-
isCorrect: true
38
-
explanation: "Correct: Results contain blocks, words and lines, as well as bounding boxes for each word and line."
39
-
- content: "Total content, image tags, pages, words and lines of text."
40
-
isCorrect: false
41
-
explanation: "Incorrect: Results contain blocks, words and lines, as well as bounding boxes for each word and line."
42
-
- content: "You've scanned a letter into PDF format and need to extract the text it contains. What should you do?"
43
-
choices:
44
-
- content: "Use the Azure AI Custom Vision service"
45
-
isCorrect: false
46
-
explanation: "Incorrect: The Azure AI Custom Vision service is used to build and deploy image identification applications by applying labels to classes or objects."
47
-
- content: "Use the Image Analysis API of the Azure AI Vision service."
48
-
isCorrect: false
49
-
explanation: "Incorrect: The Image Analysis API isn't well suited to process PDF formatted files."
50
-
- content: "Use the Document Intelligence API."
51
-
isCorrect: true
52
-
explanation: "Correct: The Document Intelligence API can be used to process PDF formatted files."
- content: "Which service should you use to locate and read text in signs within a photograph of a street?"
17
+
choices:
18
+
- content: "Azure AI Language"
19
+
isCorrect: false
20
+
explanation: "Incorrect: Azure AI Language isn't able to do OCR."
21
+
- content: "Azure AI Document Intelligence"
22
+
isCorrect: false
23
+
explanation: "Incorrect: Azure AI Document Intelligence is designed to extract text from documents and forms."
24
+
- content: "Azure AI Vision"
25
+
isCorrect: true
26
+
explanation: "Correct: The Image Analysis feature on Azure AI Vision includes OCR capabilities that can extract text from images."
27
+
- content: "Which visual feature enumeration should you use to return OCR results from an image analysis call?"
28
+
choices:
29
+
- content: "VisualFeatures.Caption"
30
+
isCorrect: false
31
+
explanation: "Incorrect: The VisualFeatures.Caption enumeration returns a suggested caption for the image."
32
+
- content: "VisualFeatures.Read"
33
+
isCorrect: true
34
+
explanation: "Correct: The VisualFeatures.Read enumeration returns text and its location in the image."
35
+
- content: "VisualFeatures.Tags"
36
+
isCorrect: false
37
+
explanation: "Incorrect: The VisualFeatures.Tags enumeration returns suggested tags to help categorize the image."
38
+
- content: "At which levels does the Azure AI Vision image analysis API return text location information for an image?"
39
+
choices:
40
+
- content: "The location of individual *words* only."
41
+
isCorrect: false
42
+
explanation: "Incorrect: The location and text of individual words are returned, but that's not the only level."
43
+
- content: "A single *block* containing all of the text in the image."
44
+
isCorrect: false
45
+
explanation: "Incorrect: A single block is returned, but it includes smaller location areas for the text detected in the image."
46
+
- content: "A *block* containing the location of *lines* of text as well as individual *words*."
47
+
isCorrect: true
48
+
explanation: "Correct: The image analysis OCR results include a block in which each line of text is located, and within each line the location of each word is returned."
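The block, line, and word hierarchy described in this answer can be walked with a few nested loops. The following Python sketch assumes a JSON response shaped like `readResult` → `blocks` → `lines` → `words`, with a `boundingPolygon` on each line and word. Those key names, the `iter_text_locations` helper, and the sample values are assumptions for illustration, not output captured from the service.

```python
from typing import Iterator


def iter_text_locations(analysis: dict) -> Iterator[tuple]:
    """Yield (level, text, polygon) for each line and each word it contains."""
    read_result = analysis.get("readResult", {})
    for block in read_result.get("blocks", []):
        for line in block.get("lines", []):
            yield ("line", line["text"], line.get("boundingPolygon", []))
            for word in line.get("words", []):
                yield ("word", word["text"], word.get("boundingPolygon", []))


# Made-up sample in the assumed response shape.
sample_response = {
    "readResult": {
        "blocks": [
            {
                "lines": [
                    {
                        "text": "STOP AHEAD",
                        "boundingPolygon": [
                            {"x": 10, "y": 10}, {"x": 200, "y": 10},
                            {"x": 200, "y": 50}, {"x": 10, "y": 50},
                        ],
                        "words": [
                            {
                                "text": "STOP",
                                "boundingPolygon": [
                                    {"x": 10, "y": 10}, {"x": 90, "y": 10},
                                    {"x": 90, "y": 50}, {"x": 10, "y": 50},
                                ],
                                "confidence": 0.99,
                            },
                            {
                                "text": "AHEAD",
                                "boundingPolygon": [
                                    {"x": 100, "y": 10}, {"x": 200, "y": 10},
                                    {"x": 200, "y": 50}, {"x": 100, "y": 50},
                                ],
                                "confidence": 0.98,
                            },
                        ],
                    }
                ]
            }
        ]
    }
}

locations = list(iter_text_locations(sample_response))
for level, text, polygon in locations:
    print(level, text, polygon[0] if polygon else None)
```

Each line is yielded before its words, which mirrors the nesting in the answer: a block holds lines, and each line holds the words found within it.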
Suppose you are given thousands of images and asked to transfer the text on the images to a computer database. The scanned images have text organized in different formats and contain multiple languages. What are some ways you could complete the project in a reasonable time frame and make sure the data is entered with a high degree of accuracy?
1
+
Suppose you're given thousands of images and asked to transfer the text on the images to a computer database. The scanned images have text organized in different formats and contain multiple languages. What are some ways you could complete the project in a reasonable time frame and make sure the data is entered with a high degree of accuracy?
2
2
3
3
Companies around the world are tackling similar scenarios every day. Without AI services, it would be challenging to complete the project, especially if it were to change in scale.
4
4
5
-
Using AI services, we can treat this project as an Azure AI Vision scenario and apply Optical Character Recognition (OCR). OCR allows you to extract text from images, such as photos of street signs and products, as well as from documents — such as handwritten or unstructured documents.
6
-
7
-
To build an automated AI solution, you need to train machine learning models to cover many use cases. Azure AI Vision service gives access to advanced algorithms for processing images and returns data to secure storage.
8
-
9
-
In this module, you'll learn how to:
10
-
11
-
- Identify how the Azure AI Vision service enables you to read text from images
12
-
- Use the Azure AI Vision service with SDKs and the REST API
13
-
- Develop an application that can read printed and handwritten text
5
+
Using AI services, we can treat this project as an Azure AI Vision scenario and apply Optical Character Recognition (OCR). OCR allows you to extract text from images, such as photos of street signs and products, and from documents, such as handwritten or unstructured documents.
Azure AI provides two different features that read text from documents and images, one in the Azure AI Vision Service, the other in Azure AI Document Intelligence. There is overlap in what each service provides, however each is optimized for results depending on what the input is.
2
-
3
-
-**Image Analysis** Optical character recognition (OCR):
4
-
- Use this feature for general, unstructured documents with smaller amount of text, or images that contain text.
5
-
- Results are returned immediately (synchronous) from a single API call.
6
-
- Has functionality for analyzing images past extracting text, including object detection, describing or categorizing an image, generating smart-cropped thumbnails and more.
7
-
- Examples include: street signs, handwritten notes, and store signs.
8
-
-**Document Intelligence**:
9
-
- Use this service to read small to large volumes of text from images and PDF documents.
10
-
- This service uses context and structure of the document to improve accuracy.
11
-
- The initial function call returns an asynchronous operation ID, which must be used in a subsequent call to retrieve the results.
12
-
- Examples include: receipts, articles, and invoices.
13
-
14
-
You can access both technologies via the REST API or a client library. In this module, we'll focus on the OCR feature in **Image Analysis**. If you'd like to learn more about **Document Intelligence**, [reading this module](/training/modules/use-prebuilt-form-recognizer-models/?azure-portal=true) will provide a good introduction.
1
+
There are multiple Azure AI services that read text from documents and images. Each is optimized for particular kinds of input and for the specific requirements of your application.
2
+
3
+
## Azure AI Vision
4
+
5
+
Azure AI Vision includes an *image analysis* capability that supports *optical character recognition* (OCR). Consider using Azure AI Vision in the following scenarios:
6
+
7
+
- **Text location and extraction from scanned documents**: Azure AI Vision is a great solution for general, unstructured documents that have been scanned as images. For example, reading text in labels, menus, or business cards.
8
+
- **Finding and reading text in photographs**: Examples include photos that contain street signs and store names.
9
+
- **Digital asset management (DAM)**: Azure AI Vision includes functionality for analyzing images beyond extracting text, including object detection, describing or categorizing an image, generating smart-cropped thumbnails, and more. These capabilities make it a useful service when you need to catalog, index, or analyze large volumes of digital image-based content.
10
+
11
+
## Azure AI Document Intelligence
12
+
13
+
Azure AI Document Intelligence is a service that you can use to extract information from complex digital documents. Azure AI Document Intelligence is designed for extracting text, key-value pairs, tables, and structures from documents automatically and accurately. Key considerations for choosing Azure AI Document Intelligence include:
14
+
15
+
- **Form Processing**: Azure AI Document Intelligence is specifically designed to extract data from forms, invoices, receipts, and other structured documents.
16
+
- **Prebuilt Models**: Azure AI Document Intelligence provides prebuilt models for common document types to reduce complexity and integrate into workflows or applications.
17
+
- **Custom Models**: The ability to create custom models tailored to your specific documents makes Azure AI Document Intelligence a flexible solution that can be used in many business scenarios.
18
+
19
+
## Azure AI Content Understanding
20
+
21
+
Azure AI Content Understanding is a service that you can use to analyze and extract information from multiple kinds of content, including documents, images, audio streams, and video. It's suitable for:
22
+
23
+
- **Multimodal content extraction**: Extracting content and structured fields from documents, forms, audio, video, and images.
24
+
- **Custom content analysis scenarios**: Support for customizable analyzers enables you to extract specific content or fields tailored to business needs.
25
+
26
+
> [!NOTE]
27
+
> In the rest of this module, we'll focus on the OCR image analysis feature in **Azure AI Vision**. To learn more about Azure AI Document Intelligence and Azure AI Content Understanding, consider completing the following training modules:
28
+
>
29
+
> - [Plan an Azure AI Document Intelligence solution](/training/modules/plan-form-recognizer-solution/)
30
+
> - [Analyze content with Azure AI Content Understanding](/training/modules/analyze-content-ai/)