title: Azure AI Content Understanding classifier overview
2
+
title: Azure AI Content Understanding Classifier Overview
3
3
titleSuffix: Azure AI services
4
4
description: Learn about Azure AI Content Understanding classifier solutions.
5
5
author: PatrickFarley
@@ -16,71 +16,68 @@ ms.custom:
16
16
17
17
> [!IMPORTANT]
18
18
>
19
-
> * The classifier API is only available for documents with the `2025-05-01-preview` release.
20
-
> * Azure AI Content Understanding classifier is available in `2025-05-01-preview` release. Public preview releases provide early access to features that are in active development.
21
-
> * Features, approaches, and processes can change or have limited capabilities, before General Availability (GA).
22
-
> * For more information, *see* [**Supplemental Terms of Use for Microsoft Azure Previews**](https://azure.microsoft.com/support/legal/preview-supplemental-terms).
19
+
> The classifier API is available only for documents with the `2025-05-01-preview` release. The Azure AI Content Understanding classifier is available in the `2025-05-01-preview` release. Public preview releases provide early access to features that are in active development. Features, approaches, and processes can change or have limited capabilities before general availability. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms).
23
20
24
-
Azure AI Content Understanding classifier enables you to detect and identify documents you process within your application. Content Understanding classifier can perform classification of an input file as a whole, or identify multiple documents or multiple instances of a single document within an input file.
21
+
You can use the Azure AI Content Understanding classifier to detect and identify documents that you process within your application. The Content Understanding classifier can perform classification of an input file as a whole. The classifier can also identify multiple documents or multiple instances of a single document within an input file.
25
22
26
23
## Business use cases
27
24
28
-
Classifier can process complex documents in various formats and templates:
29
-
30
-
* **Invoices**: Categorize invoices from multiple vendors to process each category with a different Content Understanding analyzer if needed.
31
-
* **Tax documents**: Categorize multiple tax documents into different types of tax forms such as 1040, 1099, etc.
32
-
* **Contracts**: Long, unstructured contracts can now be categorized to streamline operations to understand different types of agreements and their specific legal implications.
25
+
The classifier can process complex documents in various formats and templates:
33
26
27
+
* **Invoices**: Categorize invoices from multiple vendors to process each category with a different Content Understanding analyzer, if needed.
28
+
* **Tax documents**: Categorize multiple tax documents into different types of tax forms, such as 1040 and 1099.
29
+
* **Contracts**: Categorize long, unstructured contracts to streamline operations to understand different types of agreements and their specific legal implications.
34
30
35
31
## Content Understanding classifier capabilities
36
32
37
-
Content Understanding classifier can analyze a single- or multi-file documents to identify if an input file can be classified into a category as defined. Here are the currently supported scenarios:
38
-
39
-
* A single file containing one document type, such as a loan application form.
40
-
* A single file containing multiple document types. For instance, a loan application package that contains a loan application form, payslip, and bank statement.
41
-
* A single file containing multiple instances of the same document. For instance, a collection of scanned invoices.
42
-
* By default, there's an `$OTHER` class as well, which we utilize for cases where any of the defined categories doesn't seem suitable.
33
+
The Content Understanding classifier can analyze single or multifile documents to identify if an input file can be classified into a category as defined. The following scenarios are supported:
43
34
35
+
* A single file that contains one document type, such as a loan application form.
36
+
* A single file that contains multiple document types. An example is a loan application package that contains a loan application form, pay slip, and bank statement.
37
+
* A single file that contains multiple instances of the same document. An example is a collection of scanned invoices.
38
+
* By default, an `$OTHER` class is used for cases where none of the defined categories seems suitable.
44
39
45
-
### How to use Content Understanding classifier
40
+
### Use the Content Understanding classifier
46
41
47
-
A Content Understanding classifier doesn't require any training dataset. Define up to 50 category name and description and create a classifier. By default, the entire file is treated as a single content object, meaning the file/object is associated to a single category.
42
+
A Content Understanding classifier doesn't require any training dataset. You can define up to 50 category names and descriptions and create a classifier. By default, the entire file is treated as a single content object, which means the file or object is associated with a single category.
48
43
49
-
However, when you have more than one document in a file, the classifier can identify the different document types contained within the input file with splitting capability. The classifier response contains the page ranges for each of the identified document types contained within a file. This response can include multiple instances of the same document type.
44
+
When you have more than one document in a file, the classifier can identify the different document types that are contained within the input file with splitting capability. The classifier response contains the page ranges for each of the identified document types that are contained within a file. This response can include multiple instances of the same document type.
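The page ranges described above can be grouped by category once a response is available. The following is a minimal sketch, not the service's SDK: the helper name `page_ranges_by_category` is hypothetical, and the field names `category`, `startPageNumber`, and `endPageNumber` are assumptions about the response shape that should be checked against the `2025-05-01-preview` REST reference.

```python
from collections import defaultdict


def page_ranges_by_category(segments):
    """Group (start, end) page ranges by category name.

    `segments` is a list of dicts. The keys used here are assumed for
    illustration and are not a confirmed response schema.
    """
    ranges = defaultdict(list)
    for segment in segments:
        page_range = (segment["startPageNumber"], segment["endPageNumber"])
        ranges[segment["category"]].append(page_range)
    return dict(ranges)


# Hypothetical classification of a seven-page loan package.
sample_segments = [
    {"category": "Loan application", "startPageNumber": 1, "endPageNumber": 3},
    {"category": "Pay slip", "startPageNumber": 4, "endPageNumber": 4},
    {"category": "Pay slip", "startPageNumber": 5, "endPageNumber": 5},
    {"category": "$OTHER", "startPageNumber": 6, "endPageNumber": 7},
]

grouped = page_ranges_by_category(sample_segments)
print(grouped)
```

Keeping a list of ranges per category preserves multiple instances of the same document type, such as the two pay slips in the sample.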
50
45
51
-
When you call the classifier, the `analyze` operation includes a `splitMode` property that gives you granular control over the splitting behavior. You can also specify the page numbers to analyze only certain pages of the input document.
46
+
When you call the classifier, the `analyze` operation includes a `splitMode` property that gives you granular control over the splitting behavior. You can also specify the page numbers to analyze only certain pages of the input document:
52
47
53
-
* To treat the entire input file as a single document for classification set the `splitMode` to `none`. When you do so, the service returns just one category for the entire input file.
54
-
* To classify each page of the input file, set the `splitMode` to `perPage`. The service attempts to classify each page as an individual document.
55
-
* Set the `splitMode` to `auto` and the service identifies the documents and associated page ranges.
48
+
* To treat the entire input file as a single document for classification, set `splitMode` to `none`. When you do so, the service returns one category for the entire input file.
49
+
* To classify each page of the input file, set `splitMode` to `perPage`. The service attempts to classify each page as an individual document.
50
+
* To identify the documents and associated page ranges, set `splitMode` to `auto`.
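The three `splitMode` values above can be sketched as a small builder for a classifier definition. This is a minimal sketch under stated assumptions: the function name `build_classifier_definition` is hypothetical, and the payload layout (a `categories` object keyed by category name, each with a `description`) is an assumption to verify against the `2025-05-01-preview` REST reference before use.

```python
import json

# Split modes named in the article.
VALID_SPLIT_MODES = {"none", "perPage", "auto"}
# The article states that a classifier can define up to 50 categories.
MAX_CATEGORIES = 50


def build_classifier_definition(categories, split_mode="auto"):
    """Build a classifier definition as a plain dict.

    `categories` maps a category name to its description. The payload
    layout is assumed for illustration, not a confirmed schema.
    """
    if split_mode not in VALID_SPLIT_MODES:
        raise ValueError(f"splitMode must be one of {sorted(VALID_SPLIT_MODES)}")
    if not 1 <= len(categories) <= MAX_CATEGORIES:
        raise ValueError(f"Define between 1 and {MAX_CATEGORIES} categories")
    return {
        "splitMode": split_mode,
        "categories": {
            name: {"description": description}
            for name, description in categories.items()
        },
    }


definition = build_classifier_definition(
    {
        "Loan application": "A form in which a borrower requests a loan.",
        "Pay slip": "An employer statement of wages for one pay period.",
        "Bank statement": "A periodic summary of account transactions.",
    },
    split_mode="auto",
)
print(json.dumps(definition, indent=2))
```

Validating `splitMode` and the category count locally catches mistakes before a request is sent to the service.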
56
51
57
52
### Optional analysis
58
53
59
-
For a complete end to end flow, you may link classifier categories with existing analyzers. For each content object classified to categories with linked analyzers, the service automatically invokes analysis on the content object using the corresponding analyzer. As an example, this linking can be used to create classifiers that identify and analyze only invoices from a PDF that may contain multiple types of forms in a document.
54
+
For a complete end-to-end flow, you can link classifier categories with existing analyzers. For each content object classified to categories with linked analyzers, the service automatically invokes analysis on the content object by using the corresponding analyzer.
60
55
61
-
* Set the `analyzerId` to an existing analyzer to route and perform field extraction from the classified documents or pages.
56
+
For example, you can use this linking to create classifiers that identify and analyze only invoices from a PDF that contains multiple types of forms in a document. Set `analyzerId` to an existing analyzer to route and perform field extraction from the classified documents or pages.
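Linking a category to an analyzer can be sketched as adding an `analyzerId` entry to that category's definition. This is a minimal sketch: the helper `link_analyzers` and the analyzer ID `my-invoice-analyzer` are hypothetical, and placing `analyzerId` next to `description` inside each category is an assumption about the payload layout.

```python
def link_analyzers(categories, analyzer_ids):
    """Return a copy of `categories` with `analyzerId` set where a link exists.

    `categories` maps a category name to a dict with a `description`.
    `analyzer_ids` maps a category name to an existing analyzer ID.
    """
    linked = {}
    for name, spec in categories.items():
        entry = dict(spec)  # copy so the caller's dict is not modified
        if name in analyzer_ids:
            entry["analyzerId"] = analyzer_ids[name]
        linked[name] = entry
    return linked


categories = {
    "Invoice": {"description": "A bill from a vendor that lists items and amounts due."},
    "Bank statement": {"description": "A periodic summary of account transactions."},
}

# Only invoices are routed to an analyzer; other categories are classified only.
linked = link_analyzers(categories, {"Invoice": "my-invoice-analyzer"})
print(linked)
```

Categories without an entry in `analyzer_ids` are left unlinked, which matches the invoices-only scenario described above.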
62
57
63
58
### Classifier limits
64
59
65
-
For information on supported input document formats and classifier limits, refer to our [Service quotas and limits](../service-limits.md#classifier) page.
66
-
60
+
For information on supported input document formats and classifier limits, see [Service quotas and limits](../service-limits.md#classifier).
67
61
68
62
### Best practices
69
63
70
-
To improve classification and splitting quality, it's important to give a good category name and description so the model can understand the categories with some context. For more information on category names and descriptions, *see* [Best practices](../concepts/best-practices.md#classifier-category-names-and-descriptions).
64
+
To improve classification and splitting quality, use a good category name and description so that the model can understand the categories with some context. For more information on category names and descriptions, see [Best practices](../concepts/best-practices.md#classifier-category-names-and-descriptions).
71
65
72
66
## Key benefits
73
67
74
-
* **Accuracy and reliability:** Ensure precise document classification, reducing errors and boosting efficiency.
75
-
* **Scalability:** Seamlessly scale out document processing to meet business demands.
76
-
* **Customizable:** Adapt document classifier to fit specific workflows.
68
+
* **Accuracy and reliability**: Ensure precise document classification to reduce errors and boost efficiency.
69
+
* **Scalability**: Scale out document processing to meet business demands.
70
+
* **Customizable**: Adapt the document classifier to fit specific workflows.
77
71
78
72
## Supported languages and regions
79
-
For a detailed list of supported languages and regions, visit our [Language and region support](../language-region-support.md) page.
73
+
74
+
For a list of supported languages and regions, see [Language and region support](../language-region-support.md).
80
75
81
76
## Data privacy and security
82
-
Developers using Content Understanding should review Microsoft's policies on customer data. For more information, visit our [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy) page.
83
77
84
-
## Next step
85
-
* Try processing your document content using Content Understanding in [Azure AI Foundry](https://aka.ms/cu-landing).
86
-
* Learn to analyze document content [**analyzer templates**](../quickstart/use-ai-foundry.md).
78
+
Developers who use Content Understanding should review Microsoft policies on customer data. For more information, see [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy).
79
+
80
+
## Related content
81
+
82
+
* Try processing your document content by using Content Understanding in [Azure AI Foundry](https://aka.ms/cu-landing).
83
+
* Learn to analyze document content by using [analyzer templates](../quickstart/use-ai-foundry.md).
articles/ai-services/content-understanding/document/elements.md
Lines changed: 10 additions & 13 deletions
@@ -20,7 +20,7 @@ Azure AI Content Understanding is available in preview. Public preview releases
20
20
21
21
## Overview
22
22
23
-
The document analysis capabilities in Azure AI Content Understanding help you transform unstructured document data into structured, machine-readable information. You can precisely identify and extract document elements while you preserve their structural relationships. Then you can build powerful document processing workflows for a wide range of applications.
23
+
The document analysis capabilities in Azure AI Content Understanding help you transform unstructured document data into structured, machine-readable information. You can precisely identify and extract document elements while you preserve their structural relationships. Then you can build powerful document processing workflows for a wide range of applications.
24
24
25
25
This article explains the document analysis features that you can use to extract meaningful content from your documents, preserve document structures, and unlock the full potential of your document data.
26
26
@@ -42,12 +42,11 @@ You can extract the following document elements through content extraction:
42
42
* [Tables](#tables)
43
43
* [Sections](#sections)
44
44
45
-
> [!NOTE]
46
-
> Not all content and layout elements are applicable or currently supported by all document file types.
45
+
Not all content and layout elements are applicable or currently supported by all document file types.
47
46
48
47
### Markdown content elements
49
48
50
-
Content Understanding generates richly formatted markdown that preserves the original document's structure. For this reason, large language models can better comprehend document context and hierarchical relationships for AI-powered analysis and generation tasks. In addition to words, selection marks, barcodes, formulas, and images as content, the markdown also includes sections, tables, and page metadata for both visual rendering and machine processing. Learn more about how Content Understanding represents [content and layout elements in markdown](markdown.md).
49
+
Content Understanding generates richly formatted Markdown that preserves the original document's structure. For this reason, large language models can better comprehend document context and hierarchical relationships for AI-powered analysis and generation tasks. In addition to words, selection marks, barcodes, formulas, and images as content, the Markdown also includes sections, tables, and page metadata for both visual rendering and machine processing. Learn more about how Content Understanding represents [content and layout elements in Markdown](markdown.md).
51
50
52
51
#### Words
53
52
@@ -57,15 +56,15 @@ A *word* is a content element composed of a sequence of characters. [Unicode Sta
57
56
58
57
#### Selection marks
59
58
60
-
A *selection mark* is a content element that represents a visual glyph that indicates the state of a selection. Selection marks might appear in the document as checkboxes, check marks, or buttons. You can select or clear a selection mark, with different visual representation to indicate the state. Selection marks are encoded as words in the document analysis result by using `☒` (selected) and `☐` (cleared).
59
+
A *selection mark* is a content element that represents a visual glyph that indicates the state of a selection. Selection marks might appear in the document as checkboxes, check marks, or buttons. You can select or clear a selection mark, with different visual representation to indicate the state. Selection marks are encoded as words in the document analysis result by using the Unicode characters `☒` (selected) and `☐` (cleared).
61
60
62
61
Content Understanding detects check marks inside a table cell as selection marks in the selected state. It doesn't detect empty table cells as selection marks in the cleared state.
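Because selection marks are encoded as the Unicode characters `☒` (U+2612) and `☐` (U+2610), they can be counted directly in extracted text. The following is a minimal sketch that assumes the analysis result is already available as a string. The helper name `count_selection_marks` and the sample text are hypothetical.

```python
SELECTED = "\u2612"  # ☒ BALLOT BOX WITH X, used for the selected state
CLEARED = "\u2610"   # ☐ BALLOT BOX, used for the cleared state


def count_selection_marks(text):
    """Count selected and cleared selection marks in extracted text."""
    return {
        "selected": text.count(SELECTED),
        "cleared": text.count(CLEARED),
    }


# Hypothetical extracted content with three checkbox options.
sample_text = "☒ Approved\n☐ Rejected\n☐ Needs review"

counts = count_selection_marks(sample_text)
print(counts)  # {'selected': 1, 'cleared': 2}
```

Empty table cells don't appear in these counts because, as noted above, the service doesn't report them as cleared selection marks.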
63
62
64
63
:::image type="content" source="../media/document/selection-marks.png" alt-text="Screenshot that shows detected selection marks.":::
65
64
66
65
#### Barcodes
67
66
68
-
A *barcode* is a content element that describes both linear (for example, UPC or EAN) and two-dimensional (for example, `QR` or `MaxiCode`) barcodes. Content Understanding represents barcodes by using their detected types and extracted values. The following barcode formats are currently accepted:
67
+
A *barcode* is a content element that describes both linear (for example, UPC or EAN) and two-dimensional (for example, QR or MaxiCode) barcodes. Content Understanding represents barcodes by using their detected types and extracted values. The following barcode formats are currently accepted:
69
68
70
69
* QR Code
71
70
* Code 39
@@ -83,7 +82,7 @@ A *barcode* is a content element that describes both linear (for example, UPC or
83
82
84
83
#### Formulas
85
84
86
-
A *formula* is a content element that represents mathematical expressions in the document. It might be an `inline` formula embedded with other text or a `display` formula that takes up an entire line. Multiline formulas are represented as multiple `display` formula elements grouped into paragraphs to preserve mathematical relationships.
85
+
A *formula* is a content element that represents mathematical expressions in the document. It might be an inline formula embedded with other text or a display formula that takes up an entire line. Multiline formulas are represented as multiple display formula elements grouped into paragraphs to preserve mathematical relationships.
87
86
88
87
#### Images
89
88
@@ -95,10 +94,9 @@ Document *layout elements* are visual and structural components, such as pages,
95
94
96
95
#### Pages
97
96
98
-
A *page* is a grouping of content that typically corresponds to one side of a sheet of paper. A rendered page is characterized via `width` and `height` in the specified `unit`. In general, images use pixels while PDFs use inches. The `angle` property describes the overall text angle in degrees for pages that might be rotated.
97
+
A *page* is a grouping of content that typically corresponds to one side of a sheet of paper. A rendered page is characterized via width and height in the specified unit. In general, images use pixels while PDFs use inches. The `angle` property describes the overall text angle in degrees for pages that might be rotated.
99
98
100
-
> [!NOTE]
101
-
> For spreadsheets like Excel, each sheet is mapped to a page. For presentations, like PowerPoint, each slide is mapped to a page. For file formats like HTML or Word documents, which lack a native page concept without rendering, the entire main content is treated as a single page.
99
+
For spreadsheets like Excel, each sheet is mapped to a page. For presentations, like PowerPoint, each slide is mapped to a page. For file formats like HTML or Word documents, which lack a native page concept without rendering, the entire main content is treated as a single page.
102
100
103
101
#### Paragraphs
104
102
@@ -124,8 +122,7 @@ A table caption specifies content that explains the table. A table can also have
124
122
125
123
A table might span across consecutive pages of a document. In this situation, table continuations in subsequent pages generally maintain the same column count, width, and styling. They often repeat the column headers. Typically, no intervening content comes between the initial table and its continuations except for page headers, footers, and page numbers.
126
124
127
-
> [!NOTE]
128
-
> The span for tables covers only the core content and excludes associated caption and footnotes.
125
+
The span for tables covers only the core content and excludes associated captions and footnotes.
129
126
130
127
:::image type="content" source="../media/document/table.png" alt-text="Screenshot that shows a table by using the layout feature.":::
131
128
@@ -160,4 +157,4 @@ Page numbers are 1-indexed. The bounding polygon describes a sequence of points
160
157
* Try processing your document content by using Content Understanding in [Azure AI Foundry](https://aka.ms/cu-landing).
161
158
* Learn to analyze document content by using [analyzer templates](../quickstart/use-ai-foundry.md).
162
159
* Review code samples with [visual document search](https://github.com/Azure-Samples/azure-ai-search-with-content-understanding-python/blob/main/notebooks/search_with_visual_document.ipynb).