articles/ai-services/content-understanding/concepts/prebuilt-analyzers.md
+28 -69 (28 additions & 69 deletions)
@@ -12,92 +12,51 @@ ms.date: 05/19/2025

 # Prebuilt analyzers in Azure AI Content Understanding

-Azure AI Content Understanding prebuilt analyzers are ready-to-use solutions designed to streamline standard content processing tasks such as document ingestion, search indexing, and retrieval-augmented generation (`RAG`). Analyzers extract structured insights from unstructured content, including documents, images, audio, and video files. They also allow users to define custom settings for content extraction and specify field extraction schemas. Once configured, an analyzer applies these settings consistently to process all incoming data systematically.
+Azure AI Content Understanding prebuilt analyzers are ready-to-use tools designed to streamline common content processing tasks. They support scenarios such as content ingestion for search and retrieval-augmented generation (RAG) workflows, and intelligent document processing (IDP) for extracting data from invoices or analyzing call center recordings. You can also [customize these analyzers](../tutorial/create-custom-analyzer.md) to extract more fields or refine outputs to better fit your specific workflow requirements.

-Analyzers enhance trial processes, offering streamlined experiences and the flexibility to be tailored by extending their functionalities to suit unique workflow needs. Key features include:
+## Prebuilt analyzers for content ingestion

-* **[Content parsers](#content-parsers-for-search-and-ingestion)** for general search and ingestion scenarios.
-* **[Scenario-specific predefined analyzers](#scenario-specific-predefined-analyzers)** for targeted use cases like invoices or call center transcripts.
-* **[Inheritance from prebuilt analyzers](#inheritance-and-customizing-prebuilt-analyzers)** to customize configuration and fields.
+Azure AI Content Understanding offers prebuilt analyzers that extract raw content with layout as markdown and perform essential semantic analysis, simplifying common content ingestion tasks. These capabilities enhance retrieval quality for downstream applications such as retrieval-augmented generation (RAG).

-## Content parsers for search and ingestion
+##### `prebuilt-documentAnalyzer`

-To streamline common content ingestion scenarios, Azure AI Content Understanding offers general purpose **prebuilt content analyzers**. These analyzers extract text, layout, and metadata from various content types.
+* Extracts text and layout details from documents and images.
+* Produces a concise summary of the document content.
-|`prebuilt-documentAnalyzer`| Extracts text, layout, and metadata using `OCR` for images and rendered files. Users can customize prebuilt content analyzers to modify configuration and add/remove fields. |`.pdf`, `.tiff`, `image`, `.docx`, `.rtf`, `.html`, `.md`, `.json`, `.xml`, `.csv`, `.tsv`, and `.txt`|
-|`prebuilt-imageAnalyzer`| Generates a descriptive caption of an image; `OCR` is conceptually disabled. Users can refine the description and/or add new fields by creating an analyzer with `baseAnalyzerId=prebuilt-imageAnalyzer`. | image |
-|`prebuilt-audioAnalyzer`| Produces a transcript, speaker diarization, and a summary for audio files. Users can add new fields by creating an analyzer with `baseAnalyzerId=prebuilt-audioAnalyzer`. | audio |
-|`prebuilt-videoAnalyzer`| Extracts keyframes, transcript, and video segmentation. Segmentation is enabled by default. Users can disable or customize segmentation by creating an analyzer with `baseAnalyzerId=prebuilt-videoAnalyzer` and changing the `segmentationMode` property. | video |
+* Generates a descriptive caption for the image.

-Analyzers are optimized for `RAG` ingestion and search workflows, offering default behaviors suitable for indexing and summarizing large volumes of content.
+##### `prebuilt-audioAnalyzer`

-> [!NOTE]
->
-> * Currently, `OCR` is supported for `.pdf` and `.tiff` image files. Content elements from such files include span properties and bounding boxes via their source properties.
-> * For unsupported files, contents are extracted digitally. Content elements from these files include span properties to indicate their position in the returned markdown.
-> * There are no prebuilt models for `agentic` mode. Instead, users can create an analyzer with mode=pro starting from any document base analyzer to test out `agentic` behavior.
+* Extracts transcripts from audio files.
+* Performs speaker diarization to distinguish among different speakers.
+* Provides a summary of the audio content.

-## Scenario-specific predefined analyzers
+##### `prebuilt-videoAnalyzer`

-In addition to general content analyzers, Azure AI Content Understanding provides **prebuilt analyzers for specific business scenarios**. They can be further customized by setting them as the `baseAnalyzerId`:
+* Extracts transcripts from video files.
+* Identifies keyframes and camera shots.
+* Divides/segments the video into meaningful sections.
-|`prebuilt-callCenter`| Extracts summary, sentiment, topics, and insights from call center transcripts. | audio |
-|`prebuilt-invoice`| Extracts structured fields such as InvoiceId, Date, and Vendor from invoices. |`.pdf`, `.tiff`, and `image` files.|

-These analyzers bundle best practices and hidden configurations to deliver accurate extractions for their intended use cases while simplifying deployment by abstracting internal implementation details.
+## Prebuilt analyzers for intelligent document processing

+Content Understanding also includes prebuilt analyzers designed for specialized industry scenarios, enabling extraction of structured data from invoices and analysis of call center transcripts.

-## Inheritance and customizing prebuilt analyzers
+##### `prebuilt-invoice`

-With the **`2025-05-01-preview`**, any prebuilt analyzer can be inherited using `baseAnalyzerId` to create a custom analyzer. Inheritance allows for modification of existing fields, descriptions, types, and methods. Additionally, configuration settings such as `enableFormula`, `segmentationMode`, and others can be customized.
+* Extracts text and document layout as markdown from documents and images.
+* Extracts structured data from invoices, including invoice number, date, vendor, total amount, and line items. Supports various invoice formats and languages, enabling automated data capture for accounts payable processes and related scenarios.
-> With the `2025-05-01-preview`, modifying a field description overwrites the internal refined description, potentially reducing extraction quality.
-> The `baseAnalyzerId` must be a prebuilt analyzer. Custom analyzers can't currently inherit from other custom analyzers.
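The inheritance rule in the removed text (a custom analyzer's `baseAnalyzerId` must name a prebuilt analyzer) can be illustrated with a minimal sketch. Only `baseAnalyzerId` and the `prebuilt-*` analyzer names come from this diff. The helper name `build_custom_analyzer`, the `description` and `fieldSchema` layout, and the example `PurchaseOrder` field are assumptions made for illustration and should be checked against the service's API reference before use.

```python
# Prebuilt analyzer IDs named in this document.
PREBUILT_ANALYZERS = {
    "prebuilt-documentAnalyzer",
    "prebuilt-imageAnalyzer",
    "prebuilt-audioAnalyzer",
    "prebuilt-videoAnalyzer",
    "prebuilt-callCenter",
    "prebuilt-invoice",
}


def build_custom_analyzer(base_analyzer_id, description, fields):
    """Build a hypothetical analyzer definition that inherits from a prebuilt analyzer.

    The dictionary layout below is an assumed shape, not a confirmed API contract.
    """
    # Rule from the text: custom analyzers can't inherit from other custom analyzers.
    if base_analyzer_id not in PREBUILT_ANALYZERS:
        raise ValueError(
            f"baseAnalyzerId must be a prebuilt analyzer, got {base_analyzer_id!r}"
        )
    return {
        "description": description,
        "baseAnalyzerId": base_analyzer_id,
        "fieldSchema": {"fields": dict(fields)},  # assumed layout
    }


# Example: extend the invoice analyzer with one extra (hypothetical) field.
definition = build_custom_analyzer(
    "prebuilt-invoice",
    "Invoice analyzer with a purchase order field",
    {"PurchaseOrder": {"type": "string", "description": "Purchase order number"}},
)
```

Validating the base analyzer locally is a design convenience in this sketch: it surfaces the "prebuilt only" constraint before any request is sent.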
-
-## Analyzer details and configurations
-
-* **Document Analyzer**: Uses `OCR` for `.pdf`, `.tiff`, and `image` files.
-* **Image Analyzer**: Doesn't use `OCR` but generates image descriptions.
-* **Audio Analyzer**: Returns transcript and summary extraction.
-* **Video Analyzer**: Returns keyframes, transcript, and segmentation.
-* **Call Center Analyzer**: Summarizes and extracts insights from audio. Supports audio text.
-* **Invoice Analyzer**: Returns structured field extraction from invoices. Supports `.pdf`, `.tiff`, and `image` files.
-
-
-## Billing and limits
-
-* **Documents**: Billing is calculated per page, slide, or sheet. For `.docx`, `.rtf`, `.html`, `.md`, `.msg`, `.eml`, `.json`, `.xml`, `.csv`, `.tsv`, and `.txt`, we count every 3k UTF-16 characters as a page. Field extraction is billed at a fixed rate per 1k pages.
-* **Images**: There's no cost for image content extraction; however, generating a description invokes image field extraction charges.
-* **Audio/Video**: Billing is calculated on a per-hour basis with 1-minute granularity. Charges are calculated for both audio/video content extraction and field extraction.
-* Maximum field limit: Currently, there are 90 user-defined fields, with 100 total including reserved fields.
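The billing units described in the removed list can be made concrete with a short arithmetic sketch. It assumes that "3k" means 3,000 UTF-16 code units, that a partial page rounds up to a whole page, and that a partial minute of audio or video rounds up to a whole minute. None of these rounding rules is stated in the text, and the function names are hypothetical.

```python
import math

CHARS_PER_PAGE = 3000  # assumption: "3k" read as 3,000 UTF-16 code units


def utf16_units(text):
    """Count UTF-16 code units (characters outside the BMP count as two)."""
    return len(text.encode("utf-16-le")) // 2


def billable_text_pages(text):
    """Pages billed for text-based files; assumes partial pages round up."""
    return math.ceil(utf16_units(text) / CHARS_PER_PAGE)


def billable_av_minutes(duration_seconds):
    """Minutes billed for audio/video; assumes partial minutes round up."""
    return math.ceil(duration_seconds / 60)


def billable_av_hours(duration_seconds):
    """Hours billed, expressed at 1-minute granularity."""
    return billable_av_minutes(duration_seconds) / 60
```

Under these assumptions, a 3,001-character text file counts as two pages, and a 61-second recording is billed as two minutes.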
+* Extracts transcripts from audio files.
+* Distinguishes between speakers and assigns them to customer or agent roles.
+* Analyzes call center transcripts to generate summaries, determine customer sentiment, identify discussion topics, and more.

 ## Next steps

-* [Analyzer templates](analyzer-templates.md)
-
-
+* [Try out prebuilt analyzers using REST API](../quickstart/use-rest-api.md).