Commit a3ddeec

Merge pull request #1317 from laujan/5-content-understanding
5 content understanding
2 parents 1a38470 + 7acd380 commit a3ddeec

7 files changed

+100
-28
lines changed

articles/ai-services/content-understanding/faq.yml

Lines changed: 65 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -11,14 +11,74 @@ metadata:
1111
title: Frequently asked questions
1212
summary: |
1313
14-
Azure AI Content Understanding is a cloud-based data solution designed to process both structured and unstructured content across various modalities, including documents, images, videos, and audio.
15-
14+
Find answers to commonly asked questions about Azure AI Content Understanding
1615
sections:
1716
- name: Overview
1817
questions:
1918
- question: |
20-
Can I continue to use Document Intelligence v4.0 capabilities?
19+
What is Content Understanding?
20+
answer: |
21+
Content Understanding is a new Azure AI Service designed to generate structured insights from unstructured content using artificial intelligence. It provides consistent experience to extract content or a structured schema from audio, video, images, documents, or text inputs.
22+
- question: |
23+
How does Content Understanding work?
24+
answer: |
25+
Content Understanding utilizes Generative AI models to analyze and interpret various forms of unstructured content. It integrates data from different modalities (for example, text, images, audio) to generate a cohesive and structured output. The service uses machine learning models trained on diverse datasets and generative AI models to ensure high accuracy and relevance in the insights provided.
26+
- question: |
27+
What types of unstructured content can Content Understanding process?
28+
answer: |
29+
Content Understanding can process a wide range of unstructured content, including but not limited to:
30+
* Audio recordings
31+
* Video content
32+
* Documents
33+
* Text content
34+
* Images
35+
- question: |
36+
What are the key benefits of using Content Understanding?
37+
answer: |
38+
The key benefits of using Content Understanding include:
39+
* Confidence scores: Ensure the accuracy of extracted values while minimizing the cost of human review.
40+
* Defined schema: Define a schema to ensure the extracted values align with intended use.
41+
* Quality improvements over time: The service provides capabilities to improve the quality of the schema extracted.
42+
* Improved decision-making: Structured insights help organizations make informed decisions quickly and effectively.
43+
* Increased efficiency: Automating the analysis of unstructured content saves time and reduces the manual effort required.
44+
* Scalability: The service can handle large volumes of data, making it suitable for organizations of all sizes.
45+
- question: |
46+
How can businesses use Content Understanding?
47+
answer: |
48+
Businesses can use Content Understanding in various ways, such as:
49+
* Automation: Automate processing of content to extract a defined schema. Call center, documents, and other similar scenarios.
50+
* Content cataloging: managing a large corpus of digital assets.
51+
* Customer sentiment analysis: Understanding customer feedback from reviews, social media, and support interactions.
52+
* Market research: Analyzing trends and patterns from diverse data sources to inform business strategies.
53+
* Operational insights: Gain insights from internal documents, emails, and other unstructured data to improve operations.
54+
- question: |
55+
Is Content Understanding easy to integrate with existing systems?
56+
answer: |
57+
Yes, Content Understanding easily integrates with existing systems and workflows. The service offers a set of easy-to-use APIs that can be integrated into any application.
58+
- question: |
59+
What security measures are in place to protect data processed by Content Understanding?
60+
answer: |
61+
Azure AI Services, including Content Understanding, adheres to strict security and compliance standards to ensure data protection. These measures include data encryption, secure access controls, and compliance with industry regulations such as GDPR and HIPAA. The service also adheres to Microsoft’s responsible use of AI.
62+
- question: |
63+
How do the capabilities of Azure AI Content Understanding compare to Document Intelligence?
64+
answer: |
65+
Azure AI Content Understanding and Document Intelligence are both powerful tools, but they serve different purposes and have distinct capabilities.
66+
Azure AI Content Understanding integrates various data types like text, images, videos, and audio, providing comprehensive analysis and insights using Generative AI. The service is ideal for applications needing diverse data integration for automation, Search, and Retrieval-Augmented Generation (RAG), analytics, and reporting.
67+
Conversely, Document Intelligence focuses on extracting and processing key data from documents, such as invoices, forms, and contracts, converting unstructured data into structured, usable information.
68+
- question: |
69+
How can I migrate from Document Intelligence to Azure AI Content Understanding?
70+
answer: |
71+
Currently, migration from Document Intelligence to Content Understanding is unavailable.
72+
- question: |
73+
What base models does Azure AI Content Understanding use?
74+
answer: |
75+
Content Understanding uses various models and capabilities from Azure OpenAI, Azure AI Speech, Vision, and Language to support single-modality and multi-modal scenarios. The service determines the selection of base models appropriate for each scenario.
76+
- question: |
77+
What are the pricing tier options for Content Understanding?
78+
answer: |
79+
Content Understanding only supports Standard S0 pricing tier. See more details on the pricing page.
80+
- question: |
81+
How can I get started with Content Understanding?
2182
answer: |
22-
**Yes.**
83+
To get started with Content Understanding, visit the Azure AI Studio and get started with Content Understanding. Azure AI Studio provides comprehensive guides, tutorials, and customer support to help you set up and utilize Content Understanding effectively.
2384
24-
Current users of Document Intelligence can continue using the service during the preview development phase of the Multimodal service. whats-new.md Document Intelligence v4.0 becomes generally available (GA), its features are integrated with the Content Understanding service. Future enhancements related to document scenarios are then accessible via the Content Understanding service. Existing customers can transition to Content Understanding with minimal disruption.
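
The FAQ above lists confidence scores as a way to limit the cost of human review. A minimal sketch of that idea follows, assuming each extracted field carries a confidence value between 0 and 1. The `ExtractedField` class, the `split_for_review` helper, the 0.8 threshold, and the sample values are all hypothetical illustrations and are not part of the Content Understanding API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical container for one extracted value (not a service type)."""
    name: str
    value: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def split_for_review(fields, threshold=0.8):
    """Split fields into (auto_accepted, needs_review) by confidence.

    The threshold is an illustrative default; choose one that fits your
    own tolerance for errors.
    """
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review


# Made-up sample values, for illustration only.
sample = [
    ExtractedField("InvoiceTotal", "1250.00", 0.97),
    ExtractedField("VendorName", "Contoso", 0.62),
]
accepted, review = split_for_review(sample)
```

With this split, only the low-confidence fields would be routed to a person, which is one way to read the "minimizing the cost of human review" benefit.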

articles/ai-services/content-understanding/image/overview.md

Lines changed: 7 additions & 0 deletions
@@ -65,6 +65,13 @@ Content Understanding supports the following image file formats in preview:
6565
| **array**| √ List of subfields of the same type||
6666
| **Object**| √ Named list of subfields of potentially different types. ||
6767

68+
## Data privacy and security
69+
70+
As with all the Azure AI services, developers using the Content Understanding service should be aware of Microsoft's policies on customer data. See our [**Data, protection and privacy**](https://www.microsoft.com/trust-center/privacy) page to learn more.
71+
72+
> [!IMPORTANT]
73+
> If you are using Microsoft products or services to process Biometric Data, you are responsible for: (i) providing notice to data subjects, including with respect to retention periods and destruction; (ii) obtaining consent from data subjects; and (iii) deleting the Biometric Data, all as appropriate and required under applicable Data Protection Requirements. "Biometric Data" will have the meaning set forth in Article 4 of the GDPR and, if applicable, equivalent terms in other data protection requirements. For related information, see [Data and Privacy for Face](/legal/cognitive-services/face/data-privacy-security).
74+
6875
## Next steps
6976

7077
Try processing your content and data using Content Understanding in the [Azure AI Studio](https://ai.azure.com/?tid=888d76fa-54b2-4ced-8ee5-aac1585adee7).

articles/ai-services/content-understanding/overview.md

Lines changed: 12 additions & 6 deletions
@@ -15,22 +15,28 @@ ms.custom: ignite-2024-understanding-release
1515

1616
Azure AI Content Understanding is a cloud-based solution within [**Azure AI services**](../what-are-ai-services.md), designed to process/ingest various data modalities such as documents, images, videos, and audio into customizable output formats using Generative AI, Large Language Models (LLMs), and Small Language Models (SLMs) within a unified workflow.
1717

18-
Built on the success of Document Intelligence, Content Understanding offers a streamlined process to reason over large amounts of unstructured data, build customizable workflows, ultimately accelerating time-to-value (TTV), while varied AI models.
18+
Content Understanding offers a streamlined process to reason over large amounts of unstructured data, build customizable workflows, ultimately accelerating time-to-value (TTV), while varied AI models.
1919

20-
:::image type="content" source="media/overview/content-understanding-process.png" lightbox="media/overview/content-understanding-process.png" alt-text="Screenshot of accepted media input files.":::
20+
:::image type="content" source="media/overview/content-understanding-overview.png" lightbox="media/overview/content-understanding-process.png" alt-text="Screenshot of accepted media input files.":::
2121

22-
### Benefits of using Content Understanding
22+
### Why use Content Understanding?
2323

24-
* **Simplified and streamlined workflows**. Content Understanding simplifies data extraction from mixed modality and unstructured content by eliminating the need for separate workflows.
24+
* **Simplified and streamlined workflows**. Content Understanding unifies the process for extracting data from any modality or combination of modalities, creating a unified approach to processing all types of content.
2525

26-
:::image type="content" source="media/overview/content-understanding-workflow.png" alt-text="Screenshot comparing Content Understanding workflows.":::
26+
* **Simplified Content Extraction**. Content Understanding's schema definition streamlines the generation of structured output from various content types. Users are enabled to define schemas where fields can be extracted, inferred, or abstracted without requiring complex prompt engineering.
2727

2828
* **Efficiency and Cost Reduction**. Automating the ingestion and analysis of large amounts of data from varied sources reduces the cost associated with building Generative AI automation solutions.
2929

3030
* **Enhanced Accuracy**. Content Understanding uses multiple data modalities to simultaneously analyze and cross-validate information, leading to more accurate and reliable results.
3131

3232
### Content Understanding use cases
3333

34+
* **Automation**. Content Understanding can significantly enhance automation by transforming unstructured content into structured data, which can then be seamlessly integrated into various downstream workflows and applications. For example, it can automate procurement and payment processes by extracting fields from invoices.
35+
36+
* **Search and Retrieval Augmented Generation**. Content Understanding enhances Search and Retrieval-Augmented Generation (RAG) by processing diverse unstructured content. The output can be added to a search index and RAG applications, enhancing the search experience with more accurate and relevant results.
37+
38+
* **Analytics and Reporting**: Content Understanding's extracted schema outputs enhance analytics and reporting, allowing businesses to gain valuable insights, conduct deeper analysis and make informed decisions from accurate reports.
39+
3440
* **Business leaders and c-suite executives**. Decision makers gain actionable insights from Content Understanding solutions. Generative
3541
AI powered results and high confidence scores lead to enlightened data-driven decisions and minimize the need for human review.
3642

@@ -66,7 +72,7 @@ At Microsoft, we prioritize advancing AI with a people-first approach. Generativ
6672
As with all the Azure AI services, developers using the Content Understanding service should be aware of Microsoft's policies on customer data. See our [**Data, protection and privacy**](https://www.microsoft.com/trust-center/privacy) page to learn more.
6773

6874
> [!IMPORTANT]
69-
> if you are using Microsoft products or services to process Biometric Data, you are responsible for: (i) providing notice to data subjects, including with respect to retention periods and destruction; (ii) obtaining consent from data subjects; and (iii) deleting the Biometric Data, all as appropriate and required under applicable Data Protection Requirements. "Biometric Data" will have the meaning set forth in Article 4 of the GDPR and, if applicable, equivalent terms in other data protection requirements. For related information, see [Data and Privacy for Face](/legal/cognitive-services/face/data-privacy-security).
75+
> If you are using Microsoft products or services to process Biometric Data, you are responsible for: (i) providing notice to data subjects, including with respect to retention periods and destruction; (ii) obtaining consent from data subjects; and (iii) deleting the Biometric Data, all as appropriate and required under applicable Data Protection Requirements. "Biometric Data" will have the meaning set forth in Article 4 of the GDPR and, if applicable, equivalent terms in other data protection requirements. For related information, see [Data and Privacy for Face](/legal/cognitive-services/face/data-privacy-security).
7076
7177
## Getting started
7278
Before you get started using Content Understanding, you need an [**Azure AI services multi-service resource**](how-to/create-multi-service-resource.md). The multi-service resource enables access to multiple Azure AI services with a single set of credentials.

articles/ai-services/content-understanding/service-limits.md

Lines changed: 14 additions & 16 deletions
@@ -1,7 +1,7 @@
11
---
2-
title: Service quotas and limits - Multimodal Intelligence
2+
title: Service quotas and limits - Content Understanding
33
titleSuffix: Azure AI services
4-
description: Quick reference, detailed description, and best practices for working within Azure AI Multimodal Intelligence service Quotas and Limits
4+
description: Quick reference, detailed description, and best practices for working within Azure AI Content Understanding service Quotas and Limits
55
#services: cognitive-services
66
author: laujan
77
manager: nitinme
@@ -15,7 +15,7 @@ ms.author: lajanuar
1515

1616
# Service limits and quotas
1717

18-
This article provides both a quick reference and detailed description of Azure AI Multimodal Intelligence service quotas and limits.
18+
This article provides both a quick reference and detailed description of Azure AI Content Understanding service quotas and limits.
1919

2020
## File limits
2121

@@ -38,18 +38,18 @@ Each modality covers a set of Multipurpose Internet Mail Extensions (MIME) file
3838

3939
|Modality| Supported File Types | File Size | Resolution | Length |
4040
|--- | --- | --- | --- | --- |
41-
|**Audio** | √ .wav (PCM, ALAW, MULAW) </br>√ .mp3 </br>√.opus, .ogg (Opus)</br>√.flac </br>√ .wma </br>√ .aac </br>√ .amr (AMR-NB, AMR-WB) </br>√.webm (Opus, Vorbis) </br>√ .m4a (AAC, ALAC)</br>√.spx | asynchronous:</br>≤ 200 MB | | asynchronous:</br> ≤ 2 h |
41+
|**Audio** | √ .wav (`PCM`, `ALAW`, `MULAW`) </br>√ .mp3 </br>√.opus, .ogg (Opus)</br>√.flac </br>√ .wma </br>√ .aac </br>√ .amr (AMR-NB, AMR-WB) </br>√.webm (Opus, Vorbis) </br>√ .m4a (`AAC`, `ALAC`)</br>√.spx | asynchronous:</br>≤ 200 MB | | asynchronous:</br> ≤ 2 h |
4242

4343
### Video
4444

4545
|Modality| Supported File Types | File Size | Resolution | Length |
4646
|--- | --- | --- | --- | --- |
47-
|**Video** | √ .mp4, .m4v </br>√ .flv (with H.264 and AAC codecs) </br>√ .wmv, .asf </br>√ .avi (Uncompressed 8bit/10bit) </br>√ .mkv </br>√ .mov | asynchronous:</br>≤2 GB (body) asynchronous:</br>≤20 GB (URL)| Min:</br>320 x 240</br></br>Max:</br>1920 x 1080 | asynchronous:</br>≤30 m (body)</br></br> asynchronous:</br>≤30 m (URL) |
47+
|**Video** | √ .mp4, .m4v </br>√ .flv (with H.264 and `AAC` codecs) </br>√ .wmv, .asf </br>√ .avi (Uncompressed 8bit/10bit) </br>√ .mkv </br>√ .mov | asynchronous:</br>≤2 GB (body) asynchronous:</br>≤20 GB (URL)| Min: 320 x 240</br></br>Max:</br>1920 x 1080 | asynchronous:</br>≤30 m (body)</br></br> asynchronous:</br>≤30 m (URL) |
4848

4949

5050
## Field Schema Limits
5151

52-
A schema in Multimodal Intelligence refers to a defined structure specifying the types of data to be extracted from various types of unstructured content. Unstructured content types include documents, images, videos, and audio. This structured representation of data is crucial for enabling downstream applications to process and analyze the extracted information effectively.
52+
A schema in Content Understanding refers to a defined structure specifying the types of data to be extracted from various types of unstructured content. Unstructured content types include documents, images, videos, and audio. This structured representation of data is crucial for enabling downstream applications to process and analyze the extracted information effectively.
5353

5454
This section details the limits of the field inputs for schema definition.
5555

@@ -64,17 +64,15 @@ This section details the limits of the field inputs for schema definition.
6464
| **array**| √ List of subfields of the same type||
6565
| **Object**| √ Named list of subfields of potentially different types. | 10 (audio, image, video), 50 (document) |
6666

67-
## Analyzer limits per resource
68-
69-
Analyzers in Multimodal Intelligence are specialized components designed to process and extract structured data from various types of unstructured content, such as textual documents, audio, images, and video. These analyzers are tailored to handle specific types of data and tasks, ensuring that the extracted information is accurate and useful for downstream applications.
70-
67+
## Training limits for Custom Document
7168
| Quota | Standard (S0) |
7269
| --- | --- |
73-
| Max models | 100k |
74-
| Max analysis/min | 1000 pages/images four, (4) hours of audio, 1 hour of video |
75-
| Max operations/min | 3000 |
76-
| Free trainings / month | 10 hours |
7770
| Max training file size | 1 GB |
7871
| Max training length | 50k pages/images |
79-
| Max fields | 100 (document), 10(image, audio, video) |
80-
| Max enum values | 300 per schema |
72+
73+
## Resource limits
74+
| Quota | Standard (S0) |
75+
| --- | --- |
76+
| Max analyzers | 100k |
77+
| Max analysis/min | 1000 pages/images, four (4) hours of audio, 1 hour of video |
78+
| Max operations/min | 3000 |
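
The audio and video file limits shown in this diff can be checked on the client before a file is submitted. The sketch below copies the size and length limits from the tables above (audio: ≤ 200 MB and ≤ 2 h; video: ≤ 2 GB in the request body or ≤ 20 GB by URL, and ≤ 30 m). It assumes binary units (1 MB = 1024² bytes), which the source does not specify. The `LIMITS` table and the `check_limits` helper are hypothetical names and are not part of any SDK.

```python
# Assumption: binary units. The source does not say whether MB/GB are
# decimal or binary.
MB = 1024 ** 2
GB = 1024 ** 3

# Limits copied from the File limits tables in service-limits.md.
LIMITS = {
    "audio": {"max_bytes": 200 * MB, "max_seconds": 2 * 60 * 60},
    "video_body": {"max_bytes": 2 * GB, "max_seconds": 30 * 60},
    "video_url": {"max_bytes": 20 * GB, "max_seconds": 30 * 60},
}


def check_limits(kind, size_bytes, duration_seconds):
    """Return a list of limit violations; an empty list means the file fits."""
    limit = LIMITS[kind]
    problems = []
    if size_bytes > limit["max_bytes"]:
        problems.append("file size exceeds limit")
    if duration_seconds > limit["max_seconds"]:
        problems.append("length exceeds limit")
    return problems


# A 150 MB, one-hour audio file is within both limits.
audio_ok = check_limits("audio", 150 * MB, 60 * 60)
# A 3 GB video is too large for the request body but fits the URL limit.
video_body = check_limits("video_body", 3 * GB, 10 * 60)
video_url = check_limits("video_url", 3 * GB, 10 * 60)
```

Keeping the limits in one dictionary means a later change to the published quotas needs only one edit. The video resolution bounds (320 x 240 to 1920 x 1080) are left out here because checking them requires decoding the file.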

articles/ai-services/content-understanding/toc.yml

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ items:
3131
href: audio/overview.md
3232
- name: Video
3333
displayName: video, audio, voice, recognition, synthesis, speaker, identification, verification, diarization, transcription, translation, language, understanding, sentiment, analysis, emotion, detection, pronunciation, model
34+
href: video/overview.md
3435
- name: Image
3536
displayName: image, OCR, optical character recognition, text, extraction, analysis, detection, recognition, model
3637
href: image/overview.md
