Commit 7c9c609

Merge pull request #1583 from MicrosoftDocs/release-ignite-multimodal-intelligence-preview
Release ignite multimodal intelligence preview -> main -- 11/19 10:00 AM PST
2 parents 1dddf35 + eb1d65b commit 7c9c609

49 files changed

+1570
-7
lines changed
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
---
2+
title: Azure AI Content Understanding audio overview
3+
titleSuffix: Azure AI services
4+
description: Learn about Azure AI Content Understanding audio solutions
5+
author: laujan
6+
ms.author: lajanuar
7+
manager: nitinme
8+
ms.service: azure
9+
ms.topic: overview
10+
ms.date: 11/19/2024
11+
ms.custom: ignite-2024-understanding-release
12+
---
13+
14+
15+
# Content Understanding audio solutions (preview)
16+
17+
> [!IMPORTANT]
18+
>
19+
> * Azure AI Content Understanding is available in preview. Public preview releases provide early access to features that are in active development.
20+
> * Features, approaches, and processes may change or have constrained capabilities, prior to General Availability (GA).
21+
> * For more information, *see* [**Supplemental Terms of Use for Microsoft Azure Previews**](https://azure.microsoft.com/support/legal/preview-supplemental-terms).
22+
23+
Content Understanding audio analyzers enable transcription and diarization of conversational audio, extracting structured fields such as summaries, sentiments, and key topics. Customize an audio analyzer template to your business needs using [Azure AI Foundry](https://ai.azure.com/) to start generating results.
24+
25+
Here are common scenarios for using Content Understanding with conversational audio data:
26+
27+
* Gain customer insights through summarization and sentiment analysis.
28+
* Assess and verify call quality and compliance in call centers.
29+
* Create automated summaries and metadata for podcast publishing.
30+
31+
## Audio analyzer capabilities
32+
33+
:::image type="content" source="../media/audio/overview/workflow-diagram.png" lightbox="../media/audio/overview/workflow-diagram.png" alt-text="Illustration of Content Understanding audio workflow.":::
34+
35+
Content Understanding serves as a cornerstone for Media Asset Management solutions, enabling the following capabilities for audio files:
36+
37+
### Content extraction
38+
39+
* **Transcription**. Converts conversational audio into searchable and analyzable text-based transcripts in WebVTT format. Customizable fields can be generated from transcription data. Sentence-level and word-level timestamps are available upon request.
40+
41+
* **`Diarization`**. Distinguishes between speakers in a conversation, attributing parts of the transcript to specific speakers.
42+
43+
* **Speaker role detection**. Identifies agent and customer roles within contact center call data.
44+
45+
* **Language detection**. Automatically detects the language in the audio or uses specified language/locale hints.
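
As a rough illustration of consuming a WebVTT transcript, the following Python sketch parses cues into start time, end time, speaker, and text. The sample transcript, the `Cue` class, and the use of WebVTT voice tags (`<v Speaker>`) to carry speaker labels are assumptions made for illustration; the transcript the service returns may be shaped differently.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical sample transcript; real service output may differ.
SAMPLE_VTT = """WEBVTT

00:00:00.000 --> 00:00:02.500
<v Agent>Thank you for calling. How can I help?

00:00:02.500 --> 00:00:05.000
<v Customer>I have a question about my bill.
"""

# Matches a cue timing line such as "00:00:00.000 --> 00:00:02.500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)
# Matches a WebVTT voice tag and captures the speaker name and the text.
VOICE = re.compile(r"<v ([^>]+)>(.*)", re.DOTALL)


@dataclass
class Cue:
    start_s: float
    end_s: float
    speaker: Optional[str]
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert timestamp components to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_vtt(vtt: str) -> List[Cue]:
    """Parse cue blocks separated by blank lines; skip blocks without timing."""
    cues = []
    for block in vtt.strip().split("\n\n"):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            timing = TIMING.match(line)
            if timing:
                break
        else:
            continue  # header or other non-cue block
        payload = " ".join(lines[i + 1:]).strip()
        voice = VOICE.match(payload)
        speaker = voice.group(1) if voice else None
        text = voice.group(2) if voice else payload
        text = text.replace("</v>", "").strip()
        parts = timing.groups()
        cues.append(Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), speaker, text))
    return cues


for cue in parse_vtt(SAMPLE_VTT):
    print(f"{cue.start_s:6.2f}-{cue.end_s:6.2f} {cue.speaker}: {cue.text}")
```

Keeping the timestamps as seconds makes it straightforward to align transcript text with other time-based data, such as extracted fields that reference a time range.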
46+
47+
### Field extraction
48+
49+
Field extraction allows you to extract structured data from audio files, such as summaries, sentiments, and mentioned entities from call logs. You can begin by customizing a suggested analyzer template or creating one from scratch.
50+
51+
## Key benefits
52+
Content Understanding offers advanced audio capabilities, including:
53+
54+
* **Customizable data extraction**. Tailor the output to your specific needs by modifying the field schema, allowing for precise data generation and extraction.
55+
56+
* **Generative models**. Specify in natural language the content you want to extract, and generative AI models produce the desired output.
57+
58+
* **Integrated pre-processing**. Benefit from built-in preprocessing steps like transcription, diarization, and role detection, providing rich context for generative models.
59+
60+
* **Scenario adaptability**. Adapt the service to your requirements by generating custom fields and extracting relevant data.
61+
62+
## Content Understanding audio analyzer templates
63+
64+
Content Understanding offers customizable audio analyzer templates:
65+
66+
* **Post-call analytics**. Analyze call recordings to generate conversation transcripts, call summaries, sentiment assessments, and more.
67+
68+
* **Conversation summarization**. Generate transcriptions, summaries, and sentiment assessments from conversation audio recordings.
69+
70+
Start with a template or create a custom analyzer to meet your specific business needs.
71+
72+
## Input requirements
73+
For a detailed list of supported audio formats, refer to our [Service limits and codecs](../service-limits.md) page.
74+
75+
## Supported languages and regions
76+
77+
For a complete list of supported regions, languages, and locales, see our [Language and region support](../language-region-support.md) page.
78+
79+
## Data privacy and security
80+
81+
Developers using Content Understanding should review Microsoft's policies on customer data. For more information, visit our [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy) page.
82+
83+
## Next steps
84+
85+
* Try processing your audio content using Content Understanding in [Azure AI Foundry](https://ai.azure.com/).
86+
* Learn more about audio [**analyzer templates**](../quickstart/use-ai-foundry.md).
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
1+
---
2+
title: Azure AI Content Understanding document overview
3+
titleSuffix: Azure AI services
4+
description: Learn about Azure AI Content Understanding document solutions.
5+
author: laujan
6+
ms.author: lajanuar
7+
manager: nitinme
8+
ms.service: azure
9+
ms.topic: overview
10+
ms.date: 11/19/2024
11+
ms.custom: ignite-2024-understanding-release
12+
---
13+
14+
# Content Understanding document solutions (preview)
15+
16+
> [!IMPORTANT]
17+
>
18+
> * Azure AI Content Understanding is available in preview. Public preview releases provide early access to features that are in active development.
19+
> * Features, approaches, and processes may change or have constrained capabilities, prior to General Availability (GA).
20+
> * For more information, *see* [**Supplemental Terms of Use for Microsoft Azure Previews**](https://azure.microsoft.com/support/legal/preview-supplemental-terms).
21+
22+
Content Understanding is a cloud-based [Azure AI Service](../../what-are-ai-services.md) designed to efficiently extract content and structured fields from documents and forms. It provides a comprehensive suite of APIs and an intuitive user experience.
23+
24+
Content Understanding enables organizations to streamline data collection and processing, enhance operational efficiency, optimize data-driven decision making, and empower innovation. With customizable analyzers, Content Understanding allows for easy extraction of content or fields from documents and forms, tailored to specific business needs.
25+
26+
## Business use cases
27+
28+
Document analyzers can process complex documents in various formats and templates:
29+
30+
* **Contract lifecycle management**: Extract key fields, clauses, and obligations from various contract types.
31+
* **Loan and mortgage applications**: Automate processing to enable quicker handling by banks, lenders, and government entities.
32+
* **Financial services**: Analyze complex documents like financial reports and asset management reports.
33+
* **Expense management**: Parse receipts and invoices from various retailers to validate expenses across different formats and templates.
34+
35+
36+
## Document analyzer capabilities
37+
38+
:::image type="content" source="../media/document/extraction-overview.png" alt-text="Screenshot of document extraction flow.":::
39+
40+
Content extraction enables the extraction of both printed and handwritten text from forms and documents, delivering business-ready content that is immediately actionable, usable, or adaptable for further development within your organization.
41+
42+
### Add-on capabilities
43+
44+
Enhance your document extraction with optional add-on features, which can incur added costs. These features can be enabled or disabled based on your needs. Currently supported add-ons include:
45+
46+
* **Layout**: Extracts layout information such as paragraphs, sections, tables, and more.
47+
* **Barcode**: Identifies and decodes all barcodes in the documents.
48+
* **Formula**: Recognizes mathematical equations in the documents.
49+
50+
51+
### Field extraction
52+
53+
Field extraction enables the extraction of structured data from various forms and documents tailored to your specific needs. For instance, you can extract customer names, billing addresses, and line items from invoices; or parties, renewal date, and payment clause from contracts. You can start field extraction right after defining the schema or enhance it by labeling more sample documents to improve extraction quality.
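
To make the idea of a field schema more concrete, here is a minimal Python sketch of the invoice example. The `FieldDefinition` class, its attribute names, and the value type strings are hypothetical and chosen for illustration; they are not the service's request format. The three generation methods (extract, classify, generate) are the ones the Content Understanding glossary describes.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

# Generation methods named in the Content Understanding glossary.
ALLOWED_METHODS = {"extract", "classify", "generate"}


@dataclass
class FieldDefinition:
    """Hypothetical description of one field to extract."""
    name: str
    description: str
    value_type: str          # illustrative type label, for example "string"
    method: str = "extract"  # how the value is produced

    def __post_init__(self) -> None:
        if self.method not in ALLOWED_METHODS:
            raise ValueError(f"Unknown generation method: {self.method}")


# Illustrative schema for the invoice fields mentioned above.
invoice_schema = [
    FieldDefinition("CustomerName", "Name of the customer being billed", "string"),
    FieldDefinition("BillingAddress", "Address the invoice is billed to", "string"),
    FieldDefinition("LineItems", "Purchased items with quantity and price", "array"),
]


def schema_to_json(fields: List[FieldDefinition]) -> str:
    """Serialize the schema so it can be reviewed or stored."""
    return json.dumps([asdict(f) for f in fields], indent=2)


print(schema_to_json(invoice_schema))
```

Writing clear field descriptions matters in a sketch like this because the description is the natural place to state what should be extracted and how ambiguous cases should be handled.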
54+
55+
## Key benefits
56+
57+
* **Accuracy and reliability:** Ensure precise data extraction, reducing errors and boosting efficiency.
58+
* **Scalability:** Seamlessly scale out document processing to meet business demands.
59+
* **Customizable:** Adapt document analyzers to fit specific workflows.
60+
* **Grounding source:** Localize extracted data for human review workflows.
61+
* **Confidence scores:** Enhance automation with estimated confidence scores to maximize efficiency and minimize costs.
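
One way confidence scores can support automation is to accept high-confidence fields automatically and route the rest to human review. The sketch below assumes each extracted field is a dictionary with hypothetical `name`, `value`, and `confidence` keys and uses an arbitrary threshold of 0.8; neither the field shape nor the threshold comes from the service.

```python
from typing import Dict, List, Tuple

Field = Dict[str, object]


def route_fields(
    fields: List[Field], threshold: float = 0.8
) -> Tuple[List[Field], List[Field]]:
    """Split fields into auto-accepted and needs-review lists by confidence."""
    accepted: List[Field] = []
    needs_review: List[Field] = []
    for field in fields:
        if float(field["confidence"]) >= threshold:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


# Illustrative extraction result; the values are made up.
extracted = [
    {"name": "CustomerName", "value": "Contoso Ltd.", "confidence": 0.97},
    {"name": "BillingAddress", "value": "1 Main St", "confidence": 0.62},
    {"name": "InvoiceTotal", "value": "42.00", "confidence": 0.80},
]

accepted, needs_review = route_fields(extracted)
print("accepted:", [f["name"] for f in accepted])
print("needs review:", [f["name"] for f in needs_review])
```

The right threshold depends on the cost of an error in your workflow, so it is worth tuning it against a sample of documents that a person has already reviewed.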
62+
63+
## Input requirements
64+
For detailed information on supported input document formats, refer to our [Service quotas and limits](../service-limits.md) page.
65+
66+
## Supported languages and regions
67+
For a detailed list of supported languages and regions, visit our [Language and region support](../language-region-support.md) page.
68+
69+
## Data privacy and security
70+
Developers using Content Understanding should review Microsoft's policies on customer data. For more information, visit our [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy) page.
71+
72+
## Next steps
73+
* Try processing your document content using Content Understanding in [Azure AI Foundry](https://ai.azure.com/).
74+
* Learn more about document [**analyzer templates**](../quickstart/use-ai-foundry.md).
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
### YamlMime:FAQ
2+
metadata:
3+
title: Azure AI Content Understanding FAQ
4+
description: Get answers to frequently asked questions about the Azure AI Content Understanding service.
5+
author: laujan
6+
manager: nitinme
7+
ms.service: azure
8+
ms.topic: faq
9+
ms.date: 11/19/2024
10+
ms.author: lajanuar
11+
title: Frequently asked questions
12+
summary: |
13+
14+
Find answers to commonly asked questions about Azure AI Content Understanding
15+
sections:
16+
- name: Overview
17+
questions:
18+
- question: |
19+
What is Content Understanding?
20+
answer: |
21+
Content Understanding is a new Azure AI Service designed to generate structured insights from unstructured content using artificial intelligence. It provides a consistent experience for extracting content or a structured schema from audio, video, images, documents, or text inputs.
22+
- question: |
23+
How does Content Understanding work?
24+
answer: |
25+
Content Understanding utilizes Generative AI models to analyze and interpret various forms of unstructured content. It integrates data from different modalities (for example, text, images, audio) to generate a cohesive and structured output. The service uses machine learning models trained on diverse datasets and generative AI models to ensure high accuracy and relevance in the insights provided.
26+
- question: |
27+
What types of unstructured content can Content Understanding process?
28+
answer: |
29+
Content Understanding can process a wide range of unstructured content, including but not limited to:
30+
* Audio recordings
31+
* Video content
32+
* Documents
33+
* Text content
34+
* Images
35+
- question: |
36+
What are the key benefits of using Content Understanding?
37+
answer: |
38+
The key benefits of using Content Understanding include:
39+
* Confidence scores: Ensure the accuracy of extracted values while minimizing the cost of human review.
40+
* Defined schema: Define a schema to ensure the extracted values align with intended use.
41+
* Quality improvements over time: The service provides capabilities to improve the quality of the schema extracted.
42+
* Improved decision-making: Structured insights help organizations make informed decisions quickly and effectively.
43+
* Increased efficiency: Automating the analysis of unstructured content saves time and reduces the manual effort required.
44+
* Scalability: The service can handle large volumes of data, making it suitable for organizations of all sizes.
45+
- question: |
46+
How can businesses use Content Understanding?
47+
answer: |
48+
Businesses can use Content Understanding in various ways, such as:
49+
* Automation: Automate the processing of content to extract a defined schema in scenarios such as call centers and document processing.
50+
* Content cataloging: Manage a large corpus of digital assets.
51+
* Customer sentiment analysis: Understanding customer feedback from reviews, social media, and support interactions.
52+
* Market research: Analyzing trends and patterns from diverse data sources to inform business strategies.
53+
* Operational insights: Gain insights from internal documents, emails, and other unstructured data to improve operations.
54+
- question: |
55+
Is Content Understanding easy to integrate with existing systems?
56+
answer: |
57+
Yes, Content Understanding easily integrates with existing systems and workflows. The service offers a set of easy-to-use APIs that can be integrated into any application.
58+
- question: |
59+
What security measures are in place to protect data processed by Content Understanding?
60+
answer: |
61+
Azure AI Services, including Content Understanding, adhere to strict security and compliance standards to ensure data protection. These measures include data encryption, secure access controls, and compliance with industry regulations such as GDPR and HIPAA. The service also adheres to Microsoft’s principles for the responsible use of AI.
62+
- question: |
63+
What base models does Azure AI Content Understanding use?
64+
answer: |
65+
Content Understanding uses various models and capabilities from Azure OpenAI, Azure AI Speech, Vision, and Language to support single-modality and multimodal scenarios. The service selects the base models appropriate for each scenario.
66+
- question: |
67+
What are the pricing tier options for Content Understanding?
68+
answer: |
69+
Content Understanding supports only the Standard S0 pricing tier. For more details, see the pricing page.
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
---
2+
title: Content Understanding Glossary
3+
titleSuffix: Azure AI services
4+
description: Quick reference with detailed descriptions of Content Understanding terms and definitions
5+
author: laujan
6+
manager: nitinme
7+
ms.service: azure
8+
ms.topic: conceptual
9+
ms.date: 11/19/2024
10+
ms.author: lajanuar
11+
---
12+
13+
# Content Understanding terminology
14+
15+
| Term | Description |
16+
|:---------|:----------|
17+
| **File** | Any type of data, including text, documents, images, videos, and audio. |
18+
| **File type** | The MIME type of a file, such as text/plain, application/pdf, image/jpeg, audio/wav, and video/mp4. Generic categories like *document* refer to all corresponding MIME types supported by the service. |
19+
| **Analyzer** | A component that processes and extracts content and structured fields from files. Content Understanding offers a few analyzer templates for common scenarios. |
20+
| **Analyzer template** | A predefined configuration and field schema for an analyzer. It simplifies creating analyzers by allowing modifications to a template instead of starting from scratch. This feature is available only in AI Foundry, not via REST API/SDKs. |
21+
| **Analyzer result** | The output generated by an analyzer after processing input data. It typically includes extracted content in Markdown, extracted fields, and optional modality-specific details. |
22+
| **Add-ons** | Added features that enhance content extraction results, such as layout elements, barcodes, and figures in documents. |
23+
| **Fields** | List of structured key-value pairs derived from the content, as defined by the field schema. [Learn more about supported field value types.](service-limits.md) |
24+
| **Field schema** | A formal description of the fields to extract from the input. It specifies the name, description, value type, generation method, and more for each field. |
25+
| **Generation method** | The process of determining the extracted value of a specified field. Content Understanding supports: <br/> &bullet; **Extract**: Directly extract values from the input content, such as dates from receipts or item details from invoices. <br/> &bullet; **Classify**: Classify content into predefined categories, such as call sentiment or chart type. <br/> &bullet; **Generate**: Generate values from input data, such as summarizing an audio conversation or generating scene descriptions from videos. |
26+
| **Span** | A reference indicating the location of an element (for example, field, word) within the extracted Markdown content. A span is represented by a character offset and a length. Different programming languages use various character encodings, which can affect the exact offset and length values for Unicode text. To avoid confusion, spans are only returned if the desired encoding is explicitly specified in the request. Some elements (for example, a page) can map to multiple spans if they aren't contiguous in the Markdown. |
27+
| **Grounding source** | The specific regions in content where a value was generated. It has different representations depending on the file type: <br>&bullet; **Image** - A polygon in the image, often an axis-aligned rectangle (bounding box). <br>&bullet; **PDF/TIFF** - A polygon on a specific page, often a quadrilateral. <br>&bullet; **Audio** - A start and end time range. <br>&bullet; **Video** - A start and end time range with an optional polygon in each frame, often a bounding box.|
28+
| **Confidence score** | The level of certainty that the extracted data is accurate. |
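
The encoding dependence described in the **Span** entry can be seen with a short Python sketch. It computes the offset and length of the same substring in Unicode code points, UTF-16 code units, and UTF-8 bytes. The sample text and the `span_in_encoding` helper are illustrative assumptions and not part of the service.

```python
from typing import Tuple

# Size in bytes of one code unit for each encoding used below.
CODE_UNIT_BYTES = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def span_in_encoding(
    text: str, start: int, length: int, encoding: str
) -> Tuple[int, int]:
    """Convert a code point span into (offset, length) in code units."""
    unit = CODE_UNIT_BYTES[encoding]
    prefix = text[:start].encode(encoding)
    body = text[start:start + length].encode(encoding)
    return len(prefix) // unit, len(body) // unit


# The emoji needs two UTF-16 code units and four UTF-8 bytes;
# the euro sign needs one UTF-16 code unit and three UTF-8 bytes.
text = "Total 😀 due: €42"
start = text.index("€42")  # offset in code points (Python str indexing)
length = len("€42")

for encoding in ("utf-32-le", "utf-16-le", "utf-8"):
    print(encoding, span_in_encoding(text, start, length, encoding))
```

Because the three results differ for the same substring, a client that slices the Markdown content should use the same encoding it requested for spans.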
