Commit 90b3cad

Merge pull request #1457 from laujan/content-understanding-paul-hsu-updates
Content understanding paul hsu updates
2 parents c4a98e2 + ec8704a commit 90b3cad

30 files changed (+794, -866 lines changed)

articles/ai-services/content-understanding/audio/overview.md

Lines changed: 33 additions & 94 deletions
@@ -20,128 +20,67 @@ ms.custom: ignite-2024-understanding-release
 > * Features, approaches, and processes may change or have constrained capabilities, prior to General Availability (GA).
 > * For more information, *see* [**Supplemental Terms of Use for Microsoft Azure Previews**](https://azure.microsoft.com/support/legal/preview-supplemental-terms).
 
-Content Understanding audio capabilities enable you to transcribe and diarize conversational audio. It can generate enhanced outputs like summaries, special industry record formats, captioning data. Content Understanding audio and audio capabilities enable you to extract valuable information such as key topics, sentiment, and more. To get started, use one of the provided out-of-box prebuilt extraction schemas and start generating results. You can also customize Content Understanding capabilities to meet your business needs as necessary.
+Content Understanding audio analyzers enable transcription and diarization of conversational audio, extracting structured fields such as summaries, sentiments, and key topics. Customize an audio analyzer template to your business needs using [Azure AI Foundry](https://ai.azure.com/) to start generating results.
 
-Here are some of the common scenarios for Content Understanding extracted conversational audio data:
+Here are common scenarios for using Content Understanding with conversational audio data:
 
-* Get customer insights through summarization and sentiment.
+* Gain customer insights through summarization and sentiment analysis.
+* Assess and verify call quality and compliance in call centers.
+* Create automated summaries and metadata for podcast publishing.
 
-* Generate contact center call analytics results.
+## Audio analyzer capabilities
 
-* Assess and verify contact center call quality and compliance for improved processing coverage.
+:::image type="content" source="../media/audio/overview/workflow-diagram.png" lightbox="../media/audio/overview/workflow-diagram.png" alt-text="Illustration of Content Understanding audio workflow.":::
 
-* Generate automated summaries and metadata for podcast platform publishing.
+Content Understanding serves as a cornerstone for Media Asset Management solutions, enabling the following capabilities for audio files:
+
+### Content extraction
 
-* Create a redacted version of the transcript with personal data removed.
+* **Transcription**. Converts conversational audio into searchable and analyzable text-based transcripts in WebVTT format. Customizable fields can be generated from transcription data. Sentence-level and word-level timestamps are available upon request.
 
-* Analyze recordings to find valuable information like most desired topics.
+* **`Diarization`**. Distinguishes between speakers in a conversation, attributing parts of the transcript to specific speakers.
 
-* Generate rich outputs based on conversational audio such as dictated documents.
+* **Speaker role detection**. Identifies agent and customer roles within contact center call data.
 
-## Content Understanding in AI Studio
+* **Language detection**. Automatically detects the language in the audio or uses specified language/locale hints.
 
-AI studio enables you to set up, test, and manage Content Understanding solutions. You can use prebuilt schemas that can be customized to analyze your audio transcripts to easily generate results matching your specific business needs. A typical scenario is to automatically process files uploaded into a blob storage account and write the analytics results back to it. Based on the single file analysis, you can then easily index and add these results to a database or an Azure AI Search Index to easily generate more cross-recording insights and dashboards.
+### Field extraction
 
-* Get insights from audio recordings of meetings, calls, and conversations. Review insights from summaries, sentiment results, action items, meeting notes, and `PII` redacted transcripts.
+Field extraction allows you to extract structured data from audio files, such as summaries, sentiments, and mentioned entities from call logs. You can begin by customizing a suggested analyzer template or creating one from scratch.
 
-* Customize the results according to your specific needs and scenarios to modify the output of the workflow.
-
-* Test and deploy customized workflows easily and quickly, without having to write any code or use any external tools.
-
-* Access and manage your Content Understanding projects and resources in one place, along with other AI services that you use in AI Studio.
-
-You can use the AI Studio to manage audio analytics projects and resources.
-
-* Content Understanding in AI studio offers a user-friendly interface and a seamless setup experience to generate insights from audio data. You can also test and deploy different versions of the output schema directly in AI studio.
-
-* Developers can use the `SDK`s and `REST API`s to process data at scale in production and integrate Content Understanding into Azure Pipelines as needed.
-
-## Content Understanding features for audio processing
-
-Content Understanding is serves as a cornerstone for Media Asset Management solutions and enables the following capabilities for audio files:
-
-* **Extracting content**:
-
-* **Transcription**. Convert audio within conversational audio files into text-based transcripts that can be searched and analyzed. This transcription data is also used as grounding for generating customizable fields.
-
-* **Diarization**. Speaker diarization distinguishes between the speakers participating in a conversation. The Content Understanding service provides information about which part of a transcribed conversation is attributed to a particular speaker.
-
-* **Speaker Role Detection**. Detect and identify agent and customer speaker roles within contact center call data.
-
-* **Supported languages**. Content Understanding audio capabilities support automatic language detection for [**supported languages**](../language-region-support.md#language-support). The feature is automatically active if no locale or multiple locales are selected.
-
-* **Supported audio formats**. Content Understanding audio capabilities support a broad variety of [audio file formats and codes](../language-region-support.md).
-
-* **Audio transcription detailed output**. The complete output from the audio transcription process is returned including, if needed, sentence-level and word-level-timestamps.
-
-* **Generating fields**:
-
-* **Field generation**. Content Understanding enables you to define custom fields and extract and generate data from your audio by including them in the schema definition.
-
-* **Multi-language results**. Content Understanding can generate field schema results in multiple languages when you include a field description in the desired output language.
-
-* **Support for `generate` and `classify` methods for field extraction**. Customize your output formats using user-specified extraction methods.
-
-## Content Understanding audio workflow
-
-The following diagram provides a high-level overview of a typical Content Understanding Audio processing workflow.
-
-:::image type="content" source="../media/audio/overview/workflow-diagram.png" lightbox="../media/audio/overview/workflow-diagram.png" alt-text="Illustration of Content Understanding audio workflow.":::
-
-A typical Content Understanding Audio workflow consists of the following steps:
-
-1. You send audio or transcription files to the Content UnderstandingAPI wither as single file or providing settings to process from a connected blob storage account.
+## Key Benefits
+Content Understanding offers advanced audio capabilities, including:
 
-1. Content UnderstandingContent Extraction generates a conversation transcript incl. speaker separation in webVTT format and optionally recognizes speaker roles or names to replace generic 'Speaker n' results.
+* **Customizable data extraction**. Tailor the output to your specific needs by modifying the field schema, allowing for precise data generation and extraction.
 
-1. The Content UnderstandingField Extraction then generates added insights based on the generated conversation transcript.
+* **Generative models**. Utilize generative AI models to specify in natural language the content you want to extract, and the service generates the desired output.
 
-1. The Content Understanding service returns an audio file results containing the conversation transcript including added generated insights in JSON format. The results are either directly returned from the API or can be written into a connected blob storage account.
+* **Integrated pre-processing**. Benefit from built-in preprocessing steps like transcription, diarization, and role detection, providing rich context for generative models.
 
-## Content Understanding prebuilt audio scenarios
+* **Scenario adaptability**. Adapt the service to your requirements by generating custom fields and extract relevant data.
 
-Content Understanding provides the following customizable prebuilt scenario templates:
+## Content Understanding audio analyzer templates
 
-* **Post call analytics**. Analyze call recordings and generate outputs such as conversation transcript, call summary, sentiment assessment and more.
+Content Understanding offers customizable audio analyzer templates:
 
-* **Conversation summarization**. Generate transcriptions from conversation audio recordings, generate a summary, and assess sentiment.
+* **Post-call analytics**. Analyze call recordings to generate conversation transcripts, call summaries, sentiment assessments, and more.
 
-You can start with any prebuilt scenario or start from scratch to get started and customize as needed to meet your business needs.
+* **Conversation summarization**. Generate transcriptions, summaries, and sentiment assessments from conversation audio recordings.
 
-## Audio format support and input requirements
+Start with a template or create a custom analyzer to meet your specific business needs.
 
-For a complete list of Content Understanding supported audio formats, *see* our [Service limits and codecs](../service-limits.md) page.
+## Input requirements
+For a detailed list of supported audio formats, refer to our [Service limits and codecs](../service-limits.md) page.
 
-## Supported regions, languages, and locales
+## Supported languages and regions
 
 For a complete list of supported regions, languages, and locales, see our [Language and region support](../language-region-support.md)) page.
 
-
-## Content Understanding audio capability limits
-
-|Attribute|Limit|
-|-----|-----|
-|Time|Maximum of 2 hours in length|
-|Size|Maximum of 200 MB in size|
-|Speakers|Maximum number of 36 speakers|
-
-## Key Benefits
-
-Content Understanding provides a specific set of capabilities for audio including:
-
-* **Highly customizable data extraction**. Unlike traditional audio analysis services, Content Understanding allows you to customize the data you want to generate or extract. By modifying the schema, you can tailor the output to match your specific use cases.
-
-* **Generative Models**. You can use our generative AI models to describe in natural language what content you want to extract, and the service generates that output.
-
-* **Integrated Pre-processing**. The service performs several preprocessing steps, such as transcription, diarization and role detection, to provide rich context to the generative models.
-
-* **Scenarios adaptability**. The service can adapt to your needs by generating custom fields to extract the right data.
-
 ## Data privacy and security
 
-As with all the Azure AI services, developers using the Content Understanding service should be aware of Microsoft's policies on customer data. See our [**Data, protection and privacy**](https://www.microsoft.com/trust-center/privacy) page to learn more.
+Developers using Content Understanding should review Microsoft's policies on customer data. For more information, visit our [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy) page.
 
 ## Next steps
 
-To get started using Content Understanding audio capabilities, try our [post-call analytics prebuilt scenario template](../prebuilt-template/post-call-analytics.md).
-
+* Try processing your audio content using Content Understanding in [Azure AI Foundry](https://ai.azure.com/).
+* Learn more about audio [**analyzer templates**](../quickstart/use-ai-foundry.md).
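
The transcription and diarization bullets in this change say that transcripts are produced in WebVTT format with parts of the conversation attributed to speakers. As a rough illustration only, the Python sketch below parses such a transcript into timed, speaker-labelled cues. It assumes that speakers appear as WebVTT voice tags (for example `<v Speaker 1>`); the sample text and the names `Cue`, `parse_webvtt`, and `SAMPLE` are hypothetical and are not taken from the service's documentation.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Cue timing line, e.g. "00:00:00.080 --> 00:00:02.640" (hours are optional in WebVTT).
CUE_TIMING = re.compile(
    r"(?P<start>(?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+(?P<end>(?:\d+:)?\d{2}:\d{2}\.\d{3})"
)
# WebVTT voice span, e.g. "<v Speaker 1>text". Assumption: speakers are encoded this way.
VOICE_TAG = re.compile(
    r"<v(?:\.[^\s>]+)?\s+(?P<speaker>[^>]+)>(?P<text>.*?)(?:</v>|$)", re.S
)

# Hypothetical sample transcript, written by hand for this sketch.
SAMPLE = """WEBVTT

00:00:00.080 --> 00:00:02.640
<v Speaker 1>Hello, thanks for calling.

00:00:03.000 --> 00:00:05.500
<v Speaker 2>Hi, I have a billing question.
"""


@dataclass
class Cue:
    start: float            # seconds from the start of the audio
    end: float              # seconds from the start of the audio
    speaker: Optional[str]  # None when the cue has no voice tag
    text: str


def to_seconds(timestamp: str) -> float:
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds."""
    parts = timestamp.split(":")
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_webvtt(text: str) -> List[Cue]:
    """Split a WebVTT document into cues, keeping the speaker when present."""
    cues: List[Cue] = []
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        # Blocks without a timing line (header, NOTE, STYLE) are skipped.
        timing_index = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing_index is None:
            continue
        timing = CUE_TIMING.search(lines[timing_index])
        if timing is None:
            continue
        payload = "\n".join(lines[timing_index + 1:]).strip()
        voice = VOICE_TAG.match(payload)
        speaker = voice.group("speaker").strip() if voice else None
        body = voice.group("text").strip() if voice else payload
        cues.append(Cue(to_seconds(timing["start"]), to_seconds(timing["end"]), speaker, body))
    return cues


if __name__ == "__main__":
    for cue in parse_webvtt(SAMPLE):
        print(cue)
```

Keeping each cue's start and end as seconds makes it simple to total speaking time per speaker or to align generated fields with positions in the recording.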

articles/ai-services/content-understanding/concept/content-extraction.md

Lines changed: 0 additions & 63 deletions
This file was deleted.
