articles/ai-services/content-understanding/audio/overview.md
Lines changed: 25 additions & 24 deletions
@@ -2,13 +2,12 @@
 title: Azure AI Content Understanding audio overview
 titleSuffix: Azure AI services
 description: Learn about Azure AI Content Understanding audio solutions
-author: goergenj
-ms.author: lajanuar
+author: laujan
+ms.author: goergenj
 manager: nitinme
 ms.service: azure-ai-content-understanding
 ms.topic: overview
-ms.date: 05/06/2025
-ms.custom: release-preview-2-cu
+ms.date: 05/19/2025
 ---
 
 # Content Understanding audio solutions (preview)
@@ -31,7 +30,7 @@ Here are common scenarios for conversational audio data processing:
 :::image type="content" source="../media/audio/overview/workflow-diagram.png" lightbox="../media/audio/overview/workflow-diagram.png" alt-text="Illustration of Content Understanding audio capabilities.":::
 
 Content Understanding serves as a cornerstone for Speech Analytics solutions, enabling the following capabilities for audio files:
-
+
 ### Content extraction
 
 #### Language handling
@@ -55,7 +54,7 @@ For languages with Fast transcriptions support and for files ≤ 300MB and/or
 
 * **Speaker role detection**. Identifies agent and customer roles within contact center call data.
 
-* **Multilingual transcription**. Generates multilingual transcripts, applying language/locale per phrase. Deviating from language detection this feature is enabled when no language/locale is specified or language is set to 'auto'.
+* **Multilingual transcription**. Generates multilingual transcripts, applying language/locale per phrase. Deviating from language detection this feature is enabled when no language/locale is specified or language is set to `auto`.
 
 > [!NOTE]
 > When Multilingual transcription is used, a file with an unsupported locale produces a result. This result is based on the closest locale but most likely not correct.
[...]
-The prebuild analyzers allow extracting valuable insights into audio content without the need to create an analyzer setup.
+The prebuilt analyzers allow extracting valuable insights into audio content without the need to create an analyzer setup.
 
 All audio analyzers generate transcripts in standard WEBVTT format separated by speaker.
 
 > [!NOTE]
-> Prebuild analyzers are set to use multilingual transcription and returnDetails enabled!
+>
+> Prebuilt analyzers are set to use multilingual transcription and `returnDetails` enabled.
 
 The following prebuild analyzers are available:
 
 **Post-call analysis (prebuilt-callCenter)**. Analyze call recordings to generate:
-- conversation transcripts with speaker role detection result
-- call summary
-- call sentiment
-- top five articles mentioned
-- list of companies mentioned
-- list of people (name and title/role) mentioned
-- list of relevant call categories
-
-**Example result:**
+
+* conversation transcripts with speaker role detection result
+* call summary
+* call sentiment
+* top five articles mentioned
+* list of companies mentioned
+* list of people (name and title/role) mentioned
+* list of relevant call categories
+
+**Example result:**
 ```json
 {
 "id": "bc36da27-004f-475e-b808-8b8aead3b566",
@@ -217,7 +218,7 @@ The following prebuild analyzers are available:
 - conversation transcripts
 - conversation summary
 
-**Example result:**
+**Example result:**
 ```json
 {
 "id": "9624cc49-b6b3-4ce5-be6c-e895d8c2484d",
@@ -262,11 +263,11 @@ The following prebuild analyzers are available:
 }
 ```
 
-You can also customize prebuild analyzers for more fine-grained control of the output by defining custom fields. Customization allows you to use the full power of generative models to extract deep insights from the audio. For example, customization allows you to:
-- Generate other insights
-- Control the language of the field extraction output
-- Configure the transcription behavior
-- and more
+You can also customize prebuilt analyzers for more fine-grained control of the output by defining custom fields. Customization allows you to use the full power of generative models to extract deep insights from the audio. For example, customization allows you to:
+
+* Generate other insights.
+* Control the language of the field extraction output.
[...]
 For an end-2-end quickstart for Speech Analytics solutions, refer to the [Conversation knowledge mining solution accelerator](https://aka.ms/Conversational-Knowledge-Mining).
@@ -293,4 +294,4 @@ Developers using this service should review Microsoft's policies on customer dat
 * Try processing your audio content in [**Azure AI Foundry portal**](https://aka.ms/cu-landing).
 * Learn how to analyze audio content [**analyzer templates**](../quickstart/use-ai-foundry.md).