Commit abf0e00

update audio overview
1 parent 81590f8 commit abf0e00

1 file changed: +22 -17 lines changed

articles/ai-services/content-understanding/audio/overview.md

Lines changed: 22 additions & 17 deletions
@@ -3,7 +3,7 @@ title: Azure AI Content Understanding audio overview
 titleSuffix: Azure AI services
 description: Learn about Azure AI Content Understanding audio solutions
 author: laujan
-ms.author: lajanuar
+ms.author: jagoerge
 manager: nitinme
 ms.service: azure-ai-content-understanding
 ms.topic: overview
@@ -33,20 +33,22 @@ Content Understanding serves as a cornerstone for Speech Analytics solutions, en
 
 ### Content extraction
 
+Audio content extraction is the process of isolating and retrieving specific elements or features from an audio file. This process can include separating individual audio sources; identifying specific segments within a sound file; or detecting and categorizing various characteristics of the audio content.
+
 #### Language handling
 We support different options to handle language processing during transcription.
 
 The following table provides an overview of the options controlled via the 'locales' configuration:
 
 |Locale setting|File size|Supported processing|Supported locales|Result latency|
 |--|--|--|--|--|
-|auto or empty|300MB and/or ≤ 2 hours|Multilingual transcription|de-DE, en-AU, en-CA, en-GB, en-IN, en-US, es-ES, es-MX, fr-CA, fr-FR, hi-IN, it-IT, ja-JP, ko-KR and zh-CN|Near-real-time|
-|auto or empty|> 300MB and >2hr ≤ 4 hours|Multilingual transcription|en-US, es-ES, es-MX, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, pt-BR, zh-CN|Regular|
-|single locale|1GB and/or ≤ 4 hours|Single language transcription|All supported locales[^1]|&bullet;300MB and/or ≤ 2 hours: Near-real-time<br>&bullet; > 300MB and >2hr ≤ 4 hours: Regular|
-|multiple locales|1GB and/or ≤ 4 hours|Single language transcription<br>based on Language Detection|All supported locales[^1]|&bullet;300MB and/or ≤ 2 hours: Near-real-time<br>&bullet; > 300MB and >2hr ≤ 4 hours: Regular|
+|**auto or empty**|300 MB and/or ≤ 2 hours|Multilingual transcription|`de-DE`, `en-AU`,` en-CA`, `en-GB`, `en-IN`, `en-US`, `es-ES`, `es-MX`, `fr-CA`, `fr-FR`, `hi-IN`, `it-IT`, `ja-JP`, `ko-KR`, and `zh-CN`|Near-real-time|
+|**auto or empty**|> 300 MB and >2 HR ≤ 4 hours|Multilingual transcription|`en-US`, `es-ES`, `es-MX`, `fr-FR`, `hi-IN`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, `zh-CN`|Regular|
+|**single locale**|1 GB and/or ≤ 4 hours|Single language transcription|All supported locales[^1]|&bullet;300 MB and/or ≤ 2 hours: Near-real-time<br>&bullet; > 300 MB and >2 HR ≤ 4 hours: Regular|
+|**multiple locales**|1 GB and/or ≤ 4 hours|Single language transcription (based on language detection)|All supported locales[^1]|&bullet;300 MB and/or ≤ 2 hours: Near-real-time<br>&bullet; > 300 MB and >2 HR ≤ 4 hours: Regular|
 
-[^1]: Content Understanding supports the full set of [Azure AI Speech Speech to text languages](../../speech-service/language-support?tabs=stt).
-For languages with Fast transcriptions support and for files ≤ 300MB and/or ≤ 2 hours, transcription time is reduced substantially.
+[^1]: Content Understanding supports the full set of [Azure AI Speech Speech to text languages](../../speech-service/language-support.md).
+For languages with Fast transcriptions support and for files ≤ 300 MB and/or ≤ 2 hours, transcription time is reduced substantially.
 
 * **Transcription**. Converts conversational audio into searchable and analyzable text-based transcripts in WebVTT format. Customizable fields can be generated from transcription data. Sentence-level and word-level timestamps are available upon request.
 
@@ -66,7 +68,8 @@ For languages with Fast transcriptions support and for files ≤ 300MB and/or
 
 Field extraction allows you to extract structured data from audio files, such as summaries, sentiments, and mentioned entities from call logs. You can begin by customizing a suggested analyzer template or creating one from scratch.
 
-## Key Benefits
+## Key benefits
+
 Advanced audio capabilities, including:
 
 * **Customizable data extraction**. Tailor the output to your specific needs by modifying the field schema, allowing for precise data generation and extraction.
@@ -77,7 +80,7 @@ Advanced audio capabilities, including:
 
 * **Scenario adaptability**. Adapt the service to your requirements by generating custom fields and extract relevant data.
 
-## Prebuild audio analyzers
+## Prebuilt audio analyzers
 
 The prebuilt analyzers allow extracting valuable insights into audio content without the need to create an analyzer setup.
 
@@ -87,7 +90,7 @@ All audio analyzers generate transcripts in standard WEBVTT format separated by
 >
 > Prebuilt analyzers are set to use multilingual transcription and `returnDetails` enabled.
 
-The following prebuild analyzers are available:
+The following prebuilt analyzers are available:
 
 **Post-call analysis (prebuilt-callCenter)**. Analyze call recordings to generate:
 
@@ -279,19 +282,21 @@ Capabilities such as topic modeling, key phrase extraction, speech-to-text trans
 Analysts working with large volumes of conversational data can use this solution to extract insights through natural language interaction. It supports tasks like identifying customer support trends, improving contact center quality, and uncovering operational intelligence—enabling teams to spot patterns, act on feedback, and make informed decisions faster.
 
 ## Input requirements
-For a detailed list of supported audio formats, refer to our [Service limits and codecs](../service-limits.md) page.
+
+For a detailed list of supported audio formats, *see* [Service limits and codecs](../service-limits.md).
 
 ## Supported languages and regions
 
-For a complete list of supported regions, languages, and locales, see our [Language and region support](../language-region-support.md)) page.
+For a complete list of supported regions, languages, and locales, see [Language and region support](../language-region-support.md).
 
 ## Data privacy and security
 
-Developers using this service should review Microsoft's policies on customer data. For more information, visit our [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy) page.
+Developers using this service should review Microsoft's policies on customer data. For more information, *see* [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy).
 
 ## Next steps
 
-* Try processing your audio content in [**Azure AI Foundry portal**](https://aka.ms/cu-landing).
-* Learn how to analyze audio content [**analyzer templates**](../quickstart/use-ai-foundry.md).
-* Review code sample: [**audio content extraction**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/blob/main/notebooks/content_extraction.ipynb).
-* Review code sample: [**analyzer templates**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/tree/main/analyzer_templates).
+* Try processing your audio content in the [**Azure AI Foundry portal**](https://aka.ms/cu-landing).
+* Learn how to analyze audio content with [**analyzer templates**](../quickstart/use-ai-foundry.md).
+* Review code samples:
+* [**audio content extraction**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/blob/main/notebooks/content_extraction.ipynb).
+* [**analyzer templates**](https://github.com/Azure-Samples/azure-ai-content-understanding-python/tree/main/analyzer_templates).
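
Reviewer note: the language-handling table touched by this commit can be read as a small decision rule. The sketch below is one possible reading in Python, not part of the service or its SDK. The function name `plan_transcription`, the `TranscriptionPlan` type, and the constants are hypothetical. It also assumes that the table's "and/or" means both limits must hold for the near-real-time tier, that 1 GB equals 1024 MB, and that the 1 GB cap applies only when locales are given explicitly (the table states no size cap for the "auto" rows).

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Thresholds taken from the table; interpretation of "and/or" is an assumption.
FAST_MAX_MB = 300        # upper size bound for the near-real-time tier
FAST_MAX_HOURS = 2.0     # upper duration bound for the near-real-time tier
MAX_HOURS = 4.0          # longest duration listed in the table
EXPLICIT_MAX_MB = 1024   # "1 GB" for single/multiple locales (assumed 1024 MB)


@dataclass(frozen=True)
class TranscriptionPlan:
    """Hypothetical result type: which table row applies to a request."""
    processing: str
    latency: str


def plan_transcription(
    locales: Optional[Sequence[str]],
    size_mb: float,
    duration_hours: float,
) -> TranscriptionPlan:
    """Map a 'locales' setting plus file size and duration to a table row."""
    if duration_hours > MAX_HOURS:
        raise ValueError("duration exceeds the 4-hour limit listed in the table")

    # "auto or empty" row: no locales given, or the single value "auto".
    is_auto = not locales or list(locales) == ["auto"]

    if not is_auto and size_mb > EXPLICIT_MAX_MB:
        raise ValueError("file exceeds the 1 GB limit for explicit locales")

    # Near-real-time only when both the size and duration limits are met.
    fast = size_mb <= FAST_MAX_MB and duration_hours <= FAST_MAX_HOURS
    latency = "Near-real-time" if fast else "Regular"

    if is_auto:
        processing = "Multilingual transcription"
    elif len(locales) == 1:
        processing = "Single language transcription"
    else:
        processing = "Single language transcription (based on language detection)"

    return TranscriptionPlan(processing=processing, latency=latency)
```

For example, `plan_transcription(["en-US"], size_mb=500, duration_hours=3)` falls into the "single locale" row with regular latency under these assumptions. The supported-locale lists for the "auto" rows are left out of the sketch because they differ by tier and are best checked against the table itself.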
