
Commit 3dec33b

Merge branch 'release-preview-2-cu' into joe-4669-video-overview
2 parents 6ecb433 + 95a2fbb

File tree: 5 files changed, +281 -68 lines changed


articles/ai-foundry/how-to/evaluation-github-action.md

Lines changed: 4 additions & 1 deletion
@@ -107,6 +107,9 @@ jobs:
   run-action:
     runs-on: ubuntu-latest
     steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
       - name: Azure login using Federated Credentials
         uses: azure/login@v2
         with:
@@ -115,7 +118,7 @@ jobs:
           subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

       - name: Run Evaluation
-        uses: microsoft/ai-agent-evals@v1
+        uses: microsoft/ai-agent-evals@v1-beta
         with:
          # Replace placeholders with values for your Azure AI Project
          azure-aiproject-connection-string: "<your-ai-project-conn-str>"
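Applied together, the two hunks leave the `run-action` job reading roughly as follows. This is a reconstruction from the diff's context lines only: indentation is restored to standard workflow layout, and the `azure/login` inputs the diff doesn't show are left as a placeholder comment rather than guessed.

```yaml
run-action:
  runs-on: ubuntu-latest
  steps:
    - name: Checkout
      uses: actions/checkout@v4

    - name: Azure login using Federated Credentials
      uses: azure/login@v2
      with:
        # ...login inputs not shown in the diff...
        subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

    - name: Run Evaluation
      uses: microsoft/ai-agent-evals@v1-beta
      with:
        # Replace placeholders with values for your Azure AI Project
        azure-aiproject-connection-string: "<your-ai-project-conn-str>"
```

The new checkout step makes the repository contents available to the later steps, and the evaluation action reference moves from the `v1` tag to `v1-beta`.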

articles/ai-services/content-understanding/audio/overview.md

Lines changed: 213 additions & 14 deletions
@@ -3,12 +3,11 @@ title: Azure AI Content Understanding audio overview
 titleSuffix: Azure AI services
 description: Learn about Azure AI Content Understanding audio solutions
 author: laujan
-ms.author: lajanuar
+ms.author: jagoerge
 manager: nitinme
 ms.service: azure-ai-content-understanding
 ms.topic: overview
 ms.date: 05/19/2025
-ms.custom: ignite-2024-understanding-release
 ---


@@ -33,16 +32,34 @@ Here are common scenarios for using Content Understanding with conversational au
 :::image type="content" source="../media/audio/overview/workflow-diagram.png" lightbox="../media/audio/overview/workflow-diagram.png" alt-text="Illustration of Content Understanding audio workflow.":::

 Content Understanding serves as a cornerstone for Media Asset Management solutions, enabling the following capabilities for audio files:
-
+
 ### Content extraction

 * **Transcription**. Converts conversational audio into searchable and analyzable text-based transcripts in WebVTT format. Customizable fields can be generated from transcription data. Sentence-level and word-level timestamps are available upon request.

-* **`Diarization`**. Distinguishes between speakers in a conversation, attributing parts of the transcript to specific speakers.
+> [!NOTE]
+>
+> Content Understanding supports the full set of [Azure AI Speech Speech to text languages](../../speech-service/language-support.md).
+> For languages with fast transcriptions support and for files ≤ 300 MB and/or ≤ 2 hours, transcription time is reduced substantially.
+
+* **Diarization**. Distinguishes between speakers in a conversation, attributing parts of the transcript to specific speakers.

 * **Speaker role detection**. Identifies agent and customer roles within contact center call data.

-* **Language detection**. Automatically detects the language in the audio or uses specified language/locale hints.
+* **Multilingual transcription**. Generates multilingual transcripts, applying language/locale per phrase. Deviating from language detection this feature is enabled when no language/locale is specified or language is set to `auto`.
+
+> [!NOTE]
+>
+> The following locales are currently supported:
+> * **Files ≤ 300 MB and/or ≤ 2 hours**: de-DE, en-AU, en-CA, en-GB, en-IN, en-US, es-ES, es-MX, fr-CA, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, and zh-CN.
+> * **Files larger than 300 MB and/or longer than 4 hours**: en-US, es-ES, es-MX, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, pt-BR, zh-CN.
+
+* **Language detection**. Automatically detects the dominant language/locale which is used to transcribe the file. Set multiple languages/locales to enable language detection.
+
+> [!NOTE]
+>
+> For files larger than 300 MB and/or longer than 2 hours and locales unsupported by Fast transcription, the file is processed generating a multilingual transcript based on the specified locales.
+> In case language detection fails, the first language/locale defined is used to transcribe the file.

 ### Field extraction

@@ -59,15 +76,197 @@ Content Understanding offers advanced audio capabilities, including:

 * **Scenario adaptability**. Adapt the service to your requirements by generating custom fields and extract relevant data.

-## Content Understanding audio analyzer templates
-
-Content Understanding offers customizable audio analyzer templates:
-
-* **Post-call analysis**. Analyze call recordings to generate conversation transcripts, call summaries, sentiment assessments, and more.
-
-* **Conversation analysis**. Generate transcriptions, summaries, and sentiment assessments from conversation audio recordings.
-
-Start with a template or create a custom analyzer to meet your specific business needs.
+## Content Understanding prebuilt audio analyzers
+
+The prebuilt analyzers allow extracting valuable insights into audio content without the need to create an analyzer setup.
+
+All audio analyzers generate transcripts in standard WEBVTT format separated by speaker.
+
+> [!NOTE]
+>
+> Prebuilt analyzers are set to use multilingual transcription and `returnDetails` enabled.
+
+Content Understanding offers the following prebuilt analyzers:
+
+**Post-call analysis (prebuilt-callCenter)**. Analyze call recordings to generate:
+
+* conversation transcripts with speaker role detection result
+* call summary
+* call sentiment
+* top five articles mentioned
+* list of companies mentioned
+* list of people (name and title/role) mentioned
+* list of relevant call categories
+
+**Example result:**
+```json
+{
+  "id": "bc36da27-004f-475e-b808-8b8aead3b566",
+  "status": "Succeeded",
+  "result": {
+    "analyzerId": "prebuilt-callCenter",
+    "apiVersion": "2025-05-01-preview",
+    "createdAt": "2025-05-06T22:53:28Z",
+    "stringEncoding": "utf8",
+    "warnings": [],
+    "contents": [
+      {
+        "markdown": "# Audio: 00:00.000 => 00:32.183\n\nTranscript\n```\nWEBVTT\n\n00:00.080 --> 00:00.640\n<v Agent>Good day.\n\n00:00.960 --> 00:02.240\n<v Agent>Welcome to Contoso.\n\n00:02.560 --> 00:03.760\n<v Agent>My name is John Doe.\n\n00:03.920 --> 00:05.120\n<v Agent>How can I help you today?\n\n00:05.440 --> 00:06.320\n<v Agent>Yes, good day.\n\n00:06.720 --> 00:08.160\n<v Agent>My name is Maria Smith.\n\n00:08.560 --> 00:11.280\n<v Agent>I would like to inquire about my current point balance.\n\n00:11.680 --> 00:12.560\n<v Agent>No problem.\n\n00:12.880 --> 00:13.920\n<v Agent>I am happy to help.\n\n00:14.240 --> 00:16.720\n<v Agent>I need your date of birth to confirm your identity.\n\n00:17.120 --> 00:19.600\n<v Agent>It is April 19th, 1988.\n\n00:20.000 --> 00:20.480\n<v Agent>Great.\n\n00:20.800 --> 00:24.160\n<v Agent>Your current point balance is 599 points.\n\n00:24.560 --> 00:26.160\n<v Agent>Do you need any more information?\n\n00:26.480 --> 00:27.200\n<v Agent>No, thank you.\n\n00:27.600 --> 00:28.320\n<v Agent>That was all.\n\n00:28.720 --> 00:29.280\n<v Agent>Goodbye.\n\n00:29.680 --> 00:30.320\n<v Agent>You're welcome.\n\n00:30.640 --> 00:31.840\n<v Agent>Goodbye at Contoso.\n```",
+        "fields": {
+          "Summary": {
+            "type": "string",
+            "valueString": "Maria Smith contacted Contoso to inquire about her current point balance. After confirming her identity with her date of birth, the agent, John Doe, informed her that her balance was 599 points. Maria did not require any further assistance, and the call concluded politely."
+          },
+          "Topics": {
+            "type": "array",
+            "valueArray": [
+              {
+                "type": "string",
+                "valueString": "Point balance inquiry"
+              },
+              {
+                "type": "string",
+                "valueString": "Identity confirmation"
+              },
+              {
+                "type": "string",
+                "valueString": "Customer service"
+              }
+            ]
+          },
+          "Companies": {
+            "type": "array",
+            "valueArray": [
+              {
+                "type": "string",
+                "valueString": "Contoso"
+              }
+            ]
+          },
+          "People": {
+            "type": "array",
+            "valueArray": [
+              {
+                "type": "object",
+                "valueObject": {
+                  "Name": {
+                    "type": "string",
+                    "valueString": "John Doe"
+                  },
+                  "Role": {
+                    "type": "string",
+                    "valueString": "Agent"
+                  }
+                }
+              },
+              {
+                "type": "object",
+                "valueObject": {
+                  "Name": {
+                    "type": "string",
+                    "valueString": "Maria Smith"
+                  },
+                  "Role": {
+                    "type": "string",
+                    "valueString": "Customer"
+                  }
+                }
+              }
+            ]
+          },
+          "Sentiment": {
+            "type": "string",
+            "valueString": "Positive"
+          },
+          "Categories": {
+            "type": "array",
+            "valueArray": [
+              {
+                "type": "string",
+                "valueString": "Business"
+              }
+            ]
+          }
+        },
+        "kind": "audioVisual",
+        "startTimeMs": 0,
+        "endTimeMs": 32183,
+        "transcriptPhrases": [
+          {
+            "speaker": "Agent",
+            "startTimeMs": 80,
+            "endTimeMs": 640,
+            "text": "Good day.",
+            "words": []
+          }, ...
+          {
+            "speaker": "Customer",
+            "startTimeMs": 5440,
+            "endTimeMs": 6320,
+            "text": "Yes, good day.",
+            "words": []
+          }, ...
+        ]
+      }
+    ]
+  }
+}
+```
+
+**Conversation analysis (prebuilt-audioAnalyzer)**. Analyze recordings to generate:
+- conversation transcripts
+- conversation summary
+
+**Example result:**
+```json
+{
+  "id": "9624cc49-b6b3-4ce5-be6c-e895d8c2484d",
+  "status": "Succeeded",
+  "result": {
+    "analyzerId": "prebuilt-audioAnalyzer",
+    "apiVersion": "2025-05-01-preview",
+    "createdAt": "2025-05-06T23:00:12Z",
+    "stringEncoding": "utf8",
+    "warnings": [],
+    "contents": [
+      {
+        "markdown": "# Audio: 00:00.000 => 00:32.183\n\nTranscript\n```\nWEBVTT\n\n00:00.080 --> 00:00.640\n<v Speaker 1>Good day.\n\n00:00.960 --> 00:02.240\n<v Speaker 1>Welcome to Contoso.\n\n00:02.560 --> 00:03.760\n<v Speaker 1>My name is John Doe.\n\n00:03.920 --> 00:05.120\n<v Speaker 1>How can I help you today?\n\n00:05.440 --> 00:06.320\n<v Speaker 1>Yes, good day.\n\n00:06.720 --> 00:08.160\n<v Speaker 1>My name is Maria Smith.\n\n00:08.560 --> 00:11.280\n<v Speaker 1>I would like to inquire about my current point balance.\n\n00:11.680 --> 00:12.560\n<v Speaker 1>No problem.\n\n00:12.880 --> 00:13.920\n<v Speaker 1>I am happy to help.\n\n00:14.240 --> 00:16.720\n<v Speaker 1>I need your date of birth to confirm your identity.\n\n00:17.120 --> 00:19.600\n<v Speaker 1>It is April 19th, 1988.\n\n00:20.000 --> 00:20.480\n<v Speaker 1>Great.\n\n00:20.800 --> 00:24.160\n<v Speaker 1>Your current point balance is 599 points.\n\n00:24.560 --> 00:26.160\n<v Speaker 1>Do you need any more information?\n\n00:26.480 --> 00:27.200\n<v Speaker 1>No, thank you.\n\n00:27.600 --> 00:28.320\n<v Speaker 1>That was all.\n\n00:28.720 --> 00:29.280\n<v Speaker 1>Goodbye.\n\n00:29.680 --> 00:30.320\n<v Speaker 1>You're welcome.\n\n00:30.640 --> 00:31.840\n<v Speaker 1>Goodbye at Contoso.\n```",
+        "fields": {
+          "Summary": {
+            "type": "string",
+            "valueString": "Maria Smith contacted Contoso to inquire about her current point balance. John Doe assisted her by confirming her identity using her date of birth and informed her that her balance was 599 points. Maria expressed no further inquiries, and the conversation concluded politely."
+          }
+        },
+        "kind": "audioVisual",
+        "startTimeMs": 0,
+        "endTimeMs": 32183,
+        "transcriptPhrases": [
+          {
+            "speaker": "Speaker 1",
+            "startTimeMs": 80,
+            "endTimeMs": 640,
+            "text": "Good day.",
+            "words": []
+          }, ...
+          {
+            "speaker": "Speaker 2",
+            "startTimeMs": 5440,
+            "endTimeMs": 6320,
+            "text": "Yes, good day.",
+            "words": []
+          }, ...
+        ]
+      }
+    ]
+  }
+}
+```
+
+You can also customize prebuilt analyzers for more fine-grained control of the output by defining custom fields. Customization allows you to use the full power of generative models to extract deep insights from the audio. For example, customization allows you to:
+
+* Generate other insights.
+* Control the language of the field extraction output.
+* Configure the transcription behavior.

 ## Input requirements
 For a detailed list of supported audio formats, refer to our [Service limits and codecs](../service-limits.md) page.
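Both sample payloads in this file share the same envelope: `result.contents[0]` carries the generated `fields` plus `transcriptPhrases` (present because the prebuilt analyzers enable `returnDetails`). Below is a minimal sketch of reading that shape. It assumes a complete response has been saved to `result.json` (the samples above elide some phrases with `...`, so they aren't valid JSON as printed), and the field names simply mirror the sample output rather than a documented contract.

```python
import json

# Load a saved analyzer response (hypothetical file name; see the note above).
with open("result.json", encoding="utf-8") as f:
    response = json.load(f)

content = response["result"]["contents"][0]
fields = content["fields"]

# Simple string fields carry their value in "valueString".
print("Summary:", fields["Summary"]["valueString"])
print("Sentiment:", fields.get("Sentiment", {}).get("valueString", "n/a"))

# Array-of-object fields (for example, People) nest per-item sub-fields in "valueObject".
for person in fields.get("People", {}).get("valueArray", []):
    props = person["valueObject"]
    print(f'- {props["Name"]["valueString"]} ({props["Role"]["valueString"]})')

# Per-phrase transcript details: speaker label, text, and millisecond offsets.
for phrase in content.get("transcriptPhrases", []):
    start_s = phrase["startTimeMs"] / 1000
    print(f'[{start_s:6.2f}s] {phrase["speaker"]}: {phrase["text"]}')
```

The same loop works for the `prebuilt-audioAnalyzer` sample; it just yields generic `Speaker 1`/`Speaker 2` labels and a single `Summary` field.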

articles/ai-services/openai/concepts/provisioned-migration.md

Lines changed: 1 addition & 25 deletions
@@ -234,31 +234,7 @@ An alternative approach to self-service migration is to switch the reservation p
 * There will be a short period of double-billing or hourly charges during the switchover from committed to hourly/reservation billing.

 > [!IMPORTANT]
-> Self-service approach generates additional charges as the payment mode is switched from Committed to Hourly/Reservation. This is the characteristics of this migration approaches and customers aren't credited for these charges. Alternately, Customers can choose to use the managed migration approach described below to avoid additional charges.
-
-### Managed migration
-
-The managed migration approach involves the customer partnering with Microsoft to bulk-migrate all the PTU commitments in a subscription/region at the same time. It works like this:
-
-1. The customer will engage their account team and request a managed migration. A migration owner from the Microsoft team will be assigned to assist the customer with migration.
-2. A date will be selected when all resources within each of the customers' subscriptions and regions containing current PTU commitments will be migrated from committed to hourly/reservation billing model. Multiple subscriptions and regions can be migrated on the same date.
-3. On the agreed-upon date:
-    * The customer will purchase regional reservations to cover the committed PTUs that will be converted and pass the reservation information to their Microsoft migration contact.
-    * Within 2-3 business days, all commitments will be proactively canceled and deployments previously under commitments will begin using the hourly/reservation payment model.
-    * In the billing period after the one with the reservation purchase, the customer will receive a credit for the reservation purchase covering the portions of the commitments that were canceled, starting from the time of the reservation purchase.
-
-Customers must reach out to their account teams to schedule a managed migration.
-
-**Managed migration advantages:**
-
-- Bulk migration of all commitments in a subscription/region is beneficial for customers with many commitments.
-- Seamless cost migration: No possibility of double-billing or extra hourly charges.
-
-**Managed migration disadvantages:**
-
-- All commitments in a subscription/region must be migrated at the same time.
-- Needing to coordinate a time for migration with the Microsoft team.
-
+> Self-service approach generates additional charges as the payment mode is switched from Committed to Hourly/Reservation. This is the characteristics of this migration approaches and customers aren't credited for these charges.

 ## Migrating existing deployments to global or data zone provisioned
 Existing customers of provisioned deployments can choose to migrate to global or data zone provisioned deployments to benefit from the lower deployment minimums, granular scale increments, or differentiated pricing available for these deployment types. To learn more about how global and data zone provisioned deployments handle data processing across Azure geographies, see the Azure OpenAI deployment [data processing documentation](https://aka.ms/aoai/docs/data-processing-locations).
