Commit d43bd03

Merge pull request #281459 from valindrae/public-preview-july24
Public preview july24
2 parents fb65574 + 6812735 commit d43bd03

17 files changed

+1328
-10
lines changed
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
1+
---
2+
title: Generate real-time transcripts
3+
titleSuffix: An Azure Communication Services concept document
4+
description: Provides an overview of what real-time transcription is
5+
author: kunaal
6+
ms.service: azure-communication-services
7+
ms.subservice: call-automation
8+
ms.topic: include
9+
ms.date: 07/16/2024
10+
ms.author: kpunjabi
11+
services: azure-communication-services
12+
---
13+
14+
# Generating real-time transcripts
15+
[!INCLUDE [Public Preview Disclaimer](../../includes/public-preview-include-document.md)]
16+
17+
Real-time transcription helps businesses improve the customer service experience. Through the integration of Azure Communication Services and Azure AI Services, developers can now use real-time transcription through the Call Automation SDKs.
18+
19+
With Azure Communication Services real-time transcription, you can integrate your Azure AI Services resource with Azure Communication Services to generate transcripts directly during the call. This eliminates the need to extract audio content and convert it into text on your side. You can store the transcript for later use, such as creating a history of the call, summarizing the call to save an agent's time, or feeding it into your training and learning modules to help improve your contact center agents' customer interactions.
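As a rough illustration of the storage step, the following Python sketch collects transcript segments into a per-call history. The `TranscriptSegment` shape and the `CallTranscript` helper are hypothetical names chosen for this example; they aren't the payload format that Azure Communication Services sends.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptSegment:
    """Hypothetical shape for one finalized piece of recognized speech."""
    speaker: str
    text: str
    offset_ms: int  # time since the start of the call, in milliseconds


@dataclass
class CallTranscript:
    """Hypothetical in-memory store for the transcript of a single call."""
    call_id: str
    segments: list = field(default_factory=list)

    def add(self, segment: TranscriptSegment) -> None:
        self.segments.append(segment)

    def as_history(self) -> str:
        # Sort by offset so segments that arrive late still read chronologically.
        ordered = sorted(self.segments, key=lambda s: s.offset_ms)
        return "\n".join(
            f"[{s.offset_ms / 1000:.1f}s] {s.speaker}: {s.text}" for s in ordered
        )


transcript = CallTranscript(call_id="call-1")
transcript.add(TranscriptSegment("caller", "My order hasn't arrived.", 4200))
transcript.add(TranscriptSegment("agent", "Thanks for calling, how can I help?", 1500))
history = transcript.as_history()
print(history)
```

The resulting text could then be saved with the call record or passed to a summarization step, depending on what your application needs.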
20+
21+
Out of the box, Microsoft uses a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. This model is pretrained with dialects and phonetics representing various common domains. For more information about supported languages, see [Languages and voice support for the Speech service](../../../../articles/ai-services/speech-service/language-support.md).
22+
23+
## Common use cases
24+
25+
### Improved customer experience
26+
Help agents better understand customer needs and respond more quickly and accurately, leading to a better overall customer experience.
27+
28+
### Increased efficiency
29+
Help agents focus on the conversation rather than note-taking, allowing them to handle more calls and improve productivity.
30+
31+
### Context for agents
32+
Provide context to an agent before the agent picks up the call. This way, the agent knows the information that the caller has already provided, and the caller doesn't need to repeat their issue.
33+
34+
### Derive insights
35+
Using the transcript generated throughout the call, you can use other AI tools to gain real-time insights that help agents and supervisors improve their interactions with customers.
36+
37+
## Sample flow of real-time transcription using Call Automation
38+
![Diagram of real-time transcription flow.](./media/transcription.png)
39+
40+
## Next steps
41+
- Check out our how-to guide to learn [how to use real-time transcription](../../how-tos/call-automation/real-time-transcription-tutorial.md).
42+
- Learn about [usage and operational logs](../analytics/logs/call-automation-logs.md) published by call automation.
43+

articles/communication-services/how-tos/call-automation/control-mid-call-media-actions.md

Lines changed: 132 additions & 1 deletion
@@ -6,7 +6,7 @@ author: kunaal
66
ms.topic: how-to
77
ms.service: azure-communication-services
88
ms.subservice: call-automation
9-
ms.date: 11/16/2023
9+
ms.date: 07/16/2024
1010
ms.author: kpunjabi
1111
ms.custom: public_preview
1212
services: azure-communication-services
@@ -356,3 +356,134 @@ if event.type == "Microsoft.Communication.ContinuousDtmfRecognitionStopped":
356356
    app.logger.info("Tone stopped: context=%s", event.data["operationContext"])
357357
```
358358
-----
359+
360+
### Hold
361+
The hold action allows developers to temporarily pause a conversation between a participant and a system or agent. This can be useful when the participant needs to be transferred to another agent or department, or when the agent needs to consult a supervisor in the background before continuing the conversation. During this time, you can choose to play audio to the participant who is on hold.
362+
363+
### [csharp](#tab/csharp)
364+
```csharp
365+
// Option 1: Hold without additional options
366+
await callAutomationClient.GetCallConnection(callConnectionId)
367+
.GetCallMedia().HoldAsync(c2Target);
368+
369+
/*
370+
// Option 2: Hold with play source
371+
PlaySource playSource = new FileSource(new Uri(audioUri)); // audioUri is a placeholder for your audio file URL
372+
await callAutomationClient.GetCallConnection(callConnectionId)
373+
.GetCallMedia().HoldAsync(c2Target, playSource);
374+
375+
// Option 3: Hold with options
376+
var holdOptions = new HoldOptions(target)
377+
{
378+
OperationCallbackUri = new Uri(callbackUri), // callbackUri is a placeholder for your callback endpoint
379+
OperationContext = "holdcontext"
380+
};
381+
await callMedia.HoldAsync(holdOptions);
382+
*/
383+
```
384+
385+
### [java](#tab/java)
386+
```java
387+
// Option 1: Hold with options
388+
PlaySource playSource = new FileSource().setUrl(audioUri); // audioUri is a placeholder for your audio file URL
389+
HoldOptions holdOptions = new HoldOptions(target)
390+
.setOperationCallbackUrl(appConfig.getBasecallbackuri())
391+
.setPlaySource(playSource)
392+
.setOperationContext("holdPstnParticipant");
393+
394+
client.getCallConnection(callConnectionId).getCallMedia().holdWithResponse(holdOptions, Context.NONE);
395+
396+
/*
397+
// Option 2: Hold without additional options
398+
client.getCallConnection(callConnectionId).getCallMedia().hold(target);
399+
*/
400+
```
401+
402+
### [JavaScript](#tab/javascript)
403+
```javascript
404+
// Option 1: Hold with options
405+
const options = {
406+
playSource: playSource,
407+
operationContext: "holdUserContext",
408+
operationCallbackUrl: "URL" // replace with actual callback URL
409+
};
410+
await callMedia.hold(targetuser, options);
411+
412+
/*
413+
// Option 2: Hold without additional options
414+
await callMedia.hold(targetuser);
415+
*/
416+
```
417+
418+
### [Python](#tab/python)
419+
```python
420+
# Option 1: Hold without additional options
421+
call_connection_client.hold(target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER))
422+
423+
'''
424+
# Option 2: Hold with options
425+
call_connection_client.hold(
426+
target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER),
427+
play_source=play_source,
428+
operation_context="holdUserContext",
429+
operation_callback_url="URL" # replace with actual callback URL
430+
)
431+
'''
432+
```
433+
-----
434+
### Unhold
435+
The unhold action allows developers to resume a conversation between a participant and a system or agent that was previously paused. When the participant is taken off hold, they can hear the system or agent again.
436+
437+
### [csharp](#tab/csharp)
438+
``` csharp
439+
var unHoldOptions = new UnholdOptions(target)
440+
{
441+
OperationContext = "UnHoldPstnParticipant"
442+
};
443+
444+
// Option 1
445+
var UnHoldParticipant = await callMedia.UnholdAsync(unHoldOptions);
446+
447+
/*
448+
// Option 2
449+
var UnHoldParticipant = await callMedia.UnholdAsync(target);
450+
*/
451+
```
452+
453+
### [java](#tab/java)
454+
``` java
455+
// Option 1
456+
client.getCallConnection(callConnectionId).getCallMedia().unholdWithResponse(target, "unholdPstnParticipant", Context.NONE);
457+
458+
/*
459+
// Option 2
460+
client.getCallConnection(callConnectionId).getCallMedia().unhold(target);
461+
*/
462+
```
463+
464+
### [JavaScript](#tab/javascript)
465+
```javascript
466+
const unholdOptions = {
467+
operationContext: "unholdUserContext"
468+
};
469+
470+
// Option 1
471+
await callMedia.unhold(target);
472+
473+
/*
474+
// Option 2
475+
await callMedia.unhold(target, unholdOptions);
476+
*/
477+
```
478+
479+
### [Python](#tab/python)
480+
```python
481+
# Option 1
482+
call_connection_client.unhold(target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER))
483+
484+
'''
485+
# Option 2
486+
call_connection_client.unhold(target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER), operation_context="unholdUserContext")
487+
'''
488+
```
489+
-----
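
An application often needs to remember which participants it has placed on hold, for example to avoid sending an unhold request for someone who isn't on hold. The following is a minimal Python sketch of that bookkeeping. The `HoldTracker` class and the participant ID string are hypothetical and aren't part of the Call Automation SDK; the places where you would call the SDK's hold and unhold methods are marked in comments.

```python
class HoldTracker:
    """Hypothetical helper that records which participants are on hold."""

    def __init__(self):
        self._on_hold = set()

    def hold(self, participant_id: str) -> bool:
        """Return True if a hold request should be sent for this participant."""
        if participant_id in self._on_hold:
            return False  # already on hold, nothing to send
        # The application would call the SDK's hold method here.
        self._on_hold.add(participant_id)
        return True

    def unhold(self, participant_id: str) -> bool:
        """Return True if an unhold request should be sent for this participant."""
        if participant_id not in self._on_hold:
            return False  # not on hold, nothing to send
        # The application would call the SDK's unhold method here.
        self._on_hold.remove(participant_id)
        return True

    def is_on_hold(self, participant_id: str) -> bool:
        return participant_id in self._on_hold


tracker = HoldTracker()
first_hold = tracker.hold("participant-1")     # new hold request
second_hold = tracker.hold("participant-1")    # duplicate, skipped
resumed = tracker.unhold("participant-1")      # participant taken off hold
print(first_hold, second_hold, resumed, tracker.is_on_hold("participant-1"))
```

Keeping this state in the application is a design choice for the sketch; a production service would likely store it alongside the call's other state.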

articles/communication-services/how-tos/call-automation/includes/play-audio-how-to-js.md

Lines changed: 46 additions & 1 deletion
@@ -124,6 +124,45 @@ await callAutomationClient.getCallConnection(callConnectionId)
124124
.playToAll([ playSource ]);
125125
```
126126

127+
### Support for barge-in
128+
When you play audio on a loop to all participants, for example in a waiting lobby, you might also want to keep the participants updated on their position in the queue. Barge-in support cancels the ongoing audio and plays your new message. To continue playing the original audio afterward, make another play request.
129+
130+
```javascript
131+
// Interrupt media with text source
132+
//Option1:
133+
134+
const playSource: TextSource = { text: "Interrupt prompt", voiceName: "en-US-NancyNeural", kind: "textSource" };
135+
136+
const interruptOption: PlayToAllOptions = {
137+
loop: false,
138+
interruptCallMediaOperation: true,
139+
operationContext: "interruptOperationContext",
140+
operationCallbackUrl: process.env.CALLBACK_URI + "/api/callbacks"
141+
};
142+
143+
await callConnectionMedia.playToAll([playSource], interruptOption);
144+
145+
/*
146+
// Interrupt media with file source
147+
148+
Option2:
149+
150+
const playSource: FileSource = {
151+
url: MEDIA_URI + "MainMenu.wav",
152+
kind: "fileSource"
153+
};
154+
155+
const interruptOption: PlayToAllOptions = {
156+
loop: false,
157+
interruptCallMediaOperation: true,
158+
operationContext: "interruptOperationContext",
159+
operationCallbackUrl: process.env.CALLBACK_URI + "/api/callbacks"
160+
};
161+
162+
await callConnectionMedia.playToAll([playSource], interruptOption);
163+
*/
164+
```
165+
127166
## Play audio - Specific participant
128167

129168
In this scenario, audio is played to a specific participant.
@@ -155,7 +194,7 @@ If you're playing the same audio file multiple times, your application can provi
155194
const playSource: FileSource = { url: audioUri, playsourcacheid: "<playSourceId>", kind: "fileSource" };
156195
await callAutomationClient.getCallConnection(callConnectionId)
157196
.getCallMedia()
158-
.play([ playSource ], [ targetParticipant ]);
197+
.play([ playSource ], [ targetParticipant ]);
159198
```
160199

161200
## Handle play action event updates
@@ -177,6 +216,12 @@ if (event.type === "Microsoft.Communication.PlayFailed") {
177216
console.log("Play failed: data=%s", JSON.stringify(eventData));
178217
}
179218
```
219+
### Example of how you can deserialize the *PlayStarted* event:
220+
```javascript
221+
if (event.type === "Microsoft.Communication.PlayStarted") {
222+
console.log("Play started: data=%s", JSON.stringify(eventData));
223+
}
224+
```
180225

181226
To learn more about other supported events, visit the [Call Automation overview document](../../../concepts/call-automation/call-automation.md#call-automation-webhook-events).
182227

articles/communication-services/how-tos/call-automation/includes/play-audio-how-to-python.md

Lines changed: 78 additions & 1 deletion
@@ -84,6 +84,20 @@ To play audio to participants using audio files, you need to make sure the audio
8484

8585
``` python
8686
play_source = FileSource(url=audioUri)
87+
88+
#Play multiple audio files
89+
#file_source1 = FileSource(MAIN_MENU_PROMPT_URI)
90+
#file_source2 = FileSource(MAIN_MENU_PROMPT_URI)
91+
#
92+
# play_sources = [file_source1, file_source2]
93+
#
94+
# call_connection_client.play_media_to_all(
95+
# play_source=play_sources,
96+
# interrupt_call_media_operation=False,
97+
# operation_context="multiplePlayContext",
98+
# operation_callback_url=CALLBACK_EVENTS_URI,
99+
# loop=False
100+
# )
87101
```
88102

89103
### Play source - Text-To-Speech
@@ -100,7 +114,21 @@ play_source = TextSource(
100114
play_to = [target_participant]
101115
call_automation_client.get_call_connection(call_connection_id).play_media(
102116
play_source=play_source, play_to=play_to
103-
)
117+
)
118+
119+
#Multiple text prompts
120+
#play_source1 = TextSource(text="Hi, This is multiple play source one call media test.", source_locale="en-US", voice_kind=VoiceKind.FEMALE)
121+
#play_source2 = TextSource(text="Hi, This is multiple play source two call media test.", source_locale="en-US", voice_kind=VoiceKind.FEMALE)
122+
#
123+
#play_sources = [play_source1, play_source2]
124+
#
125+
#call_connection_client.play_media_to_all(
126+
# play_source=play_sources,
127+
# interrupt_call_media_operation=False,
128+
# operation_context="multiplePlayContext",
129+
# operation_callback_url=CALLBACK_EVENTS_URI,
130+
# loop=False
131+
#)
104132
```
105133

106134
``` python
@@ -112,6 +140,20 @@ play_to = [target_participant]
112140
call_automation_client.get_call_connection(call_connection_id).play_media(
113141
play_source=play_source, play_to=play_to
114142
)
143+
144+
#Play multiple text prompts
145+
#play_source1 = TextSource(text="Hi, This is multiple play source one call media test.", voice_name=SPEECH_TO_TEXT_VOICE)
146+
#play_source2 = TextSource(text="Hi, This is multiple play source two call media test.", voice_name=SPEECH_TO_TEXT_VOICE)
147+
#
148+
#play_sources = [play_source1, play_source2]
149+
#
150+
#call_connection_client.play_media_to_all(
151+
# play_source=play_sources,
152+
# interrupt_call_media_operation=False,
153+
# operation_context="multiplePlayContext",
154+
# operation_callback_url=CALLBACK_EVENTS_URI,
155+
# loop=False
156+
#)
115157
```
116158

117159
### Play source - Text-To-Speech with SSML
@@ -173,6 +215,33 @@ call_automation_client.get_call_connection(call_connection_id).play_media(
173215
)
174216
```
175217

218+
### Support for barge-in
219+
When you play audio on a loop to all participants, for example in a waiting lobby, you might also want to keep the participants updated on their position in the queue. Barge-in support cancels the ongoing audio and plays your new message. To continue playing the original audio afterward, make another play request.
220+
221+
```python
222+
# Interrupt media with text source
223+
# Option 1
224+
play_source = TextSource(text="This is interrupt call media test.", voice_name=SPEECH_TO_TEXT_VOICE)
225+
call_connection_client.play_media_to_all(
226+
play_source,
227+
interrupt_call_media_operation=True,
228+
operation_context="interruptContext",
229+
operation_callback_url=CALLBACK_EVENTS_URI,
230+
loop=False
231+
)
232+
233+
# Interrupt media with file source
234+
# Option 2
235+
#play_source = FileSource(MAIN_MENU_PROMPT_URI)
236+
#call_connection_client.play_media_to_all(
237+
# play_source,
238+
# interrupt_call_media_operation=True,
239+
# operation_context="interruptContext",
240+
# operation_callback_url=CALLBACK_EVENTS_URI,
241+
# loop=False
242+
#)
243+
```
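
The sequence described above (looping audio, an interrupting announcement, then the looping audio again) can be sketched as follows. `FakeCallMedia` is a hypothetical stand-in that only records requests so that the example runs without a live call. Its `play_media_to_all` keyword arguments mirror the ones used in the preceding sample, and the plain strings are placeholders for real `FileSource` and `TextSource` objects.

```python
class FakeCallMedia:
    """Hypothetical stand-in for a call connection client; it only records requests."""

    def __init__(self):
        self.requests = []

    def play_media_to_all(self, play_source, *, loop=False,
                          interrupt_call_media_operation=False,
                          operation_context=None):
        self.requests.append({
            "source": play_source,
            "loop": loop,
            "interrupt": interrupt_call_media_operation,
            "context": operation_context,
        })


def start_lobby_audio(call_media, lobby_audio):
    # Looping audio for everyone waiting in the lobby.
    call_media.play_media_to_all(lobby_audio, loop=True,
                                 operation_context="lobbyContext")


def announce_queue_position(call_media, position):
    # Barge in: cancel whatever is playing and play the announcement once.
    call_media.play_media_to_all(
        f"You are number {position} in the queue.",
        loop=False,
        interrupt_call_media_operation=True,
        operation_context="interruptContext",
    )


media = FakeCallMedia()
start_lobby_audio(media, "lobby-music")   # placeholder for a file source
announce_queue_position(media, 3)         # interrupts the looping audio
start_lobby_audio(media, "lobby-music")   # new play request to continue the loop
print([request["context"] for request in media.requests])
```

In a real application, you would probably wait for the announcement's *PlayCompleted* event before sending the final play request. That ordering is an assumption of this sketch and isn't stated in the sample above.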
244+
176245
## Play audio - Specific participant
177246

178247
Play a prerecorded audio file to a specific participant in the call.
@@ -227,6 +296,14 @@ if event.type == "Microsoft.Communication.PlayCompleted":
227296
    app.logger.info("Play completed, context=%s", event.data.get("operationContext"))
228297
```
229298

299+
### Example of how you can deserialize the *PlayStarted* event:
300+
301+
```python
302+
if event.type == "Microsoft.Communication.PlayStarted":
303+
304+
    app.logger.info("Play started, context=%s", event.data.get("operationContext"))
305+
```
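
If you handle several play events in one place, a single lookup can replace the separate `if` checks. The following is a minimal sketch under that assumption. The `describe_play_event` function is a hypothetical helper, and the event is modeled as a plain dictionary rather than the event object that the samples in this article receive.

```python
# Event type names as they appear in the samples in this article.
PLAY_EVENT_LABELS = {
    "Microsoft.Communication.PlayStarted": "Play started",
    "Microsoft.Communication.PlayCompleted": "Play completed",
    "Microsoft.Communication.PlayFailed": "Play failed",
}


def describe_play_event(event: dict) -> str:
    """Return a log message for a play-related event (hypothetical helper)."""
    event_type = event.get("type", "")
    label = PLAY_EVENT_LABELS.get(event_type)
    if label is None:
        return f"Ignored event: {event_type}"
    context = event.get("data", {}).get("operationContext")
    return f"{label}, context={context}"


sample_event = {
    "type": "Microsoft.Communication.PlayStarted",
    "data": {"operationContext": "interruptContext"},
}
message = describe_play_event(sample_event)
print(message)
```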
306+
230307
### Example of how you can deserialize the *PlayFailed* event:
231308

232309
```python
