MicrosoftDocs
diff --git a/‎articles/communication-services/concepts/call-automation/media/transcription.png
154 KB b/‎articles/communication-services/concepts/call-automation/media/transcription.png
diff --git a/‎articles/communication-services/concepts/call-automation/real-time-transcription.md
Lines changed: 43 additions & 0 deletions b/‎articles/communication-services/concepts/call-automation/real-time-transcription.md
diff --git a/‎articles/communication-services/how-tos/call-automation/control-mid-call-media-actions.md
Lines changed: 132 additions & 1 deletion b/‎articles/communication-services/how-tos/call-automation/control-mid-call-media-actions.md
diff --git a/‎articles/communication-services/how-tos/call-automation/includes/play-audio-how-to-js.md
Lines changed: 46 additions & 1 deletion b/‎articles/communication-services/how-tos/call-automation/includes/play-audio-how-to-js.md
diff --git a/‎articles/communication-services/how-tos/call-automation/includes/play-audio-how-to-python.md
Lines changed: 78 additions & 1 deletion b/‎articles/communication-services/how-tos/call-automation/includes/play-audio-how-to-python.md
@@ -0,0 +1,43 @@
+---
+title: Generate real-time transcripts
+titleSuffix: An Azure Communication Services concept document
+description: Provides an overview of what real-time transcription is
+author: kunaal
+ms.service: azure-communication-services
+ms.subservice: call-automation
+ms.topic: include
+ms.date: 07/16/2024
+ms.author: kpunjabi
+services: azure-communication-services
+---
+
+# Generating real-time transcripts
+[!INCLUDE [Public Preview Disclaimer](../../includes/public-preview-include-document.md)]
+
+Real-time transcription helps businesses improve their customer service experience. Through the integration of Azure Communication Services and Azure AI Services, developers can now use real-time transcription through the Call Automation SDKs.
+
+With Azure Communication Services real-time transcription, you can integrate your Azure AI Services resource with Azure Communication Services to generate transcripts directly during the call. This eliminates the need to extract audio content and handle the overhead of converting audio into text yourself. You can store the transcript for later use, such as creating a history of the call, summarizing the call to save an agent's time, or feeding it into your training and learning modules to help improve your contact center agents' customer interactions.
+
+Out of the box, Microsoft uses a Universal Language Model as the base model. This model is trained with Microsoft-owned data and reflects commonly used spoken language. It is pretrained with dialects and phonetics representing various common domains. For more information about supported languages, see [Languages and voice support for the Speech service](../../../../articles/ai-services/speech-service/language-support.md).
+
+## Common use cases
+
+### Improved customer experience
+Help agents better understand customer needs and respond more quickly and accurately, leading to a better overall customer experience.
+
+### Increased efficiency
+Help agents focus on the conversation rather than note-taking, allowing them to handle more calls and improve productivity.
+
+### Context for agents
+Provide context to an agent before the agent picks up the call. The agent then already knows the information the caller has provided, so the caller doesn't need to repeat their issue.
+
+### Derive insights
+Using the transcript generated throughout the call, you can apply other AI tools to gain real-time insights that help agents and supervisors improve their interactions with customers.
+
+## Sample flow of real-time transcription using Call Automation
+![Diagram of real-time transcription flow.](./media/transcription.png)
+
+## Next Steps
+- Check out our how-to guide to learn [how to use real-time transcription](../../how-tos/call-automation/real-time-transcription-tutorial.md).
+- Learn about [usage and operational logs](../analytics/logs/call-automation-logs.md) published by Call Automation.
+
@@ -6,7 +6,7 @@ author: kunaal
 ms.topic: how-to
 ms.service: azure-communication-services
 ms.subservice: call-automation
-ms.date: 11/16/2023
+ms.date: 07/16/2024
 ms.author: kpunjabi
 ms.custom: public_preview
 services: azure-communication-services
@@ -356,3 +356,134 @@ if event.type == "Microsoft.Communication.ContinuousDtmfRecognitionStopped":
     app.logger.info("Tone stoped: context=%s", event.data["operationContext"])
 ```
 -----
+
+### Hold
+The hold action lets developers temporarily pause a conversation between a participant and a system or agent. This is useful when the participant needs to be transferred to another agent or department, or when the agent needs to consult a supervisor in the background before continuing the conversation. While the participant is on hold, you can choose to play audio to them.
+
+### [csharp](#tab/csharp)
+```csharp
+// Option 1: Hold without additional options
+await callAutomationClient.GetCallConnection(callConnectionId)
+    .GetCallMedia().HoldAsync(c2Target);
+
+/*
+// Option 2: Hold with play source
+PlaySource playSource = ...; // initialize playSource (a nested block comment here would end the enclosing one early)
+await callAutomationClient.GetCallConnection(callConnectionId)
+    .GetCallMedia().HoldAsync(c2Target, playSource);
+
+// Option 3: Hold with options
+var holdOptions = new HoldOptions(target) 
+{ 
+    OperationCallbackUri = new Uri("URL"), // replace with actual callback URL
+    OperationContext = "holdcontext"
+};
+await callMedia.HoldAsync(holdOptions);
+*/
+```
+
+### [java](#tab/java)
+```java
+// Option 1: Hold with options
+PlaySource playSource = /* initialize playSource */;
+HoldOptions holdOptions = new HoldOptions(target)
+    .setOperationCallbackUrl(appConfig.getBasecallbackuri())
+    .setPlaySource(playSource)
+    .setOperationContext("holdPstnParticipant");
+
+client.getCallConnection(callConnectionId).getCallMedia().holdWithResponse(holdOptions, Context.NONE);
+
+/*
+// Option 2: Hold without additional options
+client.getCallConnection(callConnectionId).getCallMedia().hold(target);
+*/
+```
+
+### [JavaScript](#tab/javascript)
+```javascript
+// Option 1: Hold with options
+const options = {
+    playSource: playSource,
+    operationContext: "holdUserContext",
+    operationCallbackUrl: "URL" // replace with actual callback URL
+};
+await callMedia.hold(targetuser, options);
+
+/*
+// Option 2: Hold without additional options
+await callMedia.hold(targetuser);
+*/
+```
+
+### [Python](#tab/python)
+```python
+# Option 1: Hold without additional options
+call_connection_client.hold(target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER))
+
+'''
+# Option 2: Hold with options
+call_connection_client.hold(
+    target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER),
+    play_source=play_source,
+    operation_context="holdUserContext",
+    operation_callback_url="URL" # replace with actual callback URL
+)
+'''
+```
+-----
+### Unhold
+The unhold action lets developers resume a conversation between a participant and a system or agent that was previously paused. When the participant is taken off hold, they can hear the system or agent again.
+
+### [csharp](#tab/csharp)
+``` csharp
+var unHoldOptions = new UnholdOptions(target) 
+{ 
+    OperationContext = "UnHoldPstnParticipant" 
+}; 
+
+// Option 1
+var UnHoldParticipant = await callMedia.UnholdAsync(unHoldOptions);
+
+/* 
+// Option 2
+var UnHoldParticipant = await callMedia.UnholdAsync(target);
+*/
+```
+
+### [java](#tab/java)
+``` java
+// Option 1
+client.getCallConnection(callConnectionId).getCallMedia().unholdWithResponse(target, "unholdPstnParticipant", Context.NONE);
+
+/* 
+// Option 2
+client.getCallConnection(callConnectionId).getCallMedia().unhold(target);
+*/
+```
+
+### [JavaScript](#tab/javascript)
+```javascript
+const unholdOptions = { 
+    operationContext: "unholdUserContext" 
+}; 
+
+// Option 1
+await callMedia.unhold(target);
+
+/* 
+// Option 2
+await callMedia.unhold(target, unholdOptions);
+*/
+```
+
+### [Python](#tab/python)
+```python
+# Option 1
+call_connection_client.unhold(target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER)) 
+
+'''
+# Option 2
+call_connection_client.unhold(target_participant=PhoneNumberIdentifier(TARGET_PHONE_NUMBER), operation_context="unholdUserContext")
+'''
+```
+-----
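
The hold and unhold actions above are usually paired around some background work, such as a supervisor consultation. A minimal sketch of one way to keep them paired follows. `FakeCallConnection` is a hypothetical stand-in that only records the calls it receives; it is not the SDK client, and only its `hold`/`unhold` keyword arguments mirror the Python snippets above.

```python
from contextlib import contextmanager


class FakeCallConnection:
    """Hypothetical stand-in for a call connection client; it records calls only."""

    def __init__(self):
        self.actions = []

    def hold(self, target_participant, play_source=None, operation_context=None):
        self.actions.append(("hold", target_participant, operation_context))

    def unhold(self, target_participant, operation_context=None):
        self.actions.append(("unhold", target_participant, operation_context))


@contextmanager
def on_hold(connection, participant, play_source=None):
    """Hold the participant for the duration of the block, then always unhold."""
    connection.hold(
        target_participant=participant,
        play_source=play_source,
        operation_context="holdUserContext",
    )
    try:
        yield
    finally:
        # Runs even if the block raises, so the participant is not left on hold.
        connection.unhold(
            target_participant=participant,
            operation_context="unholdUserContext",
        )


connection = FakeCallConnection()
with on_hold(connection, participant="caller-1"):
    # Placeholder for the background work done while the caller is on hold.
    connection.actions.append(("consult", "supervisor", None))

print(connection.actions)
```

Wrapping the pair in a context manager is a design choice for this sketch, not something the SDK requires; calling hold and unhold directly, as in the tabs above, works the same way.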
@@ -124,6 +124,45 @@ await callAutomationClient.getCallConnection(callConnectionId)
     .playToAll([ playSource ]);
 ```
 
+### Support for barge-in
+In scenarios where you play audio on loop to all participants, such as a waiting lobby, you might want to interrupt that audio to update participants on their position in the queue. When you use barge-in support, the new play request cancels the ongoing audio and plays your new message. To continue playing the original audio afterward, make another play request.
+
+```javascript
+// Option 1: Interrupt media with text source
+const playSource: TextSource = { text: "Interrupt prompt", voiceName: "en-US-NancyNeural", kind: "textSource" };
+
+const interruptOption: PlayToAllOptions = {
+    loop: false,
+    interruptCallMediaOperation: true,
+    operationContext: "interruptOperationContext",
+    operationCallbackUrl: process.env.CALLBACK_URI + "/api/callbacks"
+};
+
+await callConnectionMedia.playToAll([playSource], interruptOption);
+
+/*
+// Option 2: Interrupt media with file source
+const playSource: FileSource = {
+    url: MEDIA_URI + "MainMenu.wav",
+    kind: "fileSource"
+};
+
+const interruptOption: PlayToAllOptions = {
+    loop: false,
+    interruptCallMediaOperation: true,
+    operationContext: "interruptOperationContext",
+    operationCallbackUrl: process.env.CALLBACK_URI + "/api/callbacks"
+};
+
+await callConnectionMedia.playToAll([playSource], interruptOption);
+*/
+```
+
 ## Play audio - Specific participant
 
 In this scenario, audio is played to a specific participant.
@@ -155,7 +194,7 @@ If you're playing the same audio file multiple times, your application can provi
 const playSource: FileSource = { url: audioUri, playsourcacheid: "<playSourceId>", kind: "fileSource" }; 
 await callAutomationClient.getCallConnection(callConnectionId) 
 .getCallMedia() 
-.play([ playSource ], [ targetParticipant ]); 
+.play([ playSource ], [ targetParticipant ]);
 ```
 
 ## Handle play action event updates 
@@ -177,6 +216,12 @@ if (event.type === "Microsoft.Communication.PlayFailed") {
     console.log("Play failed: data=%s", JSON.stringify(eventData)); 
 } 
 ```
+### Example of how you can deserialize the *PlayStarted* event:
+```javascript
+if (event.type === "Microsoft.Communication.PlayStarted") { 
+    console.log("Play started: data=%s", JSON.stringify(eventData)); 
+} 
+```
 
 To learn more about other supported events, visit the [Call Automation overview document](../../../concepts/call-automation/call-automation.md#call-automation-webhook-events).
 
 
@@ -84,6 +84,20 @@ To play audio to participants using audio files, you need to make sure the audio
 
 ``` python
 play_source = FileSource(url=audioUri)
+
+# Play multiple audio files
+# file_source1 = FileSource(MAIN_MENU_PROMPT_URI)
+# file_source2 = FileSource(MAIN_MENU_PROMPT_URI)
+#
+# play_sources = [file_source1, file_source2]
+#
+# call_connection_client.play_media_to_all(
+#     play_source=play_sources,
+#     interrupt_call_media_operation=False,
+#     operation_context="multiplePlayContext",
+#     operation_callback_url=CALLBACK_EVENTS_URI,
+#     loop=False
+# )
 ```
 
 ### Play source - Text-To-Speech 
@@ -100,7 +114,21 @@ play_source = TextSource(
 play_to = [target_participant]
 call_automation_client.get_call_connection(call_connection_id).play_media(
     play_source=play_source, play_to=play_to
-) 
+)
+
+#Multiple text prompts
+#play_source1 = TextSource(text="Hi, This is multiple play source one call media test.", source_locale="en-US", voice_kind=VoiceKind.FEMALE) 
+#play_source2 = TextSource(text="Hi, This is multiple play source two call media test.", source_locale="en-US", voice_kind=VoiceKind.FEMALE)
+#
+#play_sources = [play_source1, play_source2]
+#
+#call_connection_client.play_media_to_all(
+#    play_source=play_sources,
+#    interrupt_call_media_operation=False,
+#    operation_context="multiplePlayContext",
+#    operation_callback_url=CALLBACK_EVENTS_URI,
+#    loop=False
+#)
 ```
 
 ``` python
@@ -112,6 +140,20 @@ play_to = [target_participant]
 call_automation_client.get_call_connection(call_connection_id).play_media(
     play_source=play_source, play_to=play_to
 )
+
+#Play multiple text prompts
+#play_source1 = TextSource(text="Hi, This is multiple play source one call media test.", voice_name=SPEECH_TO_TEXT_VOICE) 
+#play_source2 = TextSource(text="Hi, This is multiple play source two call media test.", voice_name=SPEECH_TO_TEXT_VOICE)
+#
+#play_sources = [play_source1, play_source2]
+#
+#call_connection_client.play_media_to_all(
+#    play_source=play_sources,
+#    interrupt_call_media_operation=False,
+#    operation_context="multiplePlayContext",
+#    operation_callback_url=CALLBACK_EVENTS_URI,
+#    loop=False
+#)
 ```
 
 ### Play source - Text-To-Speech with SSML 
@@ -173,6 +215,33 @@ call_automation_client.get_call_connection(call_connection_id).play_media(
 )
 ```
 
+### Support for barge-in
+In scenarios where you play audio on loop to all participants, such as a waiting lobby, you might want to interrupt that audio to update participants on their position in the queue. When you use barge-in support, the new play request cancels the ongoing audio and plays your new message. To continue playing the original audio afterward, make another play request.
+
+```python
+# Interrupt media with text source
+# Option 1
+play_source = TextSource(text="This is interrupt call media test.", voice_name=SPEECH_TO_TEXT_VOICE)
+call_connection_client.play_media_to_all(
+    play_source, 
+    interrupt_call_media_operation=True, 
+    operation_context="interruptContext", 
+    operation_callback_url=CALLBACK_EVENTS_URI, 
+    loop=False
+)
+
+# Interrupt media with file source
+# Option 2
+#play_source = FileSource(MAIN_MENU_PROMPT_URI)
+#call_connection_client.play_media_to_all(
+#    play_source, 
+#    interrupt_call_media_operation=True, 
+#    operation_context="interruptContext", 
+#    operation_callback_url=CALLBACK_EVENTS_URI, 
+#    loop=False
+#)
+```
+
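The lobby scenario described above (loop, barge in, then request the loop again) can be sketched as follows. `RecordingConnection` is a hypothetical stub that only records play requests, and plain strings stand in for `FileSource`/`TextSource` objects; only the keyword names are taken from the snippet above. Whether the second request should wait for the announcement to finish is an assumption noted in the comments.

```python
class RecordingConnection:
    """Hypothetical stub: records play requests instead of sending them."""

    def __init__(self):
        self.requests = []

    def play_media_to_all(self, play_source, loop=False,
                          interrupt_call_media_operation=False,
                          operation_context=None):
        self.requests.append({
            "source": play_source,
            "loop": loop,
            "interrupt": interrupt_call_media_operation,
            "context": operation_context,
        })


def announce_queue_position(connection, hold_music, position):
    """Barge in on the looping lobby audio, then request the loop again."""
    # 1. The interrupting request cancels whatever is currently playing.
    connection.play_media_to_all(
        f"You are number {position} in the queue.",
        loop=False,
        interrupt_call_media_operation=True,
        operation_context="interruptContext",
    )
    # 2. The original audio does not resume by itself, so request it again.
    #    Assumption: a real application would likely issue this request
    #    after the announcement's PlayCompleted event arrives.
    connection.play_media_to_all(
        hold_music,
        loop=True,
        operation_context="lobbyLoopContext",
    )


connection = RecordingConnection()
connection.play_media_to_all("hold-music.wav", loop=True,
                             operation_context="lobbyLoopContext")
announce_queue_position(connection, "hold-music.wav", position=3)

for request in connection.requests:
    print(request)
```
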
 ## Play audio - Specific participant
 
 Play a prerecorded audio file to a specific participant in the call.
@@ -227,6 +296,14 @@ if event.type == "Microsoft.Communication.PlayCompleted":
     app.logger.info("Play completed, context=%s", event.data.get("operationContext"))
 ```
 
+### Example of how you can deserialize the *PlayStarted* event:
+
+```python 
+if event.type == "Microsoft.Communication.PlayStarted":
+    app.logger.info("Play started, context=%s", event.data.get("operationContext"))
+```
+
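When an application handles several play events, the per-event `if` checks can be collected into one lookup table. The sketch below is illustrative only: it assumes events arrive as plain dictionaries with `type` and `data` keys (the snippets in this document use an event object with `.type` and `.data` attributes), and the handler names are hypothetical.

```python
def on_play_started(data):
    return f"Play started, context={data.get('operationContext')}"


def on_play_completed(data):
    return f"Play completed, context={data.get('operationContext')}"


def on_play_failed(data):
    return f"Play failed, context={data.get('operationContext')}"


# Map each event type string to the function that handles it.
HANDLERS = {
    "Microsoft.Communication.PlayStarted": on_play_started,
    "Microsoft.Communication.PlayCompleted": on_play_completed,
    "Microsoft.Communication.PlayFailed": on_play_failed,
}


def handle_event(event):
    """Route one event to its handler; return None for unhandled types."""
    handler = HANDLERS.get(event["type"])
    if handler is None:
        return None
    return handler(event.get("data", {}))


sample = {
    "type": "Microsoft.Communication.PlayStarted",
    "data": {"operationContext": "interruptContext"},
}
print(handle_event(sample))
```
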
 ### Example of how you can deserialize the *PlayFailed* event:
 
 ```python