Commit f2ba566

Merge pull request #283904 from alvin-l-han/main
Adding public docs for Audio Streaming Public Preview Release
2 parents e9409a8 + 795646d commit f2ba566

14 files changed

+659
-367
lines changed

articles/communication-services/.openpublishing.redirection.communication-services.json

Lines changed: 10 additions & 0 deletions
@@ -440,6 +440,16 @@
         "source_path_from_root": "/articles/communication-services/quickstarts/media-composition/get-started-media-composition.md",
         "redirect_url": "/azure/communication-services/quickstarts/voice-video-calling/getting-started-with-calling",
         "redirect_document_id": false
+    },
+    {
+        "source_path_from_root": "/articles/communication-services/quickstarts/voice-video-calling/media-streaming.md",
+        "redirect_url": "/azure/communication-services/how-tos/call-automation/audio-streaming-quickstart",
+        "redirect_document_id": false
+    },
+    {
+        "source_path_from_root": "/articles/communication-services/concepts/voice-video-calling/media-streaming.md",
+        "redirect_url": "/azure/communication-services/concepts/call-automation/audio-streaming-concept",
+        "redirect_document_id": false
     }
   ]
 }
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
---
2+
title: Audio streaming overview
3+
titleSuffix: An Azure Communication Services concept document
4+
description: Conceptual information about using Audio Streaming APIs with Call Automation.
5+
author: Alvin
6+
ms.service: azure-communication-services
7+
ms.topic: overview
8+
ms.date: 07/17/2024
9+
ms.author: alvinhan
10+
ms.custom: public_prview
11+
---
12+
13+
# Audio streaming overview - audio subscription
14+
15+
[!INCLUDE [Public Preview Disclaimer](../../includes/public-preview-include-document.md)]
16+
17+
Azure Communication Services provides developers with audio streaming capabilities that give real-time access to audio streams, so they can capture, analyze, and process audio content during active calls. Live audio and video are now common in online meetings, online conferences, and customer support. With audio streaming access, developers can build server applications that capture and analyze the audio stream of each participant on a call in real time. Developers can also combine audio streaming with other Call Automation actions or use their own AI models to analyze audio streams. Use cases include natural language processing (NLP) for conversation analysis and real-time insights and suggestions for agents while they interact with end users.
18+
19+
This public preview lets developers access real-time audio streams over a WebSocket to analyze a call's audio in mixed and unmixed formats.
20+
21+
## Common use cases
22+
Audio streams can be used in many ways. Some examples of how developers may wish to use the audio streams in their applications include:
23+
24+
### Real-time call assistance
25+
26+
**Improved AI-powered suggestions** - Use real-time audio streams of active interactions between agents and customers to gauge the intent of the call. Your own AI model can analyze the call and actively suggest how agents can give customers a better experience.
27+
28+
### Authentication
29+
30+
**Biometric authentication** - Use the audio streams to carry out voice authentication by running the call audio through your voice recognition or voice matching engine.
31+
32+
## Sample architecture for subscribing to audio streams from an ongoing call - live agent scenario
33+
34+
[![Screenshot of architecture diagram for audio streaming.](./media/audio-streaming-diagram.png)](./media/audio-streaming-diagram.png#lightbox)
35+
36+
## Supported formats
37+
38+
### Mixed format
39+
Contains mixed audio of all participants on the call. All audio is flattened into one stream.
40+
41+
### Unmixed
42+
Contains audio per participant per channel, with support for up to four channels for the four most dominant speakers at any point in a call. Each audio packet also includes a `participantRawID` that you can use to determine the speaker.
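
One way to consume the unmixed format is to keep a separate buffer per speaker. The following is a minimal sketch, not an official sample: it assumes each message is JSON shaped like the `AudioData` object in the audio streaming schema (`kind`, `audioData.data`, `audioData.participantRawID`), and the function name `route_unmixed_frame` is hypothetical.

```python
import base64
import json
from collections import defaultdict


def route_unmixed_frame(message_text, buffers):
    """Append one AudioData frame to the buffer of the participant it belongs to.

    Assumes the message shape from the audio streaming schema:
    {"kind": "AudioData", "audioData": {"data": <base64>, "participantRawID": <str>}}
    Returns the participant ID, or None when the packet carries no audio.
    """
    message = json.loads(message_text)
    if message.get("kind") != "AudioData":
        return None  # for example, the initial AudioMetadata packet
    audio = message["audioData"]
    participant = audio["participantRawID"]
    buffers[participant].extend(base64.b64decode(audio["data"]))
    return participant


# One growing PCM buffer per participant (hypothetical storage choice).
participant_buffers = defaultdict(bytearray)
```

Call `route_unmixed_frame(text, participant_buffers)` for every text message your WebSocket server receives; each buffer then holds the raw PCM for one speaker.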
43+
44+
## Additional information
45+
The following list describes information that helps developers convert the audio packets into audible content that their applications can use.
46+
- Framerate: 50 frames per second
47+
- Packet stream rate: one packet every 20 ms
48+
- Data packet: 640 bytes of PCM audio (16,000 samples per second × 2 bytes per sample × 20 ms)
49+
- Audio format: 16-bit PCM mono at 16,000 Hz
50+
- The `data` string is base64-encoded. Convert it into a byte array to create a raw PCM file.
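
At 16,000 Hz, 16-bit mono, a 20 ms frame holds 16,000 × 2 × 0.020 = 640 bytes of PCM. The following is a minimal sketch of turning base64 packets into a playable WAV file with Python's standard `wave` module. The constant and function names are hypothetical, and the sketch assumes the packets arrive in order and all use the format listed above.

```python
import base64
import io
import wave

SAMPLE_RATE_HZ = 16000   # from the list above
SAMPLE_WIDTH_BYTES = 2   # 16-bit PCM
CHANNELS = 1             # mono
FRAME_MS = 20            # one packet every 20 ms


def pcm_bytes_per_frame():
    """Bytes of raw PCM carried by one 20 ms frame."""
    return SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS * FRAME_MS // 1000


def packets_to_wav(base64_packets, target):
    """Decode base64 audio packets and write them as a WAV file.

    `target` may be a file path or a writable, seekable binary file object.
    """
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        for packet in base64_packets:
            wav_file.writeframes(base64.b64decode(packet))
```

Writing a WAV header around the PCM data is a convenience for playback tools; the raw bytes themselves are unchanged.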
51+
52+
## Next steps
53+
Check out the [audio streaming quickstart](../../how-tos/call-automation/audio-streaming-quickstart.md) to learn more.

articles/communication-services/concepts/voice-video-calling/media-streaming.md

Lines changed: 0 additions & 57 deletions
This file was deleted.
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
---
2+
title: Audio streaming quickstart
3+
titleSuffix: An Azure Communication Services quickstart document
4+
description: Provides a quick start for developers to get audio streams through audio streaming APIs from Azure Communication Services calls.
5+
author: alvin
6+
ms.service: azure-communication-services
7+
ms.topic: include
8+
ms.date: 7/15/2024
9+
ms.author: alvinhan
10+
ms.custom:
11+
services: azure-communication-services
12+
zone_pivot_groups: acs-js-csharp-java-python
13+
---
14+
15+
# Quickstart: Server-side Audio Streaming
16+
17+
[!INCLUDE [Public Preview Disclaimer](../../includes/public-preview-include-document.md)]
18+
19+
Get started with audio streams through the Azure Communication Services Audio Streaming API. This quickstart assumes you're already familiar with Call Automation APIs to build an automated call routing solution.
20+
21+
Functionality described in this quickstart is currently in public preview.
22+
23+
::: zone pivot="programming-language-csharp"
24+
[!INCLUDE [Audio Streaming with .NET](./includes/audio-streaming-quickstart-csharp.md)]
25+
::: zone-end
26+
27+
::: zone pivot="programming-language-java"
28+
[!INCLUDE [Audio Streaming with Java](./includes/audio-streaming-quickstart-java.md)]
29+
::: zone-end
30+
31+
::: zone pivot="programming-language-javascript"
32+
[!INCLUDE [Audio Streaming with JavaScript](./includes/audio-streaming-quickstart-js.md)]
33+
::: zone-end
34+
35+
::: zone pivot="programming-language-python"
36+
[!INCLUDE [Audio Streaming with Python](./includes/audio-streaming-quickstart-python.md)]
37+
::: zone-end
38+
39+
40+
## Audio streaming schema
41+
After sending the metadata packet, Azure Communication Services starts streaming audio media to your WebSocket server. The following example shows the media object that your server receives.
42+
43+
``` code
{
    "kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
    "audioData": {
        "data": <string>, // Base64-encoded audio buffer data
        "timestamp": <string>, // In ISO 8601 format (yyyy-mm-ddThh:mm:ssZ)
        "participantRawID": <string>,
        "silent": <boolean> // Indicates if the received audio buffer contains only silence.
    }
}
```
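
To illustrate the schema, the following minimal sketch parses one incoming text message into a typed object. It assumes only the fields shown above. The `AudioFrame` class and `parse_audio_message` function are hypothetical names, not part of any SDK.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioFrame:
    pcm: bytes                         # decoded audio buffer
    timestamp: str                     # ISO 8601 string, kept as received
    participant_raw_id: Optional[str]
    silent: bool


def parse_audio_message(message_text: str) -> Optional[AudioFrame]:
    """Return an AudioFrame for AudioData packets, or None for any other kind."""
    message = json.loads(message_text)
    if message.get("kind") != "AudioData":
        return None
    audio = message["audioData"]
    return AudioFrame(
        pcm=base64.b64decode(audio["data"]),
        timestamp=audio["timestamp"],
        participant_raw_id=audio.get("participantRawID"),
        silent=bool(audio.get("silent", False)),
    )
```

A caller might skip frames where `silent` is true before passing audio to a speech or analysis model, which avoids processing buffers that contain only silence.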
54+
55+
56+
## Clean up resources
57+
58+
If you want to clean up and remove a Communication Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it. Learn more about [cleaning up resources](../../quickstarts/create-communication-resource.md#clean-up-resources).
59+
60+
## Next steps
61+
- Learn more about [Audio Streaming](../../concepts/call-automation/audio-streaming-concept.md).
62+
- Learn more about [Call Automation](../../concepts/call-automation/call-automation.md) and its features.
63+
- Learn more about [Play action](../../concepts/call-automation/play-action.md).
64+
- Learn more about [Recognize action](../../concepts/call-automation/recognize-action.md).
Lines changed: 139 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,139 @@
1+
---
2+
title: Include file - C#
3+
description: C# Audio Streaming quickstart
4+
services: azure-communication-services
5+
author: Alvin
6+
ms.service: azure-communication-services
7+
ms.subservice: call-automation
8+
ms.date: 07/15/2024
9+
ms.topic: include
11+
ms.author: alvinhan
12+
---
13+
14+
## Prerequisites
15+
- An Azure account with an active subscription. For details, see [Create an account for free](https://azure.microsoft.com/free/).
16+
- An Azure Communication Services resource. See [Create an Azure Communication Services resource](../../../quickstarts/create-communication-resource.md?tabs=windows&pivots=platform-azp).
17+
- A new web service application created using the [Call Automation SDK](../../../quickstarts/call-automation/callflows-for-customer-interactions.md).
18+
- The latest [.NET library](https://dotnet.microsoft.com/download/dotnet-core) for your operating system.
19+
- A WebSocket server that can receive media streams.
20+
21+
## Set up a WebSocket server
22+
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real-time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection.
23+
Optionally, you can use Azure Web Apps to create an application that receives audio streams over a WebSocket connection. Follow this [quickstart](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
24+
25+
## Establish a call
26+
Establish a call and provide the streaming details:
27+
28+
``` C#
// callInvite, callbackUri, cognitiveServiceEndpoint, and callAutomationClient are
// assumed to be defined elsewhere in your Call Automation application.
MediaStreamingOptions mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("<WEBSOCKET URL>"),         // endpoint of your WebSocket server
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed,   // all participants flattened into one stream
    MediaStreamingTransport.Websocket,
    false);                             // assumed meaning: don't start streaming with the call

var createCallOptions = new CreateCallOptions(callInvite, callbackUri)
{
    CallIntelligenceOptions = new CallIntelligenceOptions() { CognitiveServicesEndpoint = new Uri(cognitiveServiceEndpoint) },
    MediaStreamingOptions = mediaStreamingOptions,
};

CreateCallResult createCallResult = await callAutomationClient.CreateCallAsync(createCallOptions);
```
44+
45+
## Start audio streaming
46+
How to start audio streaming:
47+
``` C#
// callMedia and callbackUriHost are assumed to come from your existing call setup.
StartMediaStreamingOptions options = new StartMediaStreamingOptions()
{
    OperationCallbackUri = new Uri(callbackUriHost),
    OperationContext = "startMediaStreamingContext"
};
await callMedia.StartMediaStreamingAsync(options);
```
55+
When Azure Communication Services receives the URL for your WebSocket server, it creates a connection to it. Once the connection succeeds and streaming starts, the first packet it sends contains metadata about the incoming media packets.
56+
57+
The metadata packet will look like this:
58+
``` code
{
    "kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
    "audioMetadata": {
        "subscriptionId": <string>, // unique identifier for a subscription request
        "encoding": <string>, // PCM only supported
        "sampleRate": <int>, // 16000 default
        "channels": <int>, // 1 default
        "length": <int> // 640 default
    }
}
```
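
The default values are consistent with 20 ms frames: 640 bytes ÷ (16,000 samples per second × 1 channel × 2 bytes per sample) = 0.020 s. The following language-neutral check is sketched in Python. It assumes that `length` is the PCM payload size in bytes and that samples are 16-bit; the function name is hypothetical.

```python
import json

BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM samples


def frame_duration_ms(metadata_text: str) -> float:
    """Compute how many milliseconds of audio each packet carries.

    Assumes `length` in the metadata packet is the payload size in bytes.
    """
    message = json.loads(metadata_text)
    if message.get("kind") != "AudioMetadata":
        raise ValueError("expected an AudioMetadata packet")
    meta = message["audioMetadata"]
    bytes_per_second = meta["sampleRate"] * meta["channels"] * BYTES_PER_SAMPLE
    return meta["length"] * 1000 / bytes_per_second
```

Running this once on the first packet lets a server size its buffers before audio data arrives, rather than hard-coding the defaults.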
70+
71+
72+
## Stop audio streaming
73+
How to stop audio streaming:
74+
``` C#
StopMediaStreamingOptions stopOptions = new StopMediaStreamingOptions()
{
    OperationCallbackUri = new Uri(callbackUriHost)
};
await callMedia.StopMediaStreamingAsync(stopOptions);
```
81+
82+
## Handling audio streams in your WebSocket server
83+
The following sample demonstrates how to listen to audio streams using your WebSocket server.
84+
85+
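``` C#
using System;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using Newtonsoft.Json;

// AudioBaseClass is not defined in this quickstart. It is assumed to be your own
// type that mirrors the audio streaming schema (kind, audioMetadata, audioData).
HttpListener httpListener = new HttpListener();
httpListener.Prefixes.Add("http://localhost:80/");
httpListener.Start();

while (true)
{
    HttpListenerContext httpListenerContext = await httpListener.GetContextAsync();
    if (!httpListenerContext.Request.IsWebSocketRequest)
    {
        // Reject plain HTTP requests instead of leaving them pending.
        httpListenerContext.Response.StatusCode = 400;
        httpListenerContext.Response.Close();
        continue;
    }

    WebSocketContext websocketContext;
    try
    {
        websocketContext = await httpListenerContext.AcceptWebSocketAsync(subProtocol: null);
    }
    catch (Exception ex)
    {
        Console.WriteLine($"WebSocket handshake failed: {ex.Message}");
        httpListenerContext.Response.StatusCode = 500;
        httpListenerContext.Response.Close();
        continue; // keep listening for the next connection
    }

    WebSocket webSocket = websocketContext.WebSocket;
    byte[] receiveBuffer = new byte[2048];
    try
    {
        while (webSocket.State == WebSocketState.Open || webSocket.State == WebSocketState.CloseSent)
        {
            using var timeout = new CancellationTokenSource(TimeSpan.FromSeconds(60));
            using var message = new MemoryStream();
            WebSocketReceiveResult receiveResult;
            do
            {
                // A message can span several frames, so read until EndOfMessage.
                receiveResult = await webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), timeout.Token);
                message.Write(receiveBuffer, 0, receiveResult.Count);
            } while (!receiveResult.EndOfMessage);

            if (receiveResult.MessageType == WebSocketMessageType.Close)
            {
                if (webSocket.State == WebSocketState.CloseReceived)
                {
                    await webSocket.CloseAsync(WebSocketCloseStatus.NormalClosure, "Closing", CancellationToken.None);
                }
                break;
            }

            // Decode only the bytes that were received.
            var data = Encoding.UTF8.GetString(message.ToArray());
            try
            {
                var eventData = JsonConvert.DeserializeObject<AudioBaseClass>(data);
                if (eventData != null)
                {
                    if (eventData.kind == "AudioMetadata")
                    {
                        // Process audio metadata
                    }
                    else if (eventData.kind == "AudioData")
                    {
                        // Process audio data. Assumes audioData.data is declared as a base64 string.
                        byte[] byteArray = Convert.FromBase64String(eventData.audioData.data);
                        // Use the audio byteArray as you want
                    }
                }
            }
            catch (JsonException ex)
            {
                Console.WriteLine($"Could not parse message: {ex.Message}");
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"WebSocket connection ended with an error: {ex.Message}");
    }
}
```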
