Commit 6a3418b

Merge pull request #216287 from valindrae/media-streaming
Media streaming
2 parents 9ec3e4f + 8226f04 commit 6a3418b

5 files changed

+393
-0
lines changed

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
1+
---
2+
title: Media streaming overview
3+
titleSuffix: An Azure Communication Services concept document
4+
description: Conceptual information about using Media Streaming APIs with Call Automation.
5+
author: Kunaal
6+
ms.service: azure-communication-services
7+
ms.topic: include
8+
ms.date: 10/25/2022
9+
ms.author: kpunjabi
10+
ms.custom: private_preview
11+
---
12+
13+
# Media streaming overview - audio subscription
14+
15+
> [!IMPORTANT]
16+
> Functionality described in this document is currently in private preview. Private preview includes access to SDKs and documentation for testing purposes that are not yet available publicly.
17+
> Apply to become an early adopter by filling out the form for [preview access to Azure Communication Services](https://aka.ms/ACS-EarlyAdopter).
18+
19+
Azure Communication Services provides developers with Media Streaming capabilities that give real-time access to media streams, so that audio content can be captured, analyzed, and processed during active calls. Live audio and video are now widely consumed in the form of online meetings, conferences, schooling, customer support, and more, and this consumption has only increased since Covid-19, with much of the world's workforce working remotely from home. With media streaming access, developers can build server applications that capture and analyze the audio stream of each participant on a call in real time. Developers can also combine media streaming with other call automation actions, or use their own AI models to analyze audio streams for use cases such as NLP-based conversation analysis or real-time insights and suggestions for agents while they are interacting with end users.
20+
21+
This private preview lets developers access real-time audio streams over a WebSocket to analyze each participant's audio in mixed and unmixed formats.
22+
23+
## Common use cases
24+
Audio streams can be used in many ways. Below are some examples of how developers may wish to use the audio streams in their applications.
25+
26+
### Real-time call assistance
27+
28+
**Improved AI powered suggestions** - Use real-time audio streams of active interactions between agents and customers to gauge the intent of the call, and analyze the call with your own AI model to give agents active suggestions for providing a better customer experience.
29+
30+
### Authentication
31+
**Biometric authentication** – Use the audio streams to carry out authentication using caller biometrics such as voice recognition.
32+
33+
### Interpretations
34+
**Real-time translation** – Send audio streams to human or AI translators who can consume the audio content and provide translations.
35+
36+
## Sample architecture for subscribing to audio streams from an ongoing call
37+
38+
[![Diagram of the flow for subscribing to audio streams.](./media/media-streaming-flow.png)](./media/media-streaming-flow.png#lightbox)
39+
40+
## Supported formats
41+
42+
### Mixed format
43+
Contains mixed audio of all participants on the call.
44+
45+
### Unmixed
46+
Contains audio per participant per channel, with support for up to four channels for four dominant speakers. You will also get a participantRawID that you can use to determine the speaker.
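As a rough sketch of how an application might keep unmixed audio separated by speaker, the Java class below appends each decoded frame to a per-participant buffer. The class name `ParticipantAudioRouter`, its methods, and the `"mixed"` key are hypothetical and not part of any SDK. The sketch only assumes that each unmixed frame arrives with a participantRawID and that frames without one can be treated as mixed audio.

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

public class ParticipantAudioRouter {
    /** Hypothetical key for frames that carry no participant ID (treated here as mixed audio). */
    public static final String MIXED = "mixed";

    private final Map<String, ByteArrayOutputStream> buffers = new LinkedHashMap<>();

    /** Appends one frame of PCM bytes to the buffer that belongs to the given participant. */
    public void accept(String participantRawId, byte[] pcm) {
        String key = (participantRawId == null || participantRawId.isEmpty()) ? MIXED : participantRawId;
        buffers.computeIfAbsent(key, k -> new ByteArrayOutputStream()).write(pcm, 0, pcm.length);
    }

    /** Returns the number of bytes collected so far for a participant (or for MIXED). */
    public int bytesFor(String key) {
        ByteArrayOutputStream out = buffers.get(key);
        return out == null ? 0 : out.size();
    }

    /** Returns how many distinct participants have sent unmixed audio so far. */
    public int speakerCount() {
        return (int) buffers.keySet().stream().filter(k -> !MIXED.equals(k)).count();
    }

    public static void main(String[] args) {
        ParticipantAudioRouter router = new ParticipantAudioRouter();
        router.accept("participant-1", new byte[640]);
        router.accept("participant-2", new byte[640]);
        router.accept("participant-1", new byte[640]);
        System.out.println(router.speakerCount() + " speakers, "
                + router.bytesFor("participant-1") + " bytes from participant-1");
    }
}
```

Keeping one buffer per ID means each speaker's audio can be handed to a transcription or analysis step independently.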
47+
48+
## Additional information
49+
The list below describes information that helps developers convert the media packets into audible content that their applications can use.
50+
- Framerate: 50 frames per second
51+
- Packet stream rate: one packet every 20 ms
52+
- Data packet: 64 Kbytes
53+
- Audio metric: 16-bit PCM mono at 16,000 Hz
54+
- Public string data is a base64 string that should be converted into a byte array to create a raw PCM file. You can then use the following configuration in Audacity to play the file.
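The conversion described in the list above can be sketched in Java as follows. The class `PcmFrames` and its method names are hypothetical. The sketch assumes little-endian samples (the usual convention for WAV files) and computes only the raw sample payload implied by the stated audio format: 16,000 samples per second × 0.02 s × 2 bytes = 640 bytes per 20 ms frame. That figure is plain arithmetic and is not a statement about the packet size the service sends.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class PcmFrames {
    /** Raw PCM bytes contained in one frame of the given duration. */
    public static int bytesPerFrame(int sampleRateHz, int bitsPerSample, int channels, int frameMillis) {
        return sampleRateHz * frameMillis / 1000 * (bitsPerSample / 8) * channels;
    }

    /** Converts the base64 audio string into a byte array of raw PCM samples. */
    public static byte[] decode(String base64Audio) {
        return Base64.getDecoder().decode(base64Audio);
    }

    /** Prepends a standard 44-byte WAV header so the PCM data can be opened as a regular audio file. */
    public static byte[] toWav(byte[] pcm, int sampleRateHz, int bitsPerSample, int channels) {
        int blockAlign = channels * bitsPerSample / 8;
        ByteBuffer wav = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
        wav.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        wav.putInt(36 + pcm.length);               // size of everything after this field
        wav.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        wav.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        wav.putInt(16);                            // size of the fmt chunk for PCM
        wav.putShort((short) 1);                   // audio format 1 = uncompressed PCM
        wav.putShort((short) channels);
        wav.putInt(sampleRateHz);
        wav.putInt(sampleRateHz * blockAlign);     // byte rate
        wav.putShort((short) blockAlign);
        wav.putShort((short) bitsPerSample);
        wav.put("data".getBytes(StandardCharsets.US_ASCII));
        wav.putInt(pcm.length);
        wav.put(pcm);
        return wav.array();
    }

    public static void main(String[] args) {
        // One frame of silence stands in for a real base64 payload.
        byte[] silence = new byte[bytesPerFrame(16000, 16, 1, 20)];
        String encoded = Base64.getEncoder().encodeToString(silence);
        byte[] wav = toWav(decode(encoded), 16000, 16, 1);
        System.out.println(silence.length + " PCM bytes, " + wav.length + " WAV bytes");
    }
}
```

A file written with this header can typically be opened in audio tools without entering the raw import settings by hand.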
55+
56+
## Next Steps
57+
Check out the [Media Streaming quickstart](../../quickstarts/voice-video-calling/media-streaming.md) to learn more.
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
1+
---
2+
title: include file
3+
description: C# Media Streaming quickstart
4+
services: azure-communication-services
5+
author: Kunaal
6+
ms.service: azure-communication-services
7+
ms.subservice: azure-communication-services
8+
ms.date: 10/25/2022
9+
ms.topic: include
10+
ms.topic: include file
11+
ms.author: kpunjabi
12+
---
13+
14+
## Prerequisites
15+
- An Azure account with an active subscription. For details, see [Create an account for free](https://azure.microsoft.com/free/).
16+
- An Azure Communication Services resource. See [Create an Azure Communication Services resource](../../../create-communication-resource.md?tabs=windows&pivots=platform-azp).
17+
- Create a new web service application using the [Call Automation SDK](../../Callflows-for-customer-interactions.md).
18+
- The latest [.NET library](https://dotnet.microsoft.com/download/dotnet-core) for your operating system.
19+
- [Apache Maven](https://maven.apache.org/download.cgi).
20+
- A websocket server that can receive media streams.
21+
22+
## Set up a websocket server
23+
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection.
24+
Optionally, you can use Azure Web Apps to host an application that receives audio streams over a WebSocket connection. Follow this [quickstart](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
25+
26+
## Establish a call
27+
In this quickstart, we assume that you're already familiar with starting calls. If you need to learn more about starting and establishing calls, you can follow our [quickstart](../../callflows-for-customer-interactions.md). This quickstart goes through the process of starting media streaming for both incoming and outbound calls.
28+
29+
## Start media streaming - incoming call
30+
Your application will start receiving media streams once you answer the call and provide ACS with the WebSocket information.
31+
32+
``` csharp
var mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("wss://testwebsocket.webpubsub.azure.com/client/hubs/media?accesstoken={access_token}"),
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed);

// Attach the streaming options when answering the incoming call.
var answerCallOptions = new AnswerCallOptions(incomingCallContext, callbackUri: new Uri(callConfiguration.AppCallbackUrl))
{
    MediaStreamingOptions = mediaStreamingOptions
};
var response = await callingServerClient.AnswerCallAsync(answerCallOptions);
```
44+
45+
## Start media streaming - outbound call
46+
Your application will start receiving media streams once you create the call and provide ACS with the WebSocket information.
47+
48+
``` csharp
var mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("wss://{yourwebsocketurl}"),
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed);

// Attach the streaming options when creating the outbound call.
var createCallOptions = new CreateCallOptions(
    callSource,
    new List<PhoneNumberIdentifier> { target },
    new Uri(callConfiguration.AppCallbackUrl))
{
    MediaStreamingOptions = mediaStreamingOptions
};
var createCallResult = await client.CreateCallAsync(createCallOptions);
```
62+
## Handling media streams in your websocket server
63+
The sample below demonstrates how to listen to a media stream using your websocket server.
64+
65+
``` csharp
using System;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using Newtonsoft.Json;

// Audio and WebSocketData are types defined by your application; they are not shown here.
HttpListener httpListener = new HttpListener();
httpListener.Prefixes.Add("http://localhost:80/");
httpListener.Start();
while (true)
{
    HttpListenerContext httpListenerContext = await httpListener.GetContextAsync();
    if (!httpListenerContext.Request.IsWebSocketRequest)
    {
        // Only WebSocket upgrade requests are accepted.
        httpListenerContext.Response.StatusCode = 400;
        httpListenerContext.Response.Close();
        continue;
    }

    WebSocketContext websocketContext;
    try
    {
        websocketContext = await httpListenerContext.AcceptWebSocketAsync(subProtocol: null);
        string ipAddress = httpListenerContext.Request.RemoteEndPoint.Address.ToString();
    }
    catch (Exception)
    {
        // The handshake failed; report the error and wait for the next connection.
        httpListenerContext.Response.StatusCode = 500;
        httpListenerContext.Response.Close();
        continue;
    }

    WebSocket webSocket = websocketContext.WebSocket;
    try
    {
        while (webSocket.State == WebSocketState.Open || webSocket.State == WebSocketState.CloseSent)
        {
            byte[] receiveBuffer = new byte[2048];
            var cancellationToken = new CancellationTokenSource(TimeSpan.FromSeconds(60)).Token;
            WebSocketReceiveResult receiveResult = await webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), cancellationToken);
            if (receiveResult.MessageType == WebSocketMessageType.Close)
            {
                continue;
            }

            // Decode only the bytes that were actually received.
            var data = Encoding.UTF8.GetString(receiveBuffer, 0, receiveResult.Count);
            try
            {
                var json = JsonConvert.DeserializeObject<Audio>(data);
                if (json == null)
                {
                    continue;
                }

                var byteArray = json.AudioData;
                if (string.IsNullOrEmpty(json.ParticipantId))
                {
                    // Processing mixed audio data
                    if (string.IsNullOrEmpty(WebSocketData.FirstReceivedMixedAudioBufferTimeStamp))
                    {
                        WebSocketData.FirstReceivedMixedAudioBufferTimeStamp = json.Timestamp;
                    }
                    // Process byteArray (audioData) however you want
                }
                else if (!json.IsSilence)
                {
                    // Processing unmixed audio data
                    switch (json.ParticipantId)
                    {
                        case "{participantRawId1}":
                            // Process audio data
                            break;
                        case "{participantRawId2}":
                            // Process audio data
                            break;
                        default:
                            break;
                    }
                    if (string.IsNullOrEmpty(WebSocketData.FirstReceivedUnmixedAudioBufferTimeStamp))
                    {
                        WebSocketData.FirstReceivedUnmixedAudioBufferTimeStamp = json.Timestamp;
                    }
                }
            }
            catch (Exception)
            {
                // Ignore messages that fail to parse or process.
            }
        }
    }
    catch (Exception)
    {
        // The connection failed or timed out; go back to waiting for the next one.
    }
}
```
Lines changed: 104 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,104 @@
1+
---
2+
title: include file
3+
description: Java Media Streaming quickstart
4+
services: azure-communication-services
5+
author: Kunaal
6+
ms.service: azure-communication-services
7+
ms.subservice: azure-communication-services
8+
ms.date: 09/06/2022
9+
ms.topic: include
10+
ms.topic: include file
11+
ms.author: kpunjabi
12+
---
13+
14+
## Prerequisites
15+
16+
- An Azure account with an active subscription. For details, see [Create an account for free](https://azure.microsoft.com/free/).
17+
- An Azure Communication Services resource. See [Create an Azure Communication Services resource](../../../create-communication-resource.md?tabs=windows&pivots=platform-azp).
18+
- Create a new web service application using the [Call Automation SDK](../../Callflows-for-customer-interactions.md).
19+
- [Java Development Kit](/java/azure/jdk/?preserve-view=true&view=azure-java-stable) version 8 or above.
20+
- [Apache Maven](https://maven.apache.org/download.cgi).
21+
22+
## Set up a websocket server
23+
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection.
24+
Optionally, you can use Azure Web Apps to host an application that receives audio streams over a WebSocket connection. Follow this [quickstart](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
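If you write the server yourself rather than relying on a framework, the connection begins with the opening handshake defined by the WebSocket standard (RFC 6455): the client sends a `Sec-WebSocket-Key` header, and the server answers with the base64-encoded SHA-1 hash of that key concatenated with a fixed GUID. The following is a minimal sketch of that step only. The class name `WebSocketHandshake` and its methods are hypothetical, and a production server would normally leave this to a WebSocket library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class WebSocketHandshake {
    /** Fixed GUID that RFC 6455 requires the server to append to the client's key. */
    private static final String GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    /** Computes the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key. */
    public static String acceptKey(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest((secWebSocketKey + GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    /** Builds the HTTP response that switches the connection to the WebSocket protocol. */
    public static String upgradeResponse(String secWebSocketKey) {
        return "HTTP/1.1 101 Switching Protocols\r\n"
                + "Upgrade: websocket\r\n"
                + "Connection: Upgrade\r\n"
                + "Sec-WebSocket-Accept: " + acceptKey(secWebSocketKey) + "\r\n\r\n";
    }

    public static void main(String[] args) {
        // Sample key taken from RFC 6455, section 1.3.
        System.out.println(upgradeResponse("dGhlIHNhbXBsZSBub25jZQ=="));
    }
}
```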
25+
26+
## Establish a call
27+
In this quickstart we assume that you're already familiar with starting calls. If you need to learn more about starting and establishing calls, you can follow our [quickstart](../../callflows-for-customer-interactions.md). For the purposes of this quickstart, we'll be going through the process of starting media streaming for both incoming calls and outbound calls.
28+
29+
## Start media streaming - incoming call
30+
Your application will start receiving media streams once you answer the call and provide ACS with the WebSocket information.
31+
32+
``` java
var mediaStreamingOptions = new MediaStreamingOptions(
    "wss://{yourwebsocketurl}",
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed
);
// Attach the streaming options when answering the incoming call.
var answerCallOptions = new AnswerCallOptions("<incomingCallContext>", callConfiguration.AppCallbackUrl)
    .setMediaStreamingConfiguration(mediaStreamingOptions);

var answerCallResponse = callAutomationAsyncClient.answerCallWithResponse(answerCallOptions).block();
```
43+
44+
## Start media streaming - outbound call
45+
Your application will start receiving media streams once you create the call and provide ACS with the WebSocket information.
46+
47+
``` java
var mediaStreamingOptions = new MediaStreamingOptions(
    "wss://{yourwebsocketurl}",
    MediaStreamingTransportType.WebSocket,
    MediaStreamingContentType.Audio,
    MediaStreamingAudioChannelType.Mixed
);
var createCallOptions = new CreateCallOptions(
    callSource,
    Collections.singletonList(target),
    callConfiguration.AppCallbackUrl
);
// Attach the streaming options when creating the outbound call.
createCallOptions.setMediaStreamingConfiguration(mediaStreamingOptions);
var createCallResponse = callAutomationAsyncClient.createCallWithResponse(
    createCallOptions
).block();
```
64+
65+
## Handling media streams in your websocket server
66+
The sample below demonstrates how to listen to a media stream using your websocket server.
67+
68+
``` java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class WebsocketServer {
    public static void main(String[] args) throws IOException {
        // Note: ServerSocket is a plain TCP listener. This sample reads text lines and
        // does not perform the WebSocket handshake or frame decoding itself.
        try (ServerSocket serverSocket = new ServerSocket(1234)) {
            while (true) {
                // try-with-resources closes the socket and both streams for each connection.
                try (Socket socket = serverSocket.accept();
                     BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                     BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()))) {
                    String msgFromClient;
                    // readLine() returns null once the client closes the connection.
                    while ((msgFromClient = bufferedReader.readLine()) != null) {
                        //You can process the message however you want
                        System.out.println("Client:" + msgFromClient);
                        bufferedWriter.write("MSG Received");
                        bufferedWriter.newLine();
                        bufferedWriter.flush();
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }
}
```
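The sample above reads lines from a raw TCP socket, whereas a WebSocket client delivers its data in frames as defined by RFC 6455. As a hedged illustration of what a server has to do with each frame, the sketch below extracts and unmasks the payload of a single, complete frame. The class `WebSocketFrame` and its methods are hypothetical. The sketch assumes the whole frame is already in memory and ignores fragmentation and control frames, which a WebSocket library would handle for you.

```java
import java.nio.charset.StandardCharsets;

public class WebSocketFrame {
    /** Returns the opcode of the frame (1 = text, 2 = binary, 8 = close). */
    public static int opcode(byte[] frame) {
        return frame[0] & 0x0F;
    }

    /** Extracts the payload of one complete frame, removing the client mask if present. */
    public static byte[] payload(byte[] frame) {
        boolean masked = (frame[1] & 0x80) != 0;
        long length = frame[1] & 0x7F;
        int offset = 2;
        if (length == 126) {
            // The next 2 bytes hold the real length.
            length = ((frame[2] & 0xFF) << 8) | (frame[3] & 0xFF);
            offset = 4;
        } else if (length == 127) {
            // The next 8 bytes hold the real length.
            length = 0;
            for (int i = 0; i < 8; i++) {
                length = (length << 8) | (frame[2 + i] & 0xFF);
            }
            offset = 10;
        }
        byte[] mask = new byte[4]; // all zeros leaves unmasked data unchanged
        if (masked) {
            System.arraycopy(frame, offset, mask, 0, 4);
            offset += 4;
        }
        byte[] payload = new byte[(int) length]; // assumes the payload fits in an int-sized array
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) (frame[offset + i] ^ mask[i % 4]);
        }
        return payload;
    }

    public static void main(String[] args) {
        // Masked text frame containing "Hello", taken from RFC 6455, section 5.7.
        byte[] frame = {(byte) 0x81, (byte) 0x85, 0x37, (byte) 0xfa, 0x21, 0x3d,
                        0x7f, (byte) 0x9f, 0x4d, 0x51, 0x58};
        System.out.println(new String(payload(frame), StandardCharsets.UTF_8));
    }
}
```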
