Commit 4a6969a

Update audio-streaming-quickstart-csharp.md
Update outbound audio streaming snippet and add incoming streaming code snippets
1 parent 51efd99 commit 4a6969a

Lines changed: 179 additions & 100 deletions
@@ -1,11 +1,11 @@
11
---
22
title: Include file - C#
3-
description: C# Audio Streaming quickstart
3+
description: C# Bidirectional audio streaming how-to
44
services: azure-communication-services
55
author: Alvin
66
ms.service: azure-communication-services
77
ms.subservice: call-automation
8-
ms.date: 07/15/2024
8+
ms.date: 11/24/2024
99
ms.topic: include
1010
ms.topic: Include file
1111
ms.author: alvinhan
@@ -22,118 +22,197 @@ ms.author: alvinhan
2222
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real-time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection.
2323
Optionally, you can use Azure WebApps to create an application that receives audio streams over a WebSocket connection. Follow this [quickstart](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
2424

25-
## Establish a call
26-
Establish a call and provide streaming details
25+
## Receiving and Sending audio streaming data
26+
There are multiple ways to start receiving audio streams, configured with the `startMediaStreaming` flag in the `mediaStreamingOptions` setup. You can also specify the sample rate used for receiving or sending audio data with the `audioFormat` parameter. The currently supported formats are PCM 24K mono and PCM 16K mono; the default is PCM 16K mono.
27+
28+
To enable bidirectional audio streaming, where you also send audio data into the call, set the `EnableBidirectional` flag to `true`.
29+
30+
### Start streaming audio to your webserver at time of answering the call
31+
Enable automatic audio streaming when the call is established by setting the flag `startMediaStreaming: true`.
32+
33+
This ensures that audio streaming starts automatically as soon as the call is connected.
2734

2835
``` C#
29-
MediaStreamingOptions mediaStreamingOptions = new MediaStreamingOptions(
30-
new Uri("<WEBSOCKET URL>"),
31-
MediaStreamingContent.Audio,
32-
MediaStreamingAudioChannel.Mixed,
33-
MediaStreamingTransport.Websocket,
34-
false);
35-
36-
var createCallOptions = new CreateCallOptions(callInvite, callbackUri)
37-
{
38-
CallIntelligenceOptions = new CallIntelligenceOptions() { CognitiveServicesEndpoint = new Uri(cognitiveServiceEndpoint) },
39-
MediaStreamingOptions = mediaStreamingOptions,
40-
};
41-
42-
CreateCallResult createCallResult = await callAutomationClient.CreateCallAsync(createCallOptions);
36+
var mediaStreamingOptions = new MediaStreamingOptions(
37+
new Uri("wss://YOUR_WEBSOCKET_URL"),
38+
MediaStreamingContent.Audio,
39+
MediaStreamingAudioChannel.Mixed,
40+
startMediaStreaming: true) {
41+
EnableBidirectional = true,
42+
AudioFormat = AudioFormat.Pcm24KMono
43+
};
44+
var options = new AnswerCallOptions(incomingCallContext, callbackUri) {
45+
MediaStreamingOptions = mediaStreamingOptions,
46+
};
47+
48+
AnswerCallResult answerCallResult = await client.AnswerCallAsync(options);
4349
```
4450

45-
## Start audio streaming
46-
How to start audio streaming:
51+
When Azure Communication Services receives the URL for your WebSocket server, it establishes a connection to it. Once the connection is successfully made, streaming is initiated.
52+
53+
54+
### Start streaming audio to your webserver while a call is in progress
55+
To start media streaming while the call is in progress, set the `startMediaStreaming` parameter to `false` (the default) when answering the call, then call the start media streaming API later in the call.
56+
4757
``` C#
48-
StartMediaStreamingOptions options = new StartMediaStreamingOptions()
49-
{
50-
OperationCallbackUri = new Uri(callbackUriHost),
51-
OperationContext = "startMediaStreamingContext"
52-
};
53-
await callMedia.StartMediaStreamingAsync(options);
54-
```
55-
When Azure Communication Services receives the URL for your WebSocket server, it creates a connection to it. Once Azure Communication Services successfully connects to your WebSocket server and streaming is started, it will send through the first data packet, which contains metadata about the incoming media packets.
56-
57-
The metadata packet will look like this:
58-
``` code
59-
{
60-
"kind": <string> // What kind of data this is, e.g. AudioMetadata, AudioData.
61-
"audioMetadata": {
62-
"subscriptionId": <string>, // unique identifier for a subscription request
63-
"encoding":<string>, // PCM only supported
64-
"sampleRate": <int>, // 16000 default
65-
"channels": <int>, // 1 default
66-
"length": <int> // 640 default
67-
}
68-
}
58+
var mediaStreamingOptions = new MediaStreamingOptions(
59+
new Uri("wss://YOUR_WEBSOCKET_URL"),
60+
MediaStreamingContent.Audio,
61+
MediaStreamingAudioChannel.Mixed,
62+
startMediaStreaming: false) {
63+
EnableBidirectional = true,
64+
AudioFormat = AudioFormat.Pcm24KMono
65+
};
66+
var options = new AnswerCallOptions(incomingCallContext, callbackUri) {
67+
MediaStreamingOptions = mediaStreamingOptions,
68+
};
69+
70+
AnswerCallResult answerCallResult = await client.AnswerCallAsync(options);
71+
72+
// Start media streaming via API call
73+
StartMediaStreamingOptions startOptions = new StartMediaStreamingOptions() {
74+
OperationContext = "startMediaStreamingContext"
75+
};
76+
77+
await callMedia.StartMediaStreamingAsync(startOptions);
6978
```
7079

7180

7281
## Stop audio streaming
73-
How to stop audio streaming
82+
To stop receiving audio streams during a call, you can use the **Stop streaming API**. This allows you to stop the audio streaming at any point in the call. There are two ways that audio streaming can be stopped:
83+
1. **Triggering the Stop streaming API:** Use the API to stop receiving audio streaming data while the call is still active.
84+
2. **Automatic stop on call disconnect:** Audio streaming will automatically stop when the call is disconnected.
85+
7486
``` C#
75-
StopMediaStreamingOptions stopOptions = new StopMediaStreamingOptions()
76-
{
77-
OperationCallbackUri = new Uri(callbackUriHost)
78-
};
79-
await callMedia.StopMediaStreamingAsync(stopOptions);
87+
StopMediaStreamingOptions options = new StopMediaStreamingOptions() {
88+
OperationContext = "stopMediaStreamingContext"
89+
};
90+
91+
await callMedia.StopMediaStreamingAsync(options);
8092
```
8193

8294
## Handling audio streams in your websocket server
8395
The sample below demonstrates how to listen to audio streams using your websocket server.
8496

8597
``` C#
86-
HttpListener httpListener = new HttpListener();
87-
httpListener.Prefixes.Add("http://localhost:80/");
88-
httpListener.Start();
89-
90-
while (true)
91-
{
92-
HttpListenerContext httpListenerContext = await httpListener.GetContextAsync();
93-
if (httpListenerContext.Request.IsWebSocketRequest)
94-
{
95-
WebSocketContext websocketContext;
96-
try
97-
{
98-
websocketContext = await httpListenerContext.AcceptWebSocketAsync(subProtocol: null);
99-
}
100-
catch (Exception ex)
101-
{
102-
return;
103-
}
104-
WebSocket webSocket = websocketContext.WebSocket;
105-
try
106-
{
107-
while (webSocket.State == WebSocketState.Open || webSocket.State == WebSocketState.CloseSent)
108-
{
109-
byte[] receiveBuffer = new byte[2048];
110-
var cancellationToken = new CancellationTokenSource(TimeSpan.FromSeconds(60)).Token;
111-
WebSocketReceiveResult receiveResult = await webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), cancellationToken);
112-
if (receiveResult.MessageType != WebSocketMessageType.Close)
113-
{
114-
var data = Encoding.UTF8.GetString(receiveBuffer).TrimEnd('\0');
115-
try
116-
{
117-
var eventData = JsonConvert.DeserializeObject<AudioBaseClass>(data);
118-
if (eventData != null)
119-
{
120-
if(eventData.kind == "AudioMetadata")
121-
{
122-
//Process audio metadata
123-
}
124-
else if(eventData.kind == "AudioData")
125-
{
126-
//Process audio data
127-
var byteArray = eventData.audioData.data;
128-
//use audio byteArray as you want
129-
}
130-
}
131-
}
132-
catch { }
133-
}
134-
}
135-
}
136-
catch (Exception ex) { }
137-
}
138-
}
98+
public async Task StartReceivingFromAcsMediaWebSocket() {
99+
if (m_webSocket == null) {
100+
return;
101+
}
102+
try {
103+
while (m_webSocket.State == WebSocketState.Open || m_webSocket.State == WebSocketState.CloseSent) {
104+
byte[] receiveBuffer = new byte[2048];
105+
WebSocketReceiveResult receiveResult = await m_webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), m_cts.Token);
106+
107+
if (receiveResult.MessageType != WebSocketMessageType.Close) {
108+
// Decode only the bytes actually received in this frame
string data = Encoding.UTF8.GetString(receiveBuffer, 0, receiveResult.Count);
109+
var input = StreamingData.Parse(data);
110+
if (input is AudioData audioData) {
111+
using(var ms = new MemoryStream(audioData.Data)) {
112+
// Forward audio data to external AI
113+
await m_aiServiceHandler.SendAudioToExternalAI(ms);
114+
}
115+
}
116+
}
117+
}
118+
} catch (Exception ex) {
119+
Console.WriteLine($"Exception -> {ex}");
120+
}
121+
}
122+
```
123+
124+
The first packet you receive will contain metadata about the streaming, including audio settings such as encoding, sample rate, and other configuration details.
125+
126+
``` json
127+
{
128+
"kind": "AudioMetadata",
129+
"audioMetadata": {
130+
"subscriptionId": "89e8cb59-b991-48b0-b154-1db84f16a077",
131+
"encoding": "PCM",
132+
"sampleRate": 16000,
133+
"channels": 1,
134+
"length": 640
135+
}
136+
}
137+
```
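
As a rough illustration of how the metadata fields relate to each other, the sketch below derives the duration of one audio frame from a metadata packet. It assumes that `length` is a byte count and that samples are 16-bit PCM; neither is stated in the packet, and the class and method names are hypothetical. Under those assumptions the packet above works out to 20 ms per frame.

```csharp
using System;
using System.Text.Json;

public static class AudioMetadataExample
{
    // Returns the duration in milliseconds of one audio frame described by
    // an AudioMetadata packet.
    // Assumption: "length" is in bytes and each sample is 16-bit (2 bytes).
    public static double FrameDurationMs(string metadataJson)
    {
        using JsonDocument doc = JsonDocument.Parse(metadataJson);
        JsonElement meta = doc.RootElement.GetProperty("audioMetadata");

        int sampleRate = meta.GetProperty("sampleRate").GetInt32(); // samples per second
        int channels = meta.GetProperty("channels").GetInt32();
        int lengthBytes = meta.GetProperty("length").GetInt32();    // assumed bytes per frame

        const int bytesPerSample = 2; // assumption: 16-bit PCM
        double samplesPerChannel = (double)lengthBytes / (bytesPerSample * channels);
        return samplesPerChannel / sampleRate * 1000.0;
    }
}
```

A value like this can help size buffers or pace playback, but treat it as an estimate until the sample width is confirmed for your configuration.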
138+
139+
After sending the metadata packet, Azure Communication Services (ACS) will begin streaming audio media to your WebSocket server.
140+
141+
``` json
142+
{
143+
"kind": "AudioData",
144+
"audioData": {
145+
"timestamp": "2024-11-15T19:16:12.925Z",
146+
"participantRawID": "8:acs:3d20e1de-0f28-41c5…",
147+
"data": "5ADwAOMA6AD0A…",
148+
"silent": false
149+
}
150+
}
151+
```
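
If you prefer to handle the raw JSON yourself, a minimal sketch using `System.Text.Json` might look like the following. It assumes the `data` field is base64-encoded audio, which the sample value suggests but the packet does not state; `AudioDataExample` and `TryGetAudioBytes` are hypothetical names, not part of the SDK.

```csharp
using System;
using System.Text.Json;

public static class AudioDataExample
{
    // Tries to extract the audio bytes from an AudioData packet.
    // Returns false for any other packet kind (for example, AudioMetadata).
    // Assumption: "data" holds base64-encoded audio.
    public static bool TryGetAudioBytes(string packetJson, out byte[] audioBytes)
    {
        audioBytes = Array.Empty<byte>();

        using JsonDocument doc = JsonDocument.Parse(packetJson);
        JsonElement root = doc.RootElement;

        if (root.GetProperty("kind").GetString() != "AudioData")
        {
            return false;
        }

        string encoded = root.GetProperty("audioData").GetProperty("data").GetString() ?? "";
        audioBytes = Convert.FromBase64String(encoded);
        return true;
    }
}
```

The `silent` flag is left untouched here; whether to skip silent frames depends on what the downstream service expects.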
152+
153+
## Sending audio streaming data to Azure Communication Services
154+
If bidirectional streaming is enabled using the `EnableBidirectional` flag in the `MediaStreamingOptions`, you can stream audio data back to Azure Communication Services, which will play the audio into the call.
155+
156+
Once Azure Communication Services begins streaming audio to your WebSocket server, you can relay the audio to the LLM and vice versa. After the LLM processes the audio content, it streams the response back to your service, which you can then send into the Azure Communication Services call.
157+
158+
The example below demonstrates how to transmit the audio data back into the call after it has been processed by another service, for instance Azure OpenAI or another voice-based large language model.
159+
160+
``` C#
161+
private void ConvertToAcsAudioPacketAndForward(byte[] audioData) {
162+
var audio = OutStreamingData.GetStreamingDataForOutbound(audioData);
163+
164+
// Serialize the JSON object to a string
165+
string jsonString = System.Text.Json.JsonSerializer.Serialize<OutStreamingData>(audio);
166+
167+
// queue it to the buffer
168+
try {
169+
m_channel.Writer.TryWrite(async () => await m_mediaStreaming.SendMessageAsync(jsonString));
170+
} catch (Exception ex) {
171+
Console.WriteLine($"Exception received on ReceiveAudioForOutBound {ex}");
172+
}
173+
}
174+
175+
public async Task SendMessageAsync(string message) {
176+
if (m_webSocket?.State == WebSocketState.Open) {
177+
byte[] jsonBytes = Encoding.UTF8.GetBytes(message);
178+
179+
// Send the PCM audio chunk over WebSocket
180+
await m_webSocket.SendAsync(new ArraySegment<byte>(jsonBytes), WebSocketMessageType.Text, endOfMessage: true, CancellationToken.None);
181+
}
182+
}
139183
```
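
The snippet above writes send delegates to `m_channel` but does not show what drains it. One possible consumer, sketched here with `System.Threading.Channels`, runs the queued sends one at a time so that audio chunks reach the WebSocket in the order they were queued. `OrderedSendQueue` and its members are hypothetical names for illustration; the actual sample may organize this differently.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class OrderedSendQueue
{
    // Holds pending send operations; a single reader preserves ordering.
    private readonly Channel<Func<Task>> _channel =
        Channel.CreateUnbounded<Func<Task>>(new UnboundedChannelOptions { SingleReader = true });

    // Queues a send operation; returns false once the queue is completed.
    public bool Enqueue(Func<Task> send) => _channel.Writer.TryWrite(send);

    // Signals that no more operations will be queued.
    public void Complete() => _channel.Writer.Complete();

    // Executes queued operations sequentially until the queue is completed.
    public async Task RunAsync()
    {
        await foreach (Func<Task> send in _channel.Reader.ReadAllAsync())
        {
            try
            {
                await send();
            }
            catch (Exception ex)
            {
                // A failed send is logged and the loop continues with the next item.
                Console.WriteLine($"Send failed: {ex.Message}");
            }
        }
    }
}
```

Awaiting each delegate before reading the next one avoids overlapping `SendAsync` calls on the same WebSocket, which does not support concurrent sends.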
184+
185+
You can also control the playback of audio in the call when streaming back to Azure Communication Services, based on your logic or business flow. For example, when voice activity is detected and you want to stop the queued up audio, you can send a stop message via the WebSocket to stop the audio from playing in the call.
186+
187+
``` C#
188+
private void StopAudio() {
189+
try {
190+
var jsonObject = OutStreamingData.GetStopAudioForOutbound();
191+
192+
// Serialize the JSON object to a string
193+
string jsonString = System.Text.Json.JsonSerializer.Serialize<OutStreamingData>(jsonObject);
194+
195+
try {
196+
m_channel.Writer.TryWrite(async () => await m_mediaStreaming.SendMessageAsync(jsonString));
197+
} catch (Exception ex) {
198+
Console.WriteLine($"Exception received on ReceiveAudioForOutBound {ex}");
199+
}
200+
} catch (Exception ex) {
201+
Console.WriteLine($"Exception during streaming -> {ex}");
202+
}
203+
}
204+
205+
public async Task SendMessageAsync(string message) {
206+
if (m_webSocket?.State == WebSocketState.Open) {
207+
byte[] jsonBytes = Encoding.UTF8.GetBytes(message);
208+
209+
// Send the PCM audio chunk over WebSocket
210+
await m_webSocket.SendAsync(new ArraySegment<byte>(jsonBytes), WebSocketMessageType.Text, endOfMessage: true, CancellationToken.None);
211+
}
212+
}
213+
```
214+
215+
216+
217+
218+
