Commit 0ded1cb

Update audio-streaming-quickstart-js.md
Add code snippets for bidirectional audio streaming
1 parent 7549f59 commit 0ded1cb

1 file changed

+164
-94
lines changed

articles/communication-services/how-tos/call-automation/includes/audio-streaming-quickstart-js.md

Lines changed: 164 additions & 94 deletions
@@ -5,7 +5,7 @@ services: azure-communication-services
55
author: Alvin
66
ms.service: azure-communication-services
77
ms.subservice: call-automation
8-
ms.date: 07/15/2024
8+
ms.date: 11/26/2024
99
ms.topic: include
1010
ms.topic: Include file
1111
ms.author: alvinhan
@@ -16,116 +16,186 @@ ms.author: alvinhan
1616
- An Azure Communication Services resource. See [Create an Azure Communication Services resource](../../../quickstarts/create-communication-resource.md?tabs=windows&pivots=platform-azp).
1717
- A new web service application created using the [Call Automation SDK](../../../quickstarts/call-automation/callflows-for-customer-interactions.md).
1818
- [Node.js](https://nodejs.org/en/) LTS installation
19-
- A websocket server that can receive media streams.
19+
- A websocket server that can send and receive media streams.
2020

2121
## Set up a websocket server
2222
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real-time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection.
23-
You can optionally use Azure services Azure WebApps that allows you to create an application to receive audio streams over a websocket connection. Follow this [quickstart](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
2423

25-
## Establish a call
26-
Establish a call and provide streaming details
24+
To learn more about WebSockets and how to use them, see [Introduction to WebSockets on Windows Azure Web Sites](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
25+
26+
## Receiving and Sending audio streaming data
27+
You can start receiving the audio stream in more than one way, controlled by the `startMediaStreaming` flag in the `mediaStreamingOptions` setup. You can also specify the sample rate used for receiving or sending audio data with the `audioFormat` parameter. The currently supported formats are PCM 24K mono and PCM 16K mono; the default is PCM 16K mono.
28+
29+
To enable bidirectional audio streaming, where you also send audio data into the call, set the `enableBidirectional` flag to `true`. For more details, refer to the [API specifications](https://learn.microsoft.com/rest/api/communication/callautomation/answer-call/answer-call?view=rest-communication-callautomation-2024-06-15-preview&tabs=HTTP#mediastreamingoptions).
30+
31+
### Start streaming audio to your webserver at time of answering the call
32+
Enable automatic audio streaming when the call is established by setting the flag `startMediaStreaming: true`.
33+
34+
This setting ensures that audio streaming starts automatically as soon as the call is connected.
2735

2836
``` JS
29-
const mediaStreamingOptions: MediaStreamingOptions = {
30-
transportUrl: "<WEBSOCKET URL>",
31-
transportType: "websocket",
32-
contentType: "audio",
33-
audioChannelType: "unmixed",
34-
startMediaStreaming: false
35-
}
36-
const options: CreateCallOptions = {
37-
callIntelligenceOptions: { cognitiveServicesEndpoint: process.env.COGNITIVE_SERVICES_ENDPOINT },
38-
mediaStreamingOptions: mediaStreamingOptions
39-
};
37+
const mediaStreamingOptions: MediaStreamingOptions = {
38+
transportUrl: "wss://YOUR_WEBSOCKET_URL",
39+
transportType: "websocket",
40+
contentType: "audio",
41+
audioChannelType: "mixed",
42+
startMediaStreaming: true,
43+
enableBidirectional: true,
44+
audioFormat: "Pcm24KMono"
45+
}
46+
const answerCallOptions: AnswerCallOptions = {
47+
mediaStreamingOptions: mediaStreamingOptions
48+
};
49+
50+
// Answer the call; streaming starts as soon as the call connects.
51+
const answerCallResult = await acsClient.answerCall(incomingCallContext, callbackUri, answerCallOptions);
4052
```
4153

42-
## Start audio streaming
43-
How to start audio streaming:
54+
When Azure Communication Services receives the URL for your WebSocket server, it establishes a connection to it. Once the connection is successfully made, streaming is initiated.
55+
56+
### Start streaming audio to your webserver while a call is in progress
57+
To start media streaming while a call is in progress, set the `startMediaStreaming` parameter to `false` (the default) when you answer the call, and then call the `startMediaStreaming` API later in the call.
58+
4459
``` JS
45-
const streamingOptions: StartMediaStreamingOptions = {
46-
operationContext: "startMediaStreamingContext",
47-
operationCallbackUrl: process.env.CALLBACK_URI + "/api/callbacks"
48-
}
49-
await callMedia.startMediaStreaming(streamingOptions);
50-
```
51-
When Azure Communication Services receives the URL for your WebSocket server, it creates a connection to it. Once Azure Communication Services successfully connects to your WebSocket server and streaming is started, it will send through the first data packet, which contains metadata about the incoming media packets.
52-
53-
The metadata packet will look like this:
54-
```
55-
{
56-
"kind": <string> // What kind of data this is, e.g. AudioMetadata, AudioData.
57-
"audioMetadata": {
58-
"subscriptionId": <string>, // unique identifier for a subscription request
59-
"encoding":<string>, // PCM only supported
60-
"sampleRate": <int>, // 16000 default
61-
"channels": <int>, // 1 default
62-
"length": <int> // 640 default
63-
}
64-
}
60+
const mediaStreamingOptions: MediaStreamingOptions = {
61+
transportUrl: transportUrl,
62+
transportType: "websocket",
63+
contentType: "audio",
64+
audioChannelType: "unmixed",
65+
startMediaStreaming: false,
66+
enableBidirectional: true,
67+
audioFormat: "Pcm24KMono"
68+
}
69+
const answerCallOptions: AnswerCallOptions = {
70+
mediaStreamingOptions: mediaStreamingOptions
71+
};
72+
73+
answerCallResult = await acsClient.answerCall(
74+
incomingCallContext,
75+
callbackUri,
76+
answerCallOptions
77+
);
78+
79+
const startMediaStreamingOptions: StartMediaStreamingOptions = {
80+
operationContext: "startMediaStreaming"
81+
}
82+
83+
await answerCallResult.callConnection.getCallMedia().startMediaStreaming(startMediaStreamingOptions);
6584
```
6685

6786

6887
## Stop audio streaming
69-
How to stop audio streaming
88+
To stop receiving audio streams during a call, use the **Stop streaming API**, which lets you stop audio streaming at any point in the call. Audio streaming can be stopped in two ways:
89+
1. **Triggering the Stop streaming API:** Use the API to stop receiving audio streaming data while the call is still active.
90+
2. **Automatic stop on call disconnect:** Audio streaming automatically stops when the call is disconnected.
91+
7092
``` JS
71-
const stopMediaStreamingOptions: StopMediaStreamingOptions = {
72-
operationCallbackUrl: process.env.CALLBACK_URI + "/api/callbacks"
73-
}
74-
await callMedia.stopMediaStreaming(stopMediaStreamingOptions);
93+
const stopMediaStreamingOptions: StopMediaStreamingOptions = {
94+
operationContext: "stopMediaStreaming"
95+
}
96+
await answerCallResult.callConnection.getCallMedia().stopMediaStreaming(stopMediaStreamingOptions);
7597
```
7698

7799
## Handling audio streams in your websocket server
78-
The sample below demonstrates how to listen to audio streams using your websocket server.
100+
This sample demonstrates how to listen to audio streams using your websocket server.
101+
102+
``` JS
103+
wss.on('connection', async (ws: WebSocket) => {
104+
console.log('Client connected');
105+
await initWebsocket(ws);
106+
await startConversation();
107+
ws.on('message', async (packetData: ArrayBuffer) => {
108+
try {
109+
if (ws.readyState === WebSocket.OPEN) {
110+
await processWebsocketMessageAsync(packetData);
111+
} else {
112+
console.warn(`ReadyState: ${ws.readyState}`);
113+
}
114+
} catch (error) {
115+
console.error('Error processing WebSocket message:', error);
116+
}
117+
});
118+
ws.on('close', () => {
119+
console.log('Client disconnected');
120+
});
121+
});
122+
123+
async function processWebsocketMessageAsync(receivedBuffer: ArrayBuffer) {
124+
const result = StreamingData.parse(receivedBuffer);
125+
const kind = StreamingData.getStreamingKind();
126+
127+
// Get the streaming data kind
128+
if (kind === StreamingDataKind.AudioData) {
129+
const audioData = (result as AudioData);
130+
// process your audio data
131+
}
132+
}
133+
```
134+
135+
The first packet you receive contains metadata about the stream, including audio settings such as encoding, sample rate, and other configuration details.
136+
137+
``` json
138+
{
139+
"kind": "AudioMetadata",
140+
"audioMetadata": {
141+
"subscriptionId": "89e8cb59-b991-48b0-b154-1db84f16a077",
142+
"encoding": "PCM",
143+
"sampleRate": 16000,
144+
"channels": 1,
145+
"length": 640
146+
}
147+
}
148+
```
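
As a rough illustration of how the metadata fields relate to each other, the sketch below estimates how much audio each packet carries. It assumes that `length` is the payload size in bytes and that samples are 16-bit PCM; neither is stated explicitly here. `packetDurationMs` is a hypothetical helper, not part of the SDK.

```typescript
// Shape of the metadata packet shown above.
interface AudioMetadata {
  subscriptionId: string;
  encoding: string;
  sampleRate: number;
  channels: number;
  length: number;
}

interface AudioMetadataPacket {
  kind: string;
  audioMetadata: AudioMetadata;
}

// Assumption: PCM samples are 16 bits (2 bytes) wide.
const BYTES_PER_SAMPLE = 2;

// Hypothetical helper: milliseconds of audio carried by one packet,
// assuming `length` is the packet payload size in bytes.
function packetDurationMs(meta: AudioMetadata): number {
  const samplesPerChannel = meta.length / (BYTES_PER_SAMPLE * meta.channels);
  return (samplesPerChannel * 1000) / meta.sampleRate;
}

// Parse the first packet received on the WebSocket and inspect it.
const firstPacket: AudioMetadataPacket = JSON.parse(`{
  "kind": "AudioMetadata",
  "audioMetadata": {
    "subscriptionId": "89e8cb59-b991-48b0-b154-1db84f16a077",
    "encoding": "PCM",
    "sampleRate": 16000,
    "channels": 1,
    "length": 640
  }
}`);

const durationMs = packetDurationMs(firstPacket.audioMetadata);
```

Under these assumptions, the sample metadata above (640 bytes at 16,000 Hz, mono) works out to 20 ms of audio per packet.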
149+
150+
After sending the metadata packet, Azure Communication Services begins streaming audio media to your WebSocket server.
151+
152+
``` json
153+
{
154+
"kind": "AudioData",
155+
"audioData": {
156+
"timestamp": "2024-11-15T19:16:12.925Z",
157+
"participantRawID": "8:acs:3d20e1de-0f28-41c5…",
158+
"data": "5ADwAOMA6AD0A…",
159+
"silent": false
160+
}
161+
}
162+
```
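
If you handle the raw JSON yourself rather than through the SDK parser, a minimal sketch of turning an `AudioData` packet into PCM samples might look like the following. It assumes that `data` is base64-encoded and that samples are 16-bit little-endian PCM. The helper names (`decodeBase64`, `toPcm16Samples`, `extractSamples`) are hypothetical, and the base64 decoder is written by hand only to keep the example free of dependencies; in Node.js, `Buffer.from(data, "base64")` does the same job.

```typescript
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decode a base64 string into raw bytes.
function decodeBase64(text: string): Uint8Array {
  const clean = text.replace(/=+$/, "");
  const bytes = new Uint8Array(Math.floor((clean.length * 6) / 8));
  let buffer = 0; // bit accumulator
  let bits = 0;   // number of valid bits currently in the accumulator
  let out = 0;
  for (let i = 0; i < clean.length; i++) {
    const value = B64_ALPHABET.indexOf(clean.charAt(i));
    if (value < 0) throw new Error("Invalid base64 character: " + clean.charAt(i));
    buffer = ((buffer << 6) | value) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes[out++] = (buffer >> bits) & 0xff;
    }
  }
  return bytes;
}

// Assumption: samples are signed 16-bit integers in little-endian byte order.
function toPcm16Samples(bytes: Uint8Array): number[] {
  const samples: number[] = [];
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    const unsigned = bytes[i] | (bytes[i + 1] << 8);
    samples.push(unsigned >= 0x8000 ? unsigned - 0x10000 : unsigned);
  }
  return samples;
}

// Returns the decoded samples, or null for metadata and silent packets.
function extractSamples(message: string): number[] | null {
  const packet = JSON.parse(message);
  if (packet.kind !== "AudioData" || packet.audioData.silent) {
    return null;
  }
  return toPcm16Samples(decodeBase64(packet.audioData.data));
}
```

Skipping packets whose `silent` flag is `true` is a design choice for this sketch; a service that needs continuous timing might instead substitute zero-valued samples.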
163+
164+
## Sending audio streaming data to Azure Communication Services
165+
If bidirectional streaming is enabled using the `enableBidirectional` flag in the `MediaStreamingOptions`, you can stream audio data back to Azure Communication Services, which plays the audio into the call.
166+
167+
Once Azure Communication Services begins streaming audio to your WebSocket server, you can relay the audio to your AI services. After your AI service processes the audio content, you can stream the audio back to the ongoing call in Azure Communication Services.
168+
169+
The following example demonstrates how audio produced by another service, such as Azure OpenAI or another voice-based large language model, is transmitted back into the call.
170+
171+
``` JS
172+
async function receiveAudioForOutbound(data: string) {
173+
try {
174+
const jsonData = OutStreamingData.getStreamingDataForOutbound(data);
175+
if (ws.readyState === WebSocket.OPEN) {
176+
ws.send(jsonData);
177+
} else {
178+
console.log("socket connection is not open.");
179+
}
180+
} catch (e) {
181+
console.log(e);
182+
}
183+
}
184+
```
185+
186+
You can also control the playback of audio in the call when streaming back to Azure Communication Services, based on your logic or business flow. For example, when voice activity is detected and you want to discard the queued audio, you can send a stop message over the WebSocket to stop the audio from playing in the call.
79187

80188
``` JS
81-
import WebSocket from 'ws';
82-
import { streamingData } from '@azure/communication-call-automation/src/utli/streamingDataParser'
83-
const wss = new WebSocket.Server({ port: 8081 });
84-
85-
wss.on('connection', (ws: WebSocket) => {
86-
console.log('Client connected');
87-
ws.on('message', (packetData: ArrayBuffer) => {
88-
const decoder = new TextDecoder();
89-
const stringJson = decoder.decode(packetData);
90-
console.log("STRING JSON=>--" + stringJson)
91-
92-
//var response = streamingData(stringJson);
93-
94-
var response = streamingData(packetData);
95-
if ('locale' in response) {
96-
console.log("Transcription Metadata")
97-
console.log(response.callConnectionId);
98-
console.log(response.correlationId);
99-
console.log(response.locale);
100-
console.log(response.subscriptionId);
101-
}
102-
if ('text' in response) {
103-
console.log("Transcription Data")
104-
console.log(response.text);
105-
console.log(response.format);
106-
console.log(response.confidence);
107-
console.log(response.offset);
108-
console.log(response.duration);
109-
console.log(response.resultStatus);
110-
if ('phoneNumber' in response.participant) {
111-
console.log(response.participant.phoneNumber);
112-
}
113-
response.words.forEach(element => {
114-
console.log(element.text)
115-
console.log(element.duration)
116-
console.log(element.offset)
117-
});
118-
}
119-
});
120-
121-
ws.on('close', () => {
122-
console.log('Client disconnected');
123-
});
124-
});
125-
126-
// function processData(data: ArrayBuffer) {
127-
// const byteArray = new Uint8Array(data);
128-
// }
129-
130-
console.log('WebSocket server running on port 8081');
189+
async function stopAudio() {
190+
try {
191+
const jsonData = OutStreamingData.getStopAudioForOutbound();
192+
if (ws.readyState === WebSocket.OPEN) {
193+
ws.send(jsonData);
194+
} else {
195+
console.log("socket connection is not open.");
196+
}
197+
} catch (e) {
198+
console.log(e);
199+
}
200+
}
131201
```
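
One way to wire the stop message into a simple barge-in flow is sketched below. It treats any inbound packet whose `silent` flag is `false` as voice activity, which is a simplification, and it takes the stop action as a callback so that a function such as `stopAudio` above could be passed in. `createBargeInHandler` and its method names are hypothetical and not part of the SDK.

```typescript
// Minimal view of an inbound AudioData payload: only the flag this sketch needs.
interface InboundAudio {
  silent: boolean;
}

interface BargeInHandler {
  // Call after audio has been sent into the call for playback.
  onAudioSentToCall(): void;
  // Call for each inbound AudioData packet; returns true if a stop was sent.
  onInboundAudio(audio: InboundAudio): boolean;
}

// Hypothetical helper: sends a stop message when the caller starts
// speaking while previously sent audio may still be queued for playback.
function createBargeInHandler(sendStop: () => void): BargeInHandler {
  let playbackQueued = false;
  return {
    onAudioSentToCall(): void {
      playbackQueued = true;
    },
    onInboundAudio(audio: InboundAudio): boolean {
      // Assumption: a non-silent packet means the caller is speaking.
      if (!audio.silent && playbackQueued) {
        sendStop();
        playbackQueued = false;
        return true;
      }
      return false;
    },
  };
}
```

Tracking whether playback is queued avoids sending a stop message for every non-silent packet; only the first one after outbound audio triggers it.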
