Commit 0bc9f2d

Merge pull request #291148 from valindrae/public-preview-1126
Public preview 1126
2 parents b956875 + 5ee5ada commit 0bc9f2d

File tree

3 files changed

+269
-218
lines changed

articles/communication-services/how-tos/call-automation/audio-streaming-quickstart.md

Lines changed: 1 addition & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -29,30 +29,14 @@ Functionality described in this quickstart is currently in public preview.
2929
::: zone-end
3030

3131
::: zone pivot="programming-language-javascript"
32-
[!INCLUDE [Audio Streaming with JavaScript](./includes/audio-streaming-quickstart-js.md)]
32+
[!INCLUDE [Audio Streaming with javaScript](./includes/audio-streaming-quickstart-js.md)]
3333
::: zone-end
3434

3535
::: zone pivot="programming-language-python"
3636
[!INCLUDE [Audio Streaming with Python](./includes/audio-streaming-quickstart-python.md)]
3737
::: zone-end
3838

3939

40-
## Audio streaming schema
41-
After sending through the metadata packet, Azure Communication Services will start streaming audio media to your WebSocket server. Below is an example of what the media object your server will receive looks like.
42-
43-
``` code
44-
{
45-
"kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
46-
"audioData":{
47-
"data": <string>, // Base64 Encoded audio buffer data
48-
"timestamp": <string>, // In ISO 8601 format (yyyy-mm-ddThh:mm:ssZ)
49-
"participantRawID": <string>,
50-
"silent": <boolean> // Indicates if the received audio buffer contains only silence.
51-
}
52-
}
53-
```
54-
55-
5640
## Clean up resources
5741

5842
If you want to clean up and remove a Communication Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it. Learn more about [cleaning up resources](../../quickstarts/create-communication-resource.md#clean-up-resources).

articles/communication-services/how-tos/call-automation/includes/audio-streaming-quickstart-java.md

Lines changed: 104 additions & 107 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ services: azure-communication-services
55
author: Alvin
66
ms.service: azure-communication-services
77
ms.subservice: call-automation
8-
ms.date: 07/15/2024
8+
ms.date: 11/26/2024
99
ms.topic: include
1010
ms.topic: Include file
1111
ms.author: alvinhan
@@ -16,141 +16,138 @@ ms.author: alvinhan
1616
- Azure account with an active subscription, for details see [Create an account for free.](https://azure.microsoft.com/free/)
1717
- An Azure Communication Services resource. See [Create an Azure Communication Services resource](../../../quickstarts/create-communication-resource.md?tabs=windows&pivots=platform-azp).
1818
- A new web service application created using the [Call Automation SDK](../../../quickstarts/call-automation/callflows-for-customer-interactions.md).
19-
- [Java Development Kit](/java/azure/jdk/?preserve-view=true&view=azure-java-stable) version 8 or above.
19+
- [Java Development Kit](/java/azure/jdk/?preserve-view=true&view=azure-java-stable) version 17 or above.
2020
- [Apache Maven](https://maven.apache.org/download.cgi).
2121

2222
## Set up a websocket server
2323
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real-time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection.
24-
You can optionally use Azure services Azure WebApps that allows you to create an application to receive audio streams over a websocket connection. Follow this [quickstart](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/).
2524

26-
## Establish a call
27-
Establish a call and provide streaming details
25+
You can review documentation [here](https://azure.microsoft.com/blog/introduction-to-websockets-on-windows-azure-web-sites/) to learn more about WebSockets and how to use them.
26+
27+
## Receiving and sending audio streaming data
28+
There are multiple ways to start receiving an audio stream, which you configure with the `startMediaStreaming` flag in the `mediaStreamingOptions` setup. You can also specify the sample rate used for receiving or sending audio data with the `audioFormat` parameter. The currently supported formats are PCM 24K mono and PCM 16K mono; the default is PCM 16K mono.
29+
30+
To enable bidirectional audio streaming, where you're sending audio data into the call, you can enable the `EnableBidirectional` flag. For more details, refer to the [API specifications](https://learn.microsoft.com/rest/api/communication/callautomation/answer-call/answer-call?view=rest-communication-callautomation-2024-06-15-preview&tabs=HTTP#mediastreamingoptions).
31+
32+
### Start streaming audio to your webserver at time of answering the call
33+
Enable automatic audio streaming when the call is established by setting the flag `startMediaStreaming: true`.
34+
35+
This setting ensures that audio streaming starts automatically as soon as the call is connected.
2836

2937
``` Java
30-
CallInvite callInvite = new CallInvite(target, caller); 
31-
             
32-
            CallIntelligenceOptions callIntelligenceOptions = new CallIntelligenceOptions().setCognitiveServicesEndpoint(appConfig.getCognitiveServiceEndpoint()); 
33-
            MediaStreamingOptions mediaStreamingOptions = new MediaStreamingOptions(appConfig.getWebSocketUrl(), MediaStreamingTransport.WEBSOCKET, MediaStreamingContentType.AUDIO, MediaStreamingAudioChannel.UNMIXED); 
34-
            mediaStreamingOptions.setStartMediaStreaming(false); 
35-
         
36-
            CreateCallOptions createCallOptions = new CreateCallOptions(callInvite, appConfig.getCallBackUri()); 
37-
            createCallOptions.setCallIntelligenceOptions(callIntelligenceOptions); 
38-
            createCallOptions.setMediaStreamingOptions(mediaStreamingOptions); 
39-
40-
            Response<CreateCallResult> result = client.createCallWithResponse(createCallOptions, Context.NONE); 
41-
            return result.getValue().getCallConnectionProperties().getCallConnectionId(); 
38+
MediaStreamingOptions mediaStreamingOptions = new MediaStreamingOptions(appConfig.getTransportUrl(), MediaStreamingTransport.WEBSOCKET, MediaStreamingContent.AUDIO, MediaStreamingAudioChannel.MIXED, true).setEnableBidirectional(true).setAudioFormat(AudioFormat.PCM_24K_MONO);
39+
options = new AnswerCallOptions(data.getString(INCOMING_CALL_CONTEXT), callbackUri).setCallIntelligenceOptions(callIntelligenceOptions).setMediaStreamingOptions(mediaStreamingOptions);
40+
Response answerCallResponse = client.answerCallWithResponse(options, Context.NONE);
4241
```
42+
When Azure Communication Services receives the URL for your WebSocket server, it establishes a connection to it. Once the connection is successfully made, streaming is initiated.
43+
4344

44-
## Start audio streaming
45+
### Start streaming audio to your webserver while a call is in progress
46+
To start media streaming while the call is in progress, set the `startMediaStreaming` parameter to `false` (the default) when you answer the call, and then call the start API later in the call to enable media streaming.
4547

46-
How to start audio streaming:
4748
``` Java
48-
StartMediaStreamingOptions startOptions = new StartMediaStreamingOptions() 
49-
                                                        .setOperationContext("startMediaStreamingContext") 
50-
                                                        .setOperationCallbackUrl(appConfig.getBasecallbackuri()); 
51-
         client.getCallConnection(callConnectionId) 
52-
                     .getCallMedia() 
53-
                     .startMediaStreamingWithResponse(startOptions, Context.NONE);     
54-
```
55-
When Azure Communication Services receives the URL for your WebSocket server, it creates a connection to it. Once Azure Communication Services successfully connects to your WebSocket server and streaming is started, it will send through the first data packet, which contains metadata about the incoming media packets.
56-
57-
The metadata packet will look like this:
58-
``` Code
59-
{
60-
"kind": <string> // What kind of data this is, e.g. AudioMetadata, AudioData.
61-
"audioMetadata": {
62-
"subscriptionId": <string>, // unique identifier for a subscription request
63-
"encoding":<string>, // PCM only supported
64-
"sampleRate": <int>, // 16000 default
65-
"channels": <int>, // 1 default
66-
"length": <int> // 640 default
67-
}
68-
}
69-
```
49+
MediaStreamingOptions mediaStreamingOptions = new MediaStreamingOptions(appConfig.getTransportUrl(), MediaStreamingTransport.WEBSOCKET, MediaStreamingContent.AUDIO, MediaStreamingAudioChannel.MIXED, false)
50+
.setEnableBidirectional(true)
51+
.setAudioFormat(AudioFormat.PCM_24K_MONO);
52+
53+
options = new AnswerCallOptions(data.getString(INCOMING_CALL_CONTEXT), callbackUri)
54+
.setCallIntelligenceOptions(callIntelligenceOptions)
55+
.setMediaStreamingOptions(mediaStreamingOptions);
7056

57+
Response answerCallResponse = client.answerCallWithResponse(options, Context.NONE);
58+
59+
StartMediaStreamingOptions startMediaStreamingOptions = new StartMediaStreamingOptions()
60+
.setOperationContext("startMediaStreamingContext");
61+
62+
callConnection.getCallMedia().startMediaStreamingWithResponse(startMediaStreamingOptions, Context.NONE);    
63+
```
7164

7265
## Stop audio streaming
73-
How to stop audio streaming
66+
To stop receiving audio streams during a call, you can use the **Stop streaming API**. This allows you to stop the audio streaming at any point in the call. Audio streaming can be stopped in two ways:
67+
- **Triggering the Stop streaming API:** Use the API to stop receiving audio streaming data while the call is still active.
68+
- **Automatic stop on call disconnect:** Audio streaming automatically stops when the call is disconnected.
69+
7470
``` Java
75-
StopMediaStreamingOptions stopOptions = new StopMediaStreamingOptions() 
76-
                                                        .setOperationCallbackUrl(appConfig.getBasecallbackuri()); 
77-
         client.getCallConnection(callConnectionId) 
78-
                     .getCallMedia() 
79-
                     .stopMediaStreamingWithResponse(stopOptions, Context.NONE);
71+
StopMediaStreamingOptions stopMediaStreamingOptions = new StopMediaStreamingOptions()
72+
.setOperationContext("stopMediaStreamingContext");
73+
callConnection.getCallMedia().stopMediaStreamingWithResponse(stopMediaStreamingOptions, Context.NONE);
8074
```
8175

82-
## Handling media streams in your websocket server
83-
The sample below demonstrates how to listen to media stream using your websocket server. There will be two files that need to be run: App.java and WebSocketServer.java
76+
## Handling audio streams in your websocket server
77+
This sample demonstrates how to listen to audio streams using your websocket server.
78+
79+
``` Java
80+
@OnMessage
81+
public void onMessage(String message, Session session) {
82+
System.out.println("Received message: " + message);
83+
var parsedData = StreamingData.parse(message);
84+
if (parsedData instanceof AudioData) {
85+
var audioData = (AudioData) parsedData;
86+
sendAudioData(session, audioData.getData());
87+
}
88+
}
89+
```
8490

85-
``` App.java
86-
package com.example;
91+
The first packet you receive contains metadata about the stream, including audio settings such as encoding, sample rate, and other configuration details.
92+
93+
``` json
94+
{
95+
"kind": "AudioMetadata",
96+
"audioMetadata": {
97+
"subscriptionId": "89e8cb59-b991-48b0-b154-1db84f16a077",
98+
"encoding": "PCM",
99+
"sampleRate": 16000,
100+
"channels": 1,
101+
"length": 640
102+
}
103+
}
104+
```
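
As a rough check on the metadata values above, the following sketch estimates how much audio one packet carries. It is illustrative only: it assumes that `length` is the packet size in bytes and that PCM samples are 16-bit (2 bytes), neither of which the metadata packet states. The class and method names are hypothetical and are not part of the Call Automation SDK.

``` Java
// Sketch only: AudioFrameMath is a hypothetical helper, not an SDK class.
final class AudioFrameMath {
    private AudioFrameMath() {}

    // Returns the duration, in milliseconds, of one audio packet.
    // Assumption: frameBytes is the packet size in bytes and every sample
    // occupies bytesPerSample bytes per channel.
    static double frameDurationMillis(int frameBytes, int sampleRate, int channels, int bytesPerSample) {
        if (frameBytes < 0 || sampleRate <= 0 || channels <= 0 || bytesPerSample <= 0) {
            throw new IllegalArgumentException("frameBytes must be >= 0; other arguments must be > 0");
        }
        double samplesPerChannel = (double) frameBytes / (bytesPerSample * channels);
        return samplesPerChannel * 1000.0 / sampleRate;
    }

    public static void main(String[] args) {
        // Values from the AudioMetadata example: length 640, sampleRate 16000, channels 1.
        // Assumed: 2 bytes per sample (16-bit PCM).
        System.out.println(frameDurationMillis(640, 16000, 1, 2) + " ms"); // prints "20.0 ms"
    }
}
```

Under these assumptions, 640 bytes is 320 samples, which at 16,000 samples per second is 20 ms of audio per packet.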
87105

88-
import org.glassfish.tyrus.server.Server;
106+
After sending the metadata packet, Azure Communication Services (ACS) will begin streaming audio media to your WebSocket server.
107+
108+
``` json
109+
{
110+
"kind": "AudioData",
111+
"audioData": {
112+
"timestamp": "2024-11-15T19:16:12.925Z",
113+
"participantRawID": "8:acs:3d20e1de-0f28-41c5…",
114+
"data": "5ADwAOMA6AD0A…",
115+
"silent": false
116+
}
117+
}
118+
```
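
To illustrate how the `data` and `timestamp` fields of an `AudioData` packet might be consumed, here is a minimal sketch that uses only the Java standard library. It assumes 16-bit little-endian PCM samples, which the packet itself does not state, and it uses a made-up four-byte payload rather than the truncated value shown above. `AudioPayloadDecoder` is a hypothetical helper, not part of the Call Automation SDK.

``` Java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;
import java.util.Base64;

// Sketch only: AudioPayloadDecoder is a hypothetical helper, not an SDK class.
final class AudioPayloadDecoder {
    private AudioPayloadDecoder() {}

    // Decodes a Base64 audio buffer into PCM samples.
    // Assumption: samples are 16-bit, little-endian, single channel.
    static short[] decodeSamples(String base64Audio) {
        byte[] raw = Base64.getDecoder().decode(base64Audio);
        if (raw.length % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of bytes for 16-bit samples");
        }
        short[] samples = new short[raw.length / 2];
        ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samples);
        return samples;
    }

    public static void main(String[] args) {
        // Made-up payload: the samples 1 and -2 encoded as little-endian bytes.
        byte[] pcm = {0x01, 0x00, (byte) 0xFE, (byte) 0xFF};
        String data = Base64.getEncoder().encodeToString(pcm);

        short[] samples = decodeSamples(data);
        System.out.println(samples[0] + ", " + samples[1]); // prints "1, -2"

        // The timestamp is an ISO 8601 instant, so java.time can parse it directly.
        Instant timestamp = Instant.parse("2024-11-15T19:16:12.925Z");
        System.out.println(timestamp.toEpochMilli());
    }
}
```

In a real handler, the SDK's `AudioData` object already exposes the decoded fields, so a helper like this is mainly useful when you inspect the raw JSON yourself.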
89119

90-
import java.io.BufferedReader;
91-
import java.io.InputStreamReader;
120+
## Sending audio streaming data to Azure Communication Services
121+
If bidirectional streaming is enabled using the `EnableBidirectional` flag in the `MediaStreamingOptions`, you can stream audio data back to Azure Communication Services, which plays the audio into the call.
92122

93-
public class App {
94-
public static void main(String[] args) {
123+
Once Azure Communication Services begins streaming audio to your WebSocket server, you can relay the audio to your AI services. After your AI service processes the audio content, you can stream the audio back to the ongoing call in Azure Communication Services.
95124

96-
Server server = new Server("localhost", 8081, "/ws", null, WebSocketServer.class);
125+
The following example shows how audio produced by another service, such as Azure OpenAI or another voice-based large language model, is sent back into the call.
97126

127+
``` Java
128+
private void sendAudioData(Session session, byte[] binaryData) {
129+
System.out.println("Data buffer---> " + binaryData.getClass().getName());
130+
if (session.isOpen()) {
98131
try {
99-
server.start();
100-
System.out.println("Web socket running on port 8081...");
101-
System.out.println("wss://localhost:8081/ws/server");
102-
BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
103-
reader.readLine();
104-
} catch (Exception e) {
132+
var serializedData = OutStreamingData.getStreamingDataForOutbound(binaryData);
133+
session.getAsyncRemote().sendText(serializedData);
134+
} catch (IOException e) {
105135
e.printStackTrace();
106-
} finally {
107-
server.stop();
108136
}
109137
}
110138
}
111139
```
112-
``` WebSocketServer.java
113-
package com.example;
114-
115-
import javax.websocket.OnMessage;
116-
import javax.websocket.Session;
117-
import javax.websocket.server.ServerEndpoint;
118-
119-
import com.azure.communication.callautomation.models.streaming.StreamingData;
120-
import com.azure.communication.callautomation.models.streaming.StreamingDataParser;
121-
import com.azure.communication.callautomation.models.streaming.media.AudioData;
122-
import com.azure.communication.callautomation.models.streaming.media.AudioMetadata;
123-
124-
@ServerEndpoint("/server")
125-
public class WebSocketServer {
126-
@OnMessage
127-
public void onMessage(String message, Session session) {
128-
129-
// System.out.println("Received message: " + message);
130-
131-
StreamingData data = StreamingDataParser.parse(message);
132-
133-
if (data instanceof AudioMetadata) {
134-
AudioMetadata audioMetaData = (AudioMetadata) data;
135-
System.out.println("----------------------------------------------------------------");
136-
System.out.println("SUBSCRIPTION ID:-->" + audioMetaData.getMediaSubscriptionId());
137-
System.out.println("ENCODING:-->" + audioMetaData.getEncoding());
138-
System.out.println("SAMPLE RATE:-->" + audioMetaData.getSampleRate());
139-
System.out.println("CHANNELS:-->" + audioMetaData.getChannels());
140-
System.out.println("LENGTH:-->" + audioMetaData.getLength());
141-
System.out.println("----------------------------------------------------------------");
142-
}
143-
if (data instanceof AudioData) {
144-
System.out.println("----------------------------------------------------------------");
145-
AudioData audioData = (AudioData) data;
146-
System.out.println("DATA:-->" + audioData.getData());
147-
System.out.println("TIMESTAMP:-->" + audioData.getTimestamp());
148-
// System.out.println("PARTICIPANT:-->" + audioData.getParticipant().getRawId()
149-
// != null
150-
// ? audioData.getParticipant().getRawId()
151-
// : "");
152-
System.out.println("IS SILENT:-->" + audioData.isSilent());
153-
System.out.println("----------------------------------------------------------------");
140+
141+
You can also control audio playback in the call when streaming back to Azure Communication Services, based on your logic or business flow. For example, when voice activity is detected and you want to discard the queued audio, you can send a stop message over the WebSocket to stop the audio from playing in the call.
142+
143+
``` Java
144+
private void stopAudio(Session session) {
145+
if (session.isOpen()) {
146+
try {
147+
var serializedData = OutStreamingData.getStopAudioForOutbound();
148+
session.getAsyncRemote().sendText(serializedData);
149+
} catch (IOException e) {
150+
e.printStackTrace();
154151
}
155152
}
156153
}
