### 2024-November release
#### Azure AI Speech Toolkit extension for Visual Studio Code
The Azure AI Speech Toolkit extension is now available for Visual Studio Code users. It contains a list of speech quickstarts and scenario samples that you can build and run with a few clicks. For more information, see [Azure AI Speech Toolkit in Visual Studio Code Marketplace](https://aka.ms/speech-toolkit-vscode).
#### Text to speech avatar code samples
We added text to speech avatar code samples for [Android](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/java/android/avatar) and [iOS](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/swift/ios/avatar). These samples demonstrate how to use real-time text to speech avatars in your mobile applications.
The text to speech feature in the Speech service supports a broad portfolio of [languages and voices](../language-support.md?tabs=tts). You can get the full list or try them in the [Voice Gallery](https://speech.microsoft.com/portal/voicegallery).
To match your input text and use the specified voice, you can set the `SpeechSynthesisLanguage` or `SpeechSynthesisVoiceName` property in the `SpeechConfig` object. The following code snippet shows how this technique works.
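Here's a minimal sketch using the JavaScript Speech SDK; the subscription key, region, and voice name are placeholder values:

```javascript
import * as SpeechSDK from "microsoft-cognitiveservices-speech-sdk";

// Placeholder credentials: replace with your own Speech resource key and region.
const speechConfig = SpeechSDK.SpeechConfig.fromSubscription("YourSpeechKey", "YourSpeechRegion");

// Set the synthesis language, or select a specific voice directly.
speechConfig.speechSynthesisLanguage = "en-US";
speechConfig.speechSynthesisVoiceName = "en-US-AvaMultilingualNeural";
```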
The following code snippet shows how to create the WebRTC peer connection. The ICE server URL, ICE server username, and ICE server credential can all be fetched from the payload of the previous HTTP request.
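For example, with the browser's standard `RTCPeerConnection` API (the `iceServerUrl`, `iceServerUsername`, and `iceServerCredential` variable names are assumptions for values parsed from that payload):

```javascript
// Placeholder variables: parse these values from the ICE token response payload.
const peerConnection = new RTCPeerConnection({
    iceServers: [{
        urls: [iceServerUrl],
        username: iceServerUsername,
        credential: iceServerCredential
    }]
});
```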
Our real-time API disconnects after the avatar has been idle for 5 minutes. Even if the avatar isn't idle and is functioning normally, the real-time API disconnects after a 10-minute connection. To ensure continuous operation of the real-time avatar for more than 10 minutes, you can enable automatic reconnection. For information about how to set up automatic reconnection, see this [JavaScript sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/browser/avatar/README.md) (search "auto reconnect").
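The linked sample is the authoritative reference. As a rough sketch of the idea, you can watch the WebRTC connection state and re-create the session when it drops; the `startAvatarSession` helper here is hypothetical:

```javascript
// Hypothetical helper that re-runs the session setup:
// fetch ICE credentials, create the peer connection, and start the avatar.
async function startAvatarSession() { /* ... */ }

peerConnection.onconnectionstatechange = () => {
    const state = peerConnection.connectionState;
    if (state === "disconnected" || state === "failed") {
        // Re-establish the avatar session when the connection drops.
        startAvatarSession();
    }
};
```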
## Synthesize talking avatar video from text input
After the previous steps, you should see the avatar video playing in the web browser. The avatar is active, with eye blinks and slight body movement, but it isn't speaking yet. The avatar is waiting for text input to start speaking.
The following code snippet shows how to send text to the avatar synthesizer and let the avatar speak.
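This minimal sketch assumes `avatarSynthesizer` is the avatar synthesizer created in the earlier steps and that the `SpeechSDK` namespace is in scope; the spoken text is a placeholder:

```javascript
const spokenText = "Hello, I'm a real-time text to speech avatar.";

// speakTextAsync returns a Promise that resolves when synthesis finishes.
avatarSynthesizer.speakTextAsync(spokenText).then((result) => {
    if (result.reason === SpeechSDK.ResultReason.SynthesizingAudioCompleted) {
        console.log("Avatar finished speaking.");
    } else {
        console.log("Unable to speak. Result reason: " + result.reason);
    }
}).catch((error) => {
    console.log(error);
});
```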
You can find end-to-end working samples on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/js/browser/avatar).
## Close the real-time avatar connection
To avoid unnecessary costs after you finish using the real-time avatar, it’s important to close the connection. There are several ways to close the connection:
- When the browser web page is closed, the WebRTC client-side peer connection object is released, and the avatar connection is automatically closed after a few seconds.
- The connection is automatically closed if the avatar remains idle for 5 minutes.
- You can proactively close the avatar connection by running the following code:
```javascript
avatarSynthesizer.close()
```
## Edit background
The avatar real-time synthesis API currently doesn't support setting a background image or video. It only supports setting a solid-color background; transparent backgrounds aren't supported. However, there's an alternative way to implement background customization on the client side, following these guidelines:
- Set the avatar background color to green, for the convenience of matting.
- Capture each frame of the avatar video and apply a pixel-by-pixel calculation to set the green pixels to transparent, and draw the recalculated frame to the canvas.
- Hide the original video.
With this approach, you can get an animated canvas that plays like a video with a transparent background. Here's the [JavaScript sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/browser/avatar/js/basic.js#L108) that demonstrates this approach.
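The gist of that sample, as a condensed sketch (the element IDs and the green threshold here are illustrative, not part of the API):

```javascript
const video = document.getElementById("avatarVideo");   // assumed <video> element ID
const canvas = document.getElementById("avatarCanvas"); // assumed <canvas> element ID
const context = canvas.getContext("2d", { willReadFrequently: true });

function renderFrame() {
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
    const frame = context.getImageData(0, 0, canvas.width, canvas.height);
    const data = frame.data; // RGBA bytes, 4 per pixel
    for (let i = 0; i < data.length; i += 4) {
        const r = data[i], g = data[i + 1], b = data[i + 2];
        // Illustrative threshold: treat strongly green pixels as background.
        if (g > 150 && r < 100 && b < 100) {
            data[i + 3] = 0; // alpha = 0 makes the pixel transparent
        }
    }
    context.putImageData(frame, 0, 0);
    requestAnimationFrame(renderFrame);
}

video.addEventListener("play", () => requestAnimationFrame(renderFrame));
```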
After you have a transparent-background avatar, you can set the background to any image or video by placing the image or video behind the canvas.
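For example, a simple layering sketch with absolute positioning (the element IDs are assumptions):

```javascript
// Stack the background image behind the transparent avatar canvas.
const background = document.getElementById("backgroundImage"); // assumed <img> element
const avatarCanvas = document.getElementById("avatarCanvas");  // the chroma-keyed canvas

background.style.position = "absolute";
avatarCanvas.style.position = "absolute";
background.style.zIndex = "1";   // behind
avatarCanvas.style.zIndex = "2"; // in front
```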
## Code samples
You can find text to speech avatar code samples in the [Speech SDK repository on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk). The samples demonstrate how to use real-time text to speech avatars in your web applications.