Commit 1985e5f

Merge pull request #4084 from yinhew/patch-6
Update real-time-synthesis-avatar.md
2 parents: e88fce3 + 4ad5319

File tree: 1 file changed (+58, -6 lines)


articles/ai-services/speech-service/text-to-speech-avatar/real-time-synthesis-avatar.md

Lines changed: 58 additions & 6 deletions
@@ -192,24 +192,76 @@ To avoid unnecessary costs after you finish using the real-time avatar, it’s i

## Edit background

### Set background color

You can set the background color of the avatar video through the `backgroundColor` property of the `AvatarConfig` object. The following code snippet shows how to set the background color:

```JavaScript
const avatarConfig = new SpeechSDK.AvatarConfig(
    "lisa", // Set avatar character here.
    "casual-sitting", // Set avatar style here.
)
avatarConfig.backgroundColor = '#00FF00FF' // Set background color to green.
```

> [!NOTE]
> The color string should be in the format `#RRGGBBAA`. The alpha channel (the `AA` part) is always ignored, because a transparent background isn't supported for the real-time avatar.

### Set background image

You can set the background image of the avatar video through the `backgroundImage` property of the `AvatarConfig` object. Upload the image to a publicly accessible URL, and then assign that URL to the `backgroundImage` property. The following code snippet shows how to set the background image:

```JavaScript
const avatarConfig = new SpeechSDK.AvatarConfig(
    "lisa", // Set avatar character here.
    "casual-sitting", // Set avatar style here.
)
avatarConfig.backgroundImage = "https://www.example.com/1920-1080-image.jpg" // A publicly accessible URL of the image.
```

### Set background video

The avatar real-time synthesis API currently doesn't support setting a background video directly. However, you can implement background customization on the client side by following these guidelines:

- Set the background color to green (for ease of matting), which the avatar real-time synthesis API supports.
- Create a canvas element with the same size as the avatar video.
- Capture each frame of the avatar video, apply a pixel-by-pixel calculation to set the green pixels to transparent, and draw the recalculated frame to the canvas.
- Hide the original video.

With this approach, you get an animated canvas that plays like a video with a transparent background. Here's the [JavaScript sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/browser/avatar/js/basic.js#L142) that demonstrates this approach.
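
If you want a quick illustration of the matting step before reading the full sample, here's a minimal sketch. It assumes the avatar stream is already attached to a `<video>` element and that a same-sized `<canvas>` element exists; the element IDs and the green threshold are illustrative assumptions, not part of the Speech SDK, and a production implementation typically uses a more careful green-detection rule than this simple threshold.

```JavaScript
// Minimal client-side matting sketch (element IDs and threshold are illustrative).
const video = document.getElementById('avatarVideo') // The <video> element that plays the avatar stream.
const canvas = document.getElementById('avatarCanvas') // A <canvas> with the same size as the video.
const context = canvas.getContext('2d', { willReadFrequently: true })

function renderFrame() {
    // Draw the current video frame, then make the green background pixels transparent.
    context.drawImage(video, 0, 0, canvas.width, canvas.height)
    const frame = context.getImageData(0, 0, canvas.width, canvas.height)
    const data = frame.data
    for (let i = 0; i < data.length; i += 4) {
        const r = data[i], g = data[i + 1], b = data[i + 2]
        if (g > 150 && r < 100 && b < 100) {
            data[i + 3] = 0 // Set the alpha channel to fully transparent.
        }
    }
    context.putImageData(frame, 0, 0)
    window.requestAnimationFrame(renderFrame)
}

video.addEventListener('play', () => window.requestAnimationFrame(renderFrame))
video.style.display = 'none' // Hide the original video; only the canvas is shown.
```
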
After you have a transparent-background avatar, you can set the background to any dynamic content (like a video) by placing the dynamic content behind the canvas.
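
For example, to place a looping background video behind the transparent avatar canvas, you can layer the elements with CSS positioning along these lines. This is only a sketch; the container ID, canvas ID, and background URL are illustrative assumptions.

```JavaScript
// Layering sketch: put a background <video> behind the transparent avatar canvas.
// Assumes a container element that holds the avatar canvas; all IDs here are illustrative.
const container = document.getElementById('avatarContainer')
container.style.position = 'relative'

const backgroundVideo = document.createElement('video')
backgroundVideo.src = 'https://www.example.com/background.mp4' // Any dynamic background content.
backgroundVideo.autoplay = true
backgroundVideo.loop = true
backgroundVideo.muted = true
backgroundVideo.style.position = 'absolute'
backgroundVideo.style.top = '0'
backgroundVideo.style.left = '0'
backgroundVideo.style.width = '100%'
backgroundVideo.style.height = '100%'
backgroundVideo.style.objectFit = 'cover'
container.appendChild(backgroundVideo)

const avatarCanvas = document.getElementById('avatarCanvas')
avatarCanvas.style.position = 'relative' // Participate in stacking above the absolutely positioned video.
avatarCanvas.style.zIndex = '1' // Keep the transparent avatar canvas on top of the background.
```
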
## Crop video
The avatar video is in a 16:9 aspect ratio by default. If you want a different aspect ratio, you can crop the video to a rectangular subarea of the original video. You specify the rectangle by giving the coordinates of its top-left vertex and bottom-right vertex. The following code snippet shows how to crop the video:
```JavaScript
const videoFormat = new SpeechSDK.AvatarVideoFormat()
const topLeftCoordinate = new SpeechSDK.Coordinate(640, 0) // coordinate of the top-left vertex, with X=640, Y=0
const bottomRightCoordinate = new SpeechSDK.Coordinate(1320, 1080) // coordinate of the bottom-right vertex, with X=1320, Y=1080
videoFormat.setCropRange(topLeftCoordinate, bottomRightCoordinate)
const avatarConfig = new SpeechSDK.AvatarConfig(
    "lisa", // Set avatar character here.
    "casual-sitting", // Set avatar style here.
    videoFormat, // Set video format here.
)
```

For a full sample with more context, go to our [sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/browser/avatar/js/basic.js) and search for `crop`.

## Code samples

You can find text to speech avatar code samples in the Speech SDK repository on GitHub. The samples demonstrate how to use real-time text to speech avatars in your web applications.

- Server + client
  - [Python (server) + JavaScript (client)](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/python/web/avatar)
  - [C# (server) + JavaScript (client)](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/csharp/web/avatar)
- Client only
  - [JavaScript](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/js/browser/avatar)
  - [Android](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/java/android/avatar)
  - [iOS](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/swift/ios/avatar)

The Android and iOS samples demonstrate how to use real-time text to speech avatars in your mobile applications.
