articles/ai-services/speech-service/includes/how-to/custom-avatar/create-avatar/ai-foundry.md
+4 -4 (4 additions, 4 deletions)
@@ -3,7 +3,7 @@ author: eric-urban
3
3
ms.author: eur
4
4
ms.service: azure-ai-speech
5
5
ms.topic: include
6
-
ms.date: 5/19/2025
6
+
ms.date: 08/07/2025
7
7
---
8
8
9
9
Getting started with a custom text to speech avatar is a straightforward process. All it takes are a few video clips of your actor. If you'd like to train a [custom voice](../../../../custom-neural-voice.md) for the same actor, you can do so separately.
@@ -55,7 +55,7 @@ To add an avatar talent profile and upload their consent statement in your proje
55
55
1. Select **Set up avatar talent** > **Upload consent video**.
56
56
57
57
1. On the **Upload consent video** page, follow the instructions to upload the avatar talent consent video you recorded beforehand.
58
-
- Select the avatar type to build. Build a voice sync for avatar which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
58
+
- Select the avatar type to build. Build a voice sync for avatar, which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
59
59
- Select the speaking language of the verbal consent statement recorded by the avatar talent.
60
60
- Enter the avatar talent name and your company name in the same language as the recorded statement.
61
61
- The avatar talent name must be the name of the person who recorded the consent statement.
@@ -90,7 +90,7 @@ To upload training data, follow these steps:
90
90
91
91
Data files are automatically validated when you select **Upload**. Data validation includes a series of checks on the video files to verify their file format, size, and total volume. If there are any errors, fix them and submit again.
92
92
93
-
After you upload the data, you can check the data overview which indicates whether you provided enough data to start training.
93
+
After you upload the data, you can check the data overview, which indicates whether you provided enough data to start training.
94
94
95
95
## Step 4: Train your avatar model
96
96
@@ -102,7 +102,7 @@ To create a custom avatar in the Azure AI Foundry portal, follow these steps for
102
102
1. Select **Fine-tuning** from the left pane and then select **AI Service fine-tuning**.
103
103
1. Select the custom avatar fine-tuning task (by model name) that you [started as described in the previous section](#step-1-start-fine-tuning).
104
104
1. Select **Train model** > **+ Train model**.
105
-
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and SSML input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
105
+
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and speech synthesis markup language (SSML) input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
106
106
107
107
> [!IMPORTANT]
108
108
> The avatar model name must be unique within the same Speech or AI Services resource.
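The naming rule above can be checked before you submit a training job. The following is a minimal sketch, not part of the portal or the SDK: the function names are hypothetical, and the uniqueness check assumes exact, case-sensitive comparison against names you already track yourself, which the source doesn't specify.

```python
import re

# Allowed characters per the step above: letters, numbers, hyphens, underscores.
# Assumption: "letters" means ASCII letters and the name must be non-empty.
_MODEL_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if the name uses only the allowed characters."""
    return _MODEL_NAME.fullmatch(name) is not None


def is_unique_model_name(name: str, existing_names: set[str]) -> bool:
    """Return True if the name isn't already used in the same resource.

    Assumption: exact, case-sensitive comparison against a caller-supplied set.
    """
    return name not in existing_names


if __name__ == "__main__":
    for candidate in ("my-avatar_01", "my avatar", "avatar#2"):
        print(candidate, is_valid_model_name(candidate))
```

Because the model name is later used as the avatar name in synthesis requests, rejecting a bad name early avoids retraining under a new name.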
articles/ai-services/speech-service/includes/how-to/custom-avatar/create-avatar/speech-studio.md
+4 -4 (4 additions, 4 deletions)
@@ -3,7 +3,7 @@ author: eric-urban
3
3
ms.author: eur
4
4
ms.service: azure-ai-speech
5
5
ms.topic: include
6
-
ms.date: 5/19/2025
6
+
ms.date: 08/07/2025
7
7
---
8
8
9
9
Getting started with a custom text to speech avatar is a straightforward process. All it takes are a few video clips of your actor. If you'd like to train a [custom voice](../../../../custom-neural-voice.md) for the same actor, you can do so separately.
@@ -54,7 +54,7 @@ To add an avatar talent profile and upload their consent statement in your proje
54
54
1. Sign in to the [Speech Studio](https://speech.microsoft.com).
55
55
1. Select **Custom avatar** > Your project name > **Set up avatar talent** > **Upload consent video**.
56
56
1. On the **Upload consent video** page, follow the instructions to upload the avatar talent consent video you recorded beforehand.
57
-
- Select the avatar type to build. Build a voice sync for avatar which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
57
+
- Select the avatar type to build. Build a voice sync for avatar, which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
58
58
- Select the speaking language of the verbal consent statement recorded by the avatar talent.
59
59
- Enter the avatar talent name and your company name in the same language as the recorded statement.
60
60
- The avatar talent name must be the name of the person who recorded the consent statement.
@@ -90,7 +90,7 @@ To upload training data, follow these steps:
90
90
91
91
Data files are automatically validated when you select **Submit**. Data validation includes a series of checks on the video files to verify their file format, size, and total volume. If there are any errors, fix them and submit again.
92
92
93
-
After you upload the data, you can check the data overview which indicates whether you provided enough data to start training. This screenshot shows an example of enough data added for training an avatar without other gestures.
93
+
After you upload the data, you can check the data overview, which indicates whether you provided enough data to start training. This screenshot shows an example of enough data added for training an avatar without other gestures.
94
94
95
95
:::image type="content" source="../../../../media/custom-avatar/speech-studio/review-training-data.png" alt-text="Screenshot of enough data added for training an avatar without other gestures." lightbox="../../../../media/custom-avatar/speech-studio/review-training-data.png":::
96
96
@@ -102,7 +102,7 @@ After you upload the data, you can check the data overview which indicates wheth
102
102
To create a custom avatar in Speech Studio, follow these steps for one of the following methods:
103
103
1. Sign in to the [Speech Studio](https://speech.microsoft.com).
104
104
1. Select **Custom avatar** > Your project name > **Train model** > **Train model**.
105
-
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and SSML input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
105
+
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and speech synthesis markup language (SSML) input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
106
106
107
107
> [!IMPORTANT]
108
108
> The avatar model name must be unique within the same Speech or AI Services resource.
articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md
+1 -1 (1 addition, 1 deletion)
@@ -500,7 +500,7 @@ Personal voice is now generally available. With personal voice, you can get AI g
500
500
501
501
#### Text to speech avatar
502
502
503
-
- You can now set a static background image for your avatars. To utilize this feature, simply use the `avatarConfig.backgroundImage` property and specify a URL pointing to the desired image. For details, refer to [How to edit the background](../../text-to-speech-avatar/batch-synthesis-avatar-properties.md#how-to-edit-the-background).
503
+
- You can now set a static background image for your avatars. To utilize this feature, simply use the `avatarConfig.backgroundImage` property and specify a URL pointing to the desired image. For details, refer to [batch synthesis avatar properties](../../text-to-speech-avatar/batch-synthesis-avatar-properties.md#edit-the-background).
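As an illustration of the release note above, the following sketch builds a request body that sets `avatarConfig.backgroundImage`. Only that property comes from the text. The other field names (`inputKind`, `inputs`, `talkingAvatarCharacter`, `talkingAvatarStyle`), the helper name, and the example URL are assumptions to verify against the batch synthesis avatar properties reference. The sketch doesn't send a request.

```python
import json


def build_avatar_payload(ssml: str, background_image_url: str) -> dict:
    """Build a batch avatar synthesis body with a static background image.

    Assumption: every field except avatarConfig.backgroundImage is
    illustrative and should be checked against the service reference.
    """
    return {
        "inputKind": "SSML",                   # assumed field name
        "inputs": [{"content": ssml}],         # assumed field name
        "avatarConfig": {
            "talkingAvatarCharacter": "lisa",              # assumed field name
            "talkingAvatarStyle": "graceful-sitting",      # assumed field name
            # Property named in the release note: a URL that points to the image.
            "backgroundImage": background_image_url,
        },
    }


if __name__ == "__main__":
    payload = build_avatar_payload(
        "<speak version='1.0' xml:lang='en-US'>Hello</speak>",
        "https://example.com/background.png",  # placeholder URL
    )
    print(json.dumps(payload, indent=2))
```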
articles/ai-services/speech-service/text-to-speech-avatar/avatar-gestures-with-ssml.md
+13 -14 (13 additions, 14 deletions)
@@ -5,15 +5,15 @@ description: Learn how to edit text to speech avatar gestures with SSML.
5
5
manager: nitinme
6
6
ms.service: azure-ai-speech
7
7
ms.topic: how-to
8
-
ms.date: 4/28/2025
8
+
ms.date: 08/07/2025
9
9
ms.reviewer: eur
10
10
ms.author: eur
11
11
author: eric-urban
12
12
---
13
13
14
14
# Customize text to speech avatar gestures with SSML
15
15
16
-
The [Speech Synthesis Markup Language (SSML)](../speech-synthesis-markup-structure.md) with input text determines the structure, content, and other characteristics of the text to speech output. Most SSML tags can also work in text to speech avatar. Furthermore, text to speech avatar batch mode provides avatar gestures insertion ability by using the SSML bookmark element with the format `<bookmark mark='gesture.*'/>`.
16
+
The [Speech Synthesis Markup Language (SSML)](../speech-synthesis-markup-structure.md) with input text determines the structure, content, and other characteristics of the text to speech output. Most SSML tags also work in text to speech avatar. Furthermore, text to speech avatar batch mode provides avatar gesture insertion by using the SSML bookmark element with the format `<bookmark mark='gesture.*'/>`.
17
17
18
18
A gesture starts at the insertion point in time. If the gesture takes more time than the audio, the gesture is cut at the point in time when the audio is finished.
19
19
@@ -29,29 +29,29 @@ Hello <bookmark mark='gesture.wave-left-1'/>, my name is Ava, nice to meet you!
29
29
</speak>
30
30
```
31
31
32
-
In this example, the avatar will start waving their hand at the left after the word "Hello".
32
+
In this example, the avatar starts waving their hand at the left after the word "Hello".
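If you generate SSML in code, the bookmark can be inserted programmatically. The following is a minimal sketch: the helper names are hypothetical, the voice name `en-US-AvaMultilingualNeural` is an assumption, and the `speak` attributes follow common SSML usage rather than anything shown in this hunk. Only the `<bookmark mark='gesture.*'/>` format comes from the article.

```python
import re
from xml.sax.saxutils import escape

# Assumption: gesture names contain only letters, digits, and hyphens,
# as in the gesture names listed in the table (for example, wave-left-1).
_GESTURE_NAME = re.compile(r"[A-Za-z0-9-]+")


def gesture_bookmark(gesture: str) -> str:
    """Return the SSML bookmark element that triggers a gesture."""
    if _GESTURE_NAME.fullmatch(gesture) is None:
        raise ValueError(f"unexpected gesture name: {gesture!r}")
    return f"<bookmark mark='gesture.{gesture}'/>"


def build_ssml(before: str, gesture: str, after: str,
               voice: str = "en-US-AvaMultilingualNeural") -> str:
    """Wrap text in SSML with a gesture inserted between two text spans.

    The gesture starts at the insertion point, so it begins after `before`.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">\n'
        f'<voice name="{voice}">\n'
        f"{escape(before)}{gesture_bookmark(gesture)}{escape(after)}\n"
        "</voice>\n"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Hello ", "wave-left-1",
                     ", my name is Ava, nice to meet you!"))
```

Escaping the text spans keeps characters such as `&` or `<` in the input from breaking the markup around the bookmark.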
33
33
34
-
:::image type="content" source="./media/gesture.png" alt-text="Screenshot of displaying the standard avatar waving their hand at the left." lightbox="./media/gesture.png":::
34
+
:::image type="content" source="./media/gesture.png" alt-text="Screenshot that shows the standard avatar waving their hand at the left." lightbox="./media/gesture.png":::
35
35
36
36
> [!NOTE]
37
-
> Gesture feature is currently not supported when a voice sync for avatar is selected in a custom text to speech avatar.
37
+
> Gesture feature isn't currently supported when a voice sync for avatar is selected in a custom text to speech avatar.
38
38
39
39
## Supported standard avatar characters, styles, and gestures
40
40
41
41
The full list of standard avatar supported gestures provided here can also be found in the text to speech avatar portal.
| Lisa | graceful-sitting | wave-left-1<br>wave-left-2<br>thumbsup-left<br>show-left-1<br>show-left-2<br>show-left-3<br>show-left-4<br>show-left-5<br>show-right-1<br>show-right-2<br>show-right-3<br>show-right-4<br>show-right-5 |
52
+
| Lisa | graceful-standing ||
53
+
| Lisa | technical-sitting | wave-left-1<br>wave-left-2<br>show-left-1<br>show-left-2<br>point-left-1<br>point-left-2<br>point-left-3<br>point-left-4<br>point-left-5<br>point-left-6<br>show-right-1<br>show-right-2<br>show-right-3<br>point-right-1<br>point-right-2<br>point-right-3<br>point-right-4<br>point-right-5<br>point-right-6 |
54
+
| Lisa | technical-standing ||
55
55
| Lori | casual | 123-left<br>a-little<br>beg<br>calm-down<br>come-on<br>five-star-reviews<br>good<br>hello<br>open<br>please<br>thanks |
56
56
| Lori | graceful | 123-left<br>applaud<br>come-on<br>introduce<br>nod<br>please<br>show-left<br>show-right<br>thanks<br>welcome |
57
57
| Lori | formal | 123<br>come-on<br>come-on-left<br>down<br>five-star<br>good<br>hands-triangle<br>hands-up<br>hi<br>hopeful<br>thanks |
@@ -62,11 +62,10 @@ The full list of standard avatar supported gestures provided here can also be fo
62
62
| Meg | casual | a-little-bit<br>click-the-link<br>cross-hand<br>display-number<br>encourage-1<br>encourage-2<br>five-star-praise<br>front-left<br>front-right<br>good-1<br>good-2<br>handclap<br>introduction-to-products-1<br>introduction-to-products-2<br>introduction-to-products-3<br>left<br>length<br>lower-left<br>lower-right<br>number-one<br>press-both-hands-down<br>right<br>say-hi<br>shrug-ones-shoulders<br>slide-from-right-to-left<br>slide-to-the-left<br>spread-hands<br>the-front<br>top-middle-and-bottom-left<br>top-middle-and-bottom-right<br>upper-left<br>upper-right |
63
63
| Meg | business | a-little-bit<br>encourage-1<br>encourage-2<br>five-star-praise<br>front-left<br>front-right<br>good-1<br>good-2<br>introduction-to-products-1<br>introduction-to-products-2<br>introduction-to-products-3<br>left<br>length<br>number-one<br>press-both-hands-down-1<br>press-both-hands-down-2<br>raise-ones-hand<br>right<br>say-hi<br>shrug-ones-shoulders<br>slide-from-left-to-right<br>slide-to-the-left<br>spread-hands<br>thanks<br>the-front<br>upper-left |
64
64
65
-
All styles except `lisa-graceful-sitting`, `lisa-graceful-standing`, `lisa-technical-sitting`, and `lisa-technical-standing` are supported via the real-time text to speech API. Gestures are only supported with the batch synthesis API and aren't supported via the real-time API.
65
+
All styles except `lisa-graceful-sitting`, `lisa-graceful-standing`, `lisa-technical-sitting`, and `lisa-technical-standing` are supported via the real-time text to speech API. Gestures are only supported with the batch synthesis API and aren't supported via the real-time API.
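The constraints above can be encoded as a pre-flight check. The following sketch is hypothetical: the function name and the `"batch"` and `"realtime"` mode labels are assumptions, and the gesture table is only a subset (the Lisa `graceful-sitting` and Lori `casual` rows) copied from the table in this article.

```python
# Styles that the paragraph above excludes from the real-time API.
BATCH_ONLY_STYLES = {
    "lisa-graceful-sitting",
    "lisa-graceful-standing",
    "lisa-technical-sitting",
    "lisa-technical-standing",
}

# Subset of the gesture table: (character, style) -> supported gestures.
GESTURES = {
    ("lisa", "graceful-sitting"): {
        "wave-left-1", "wave-left-2", "thumbsup-left",
        "show-left-1", "show-left-2", "show-left-3", "show-left-4",
        "show-left-5", "show-right-1", "show-right-2", "show-right-3",
        "show-right-4", "show-right-5",
    },
    ("lori", "casual"): {
        "123-left", "a-little", "beg", "calm-down", "come-on",
        "five-star-reviews", "good", "hello", "open", "please", "thanks",
    },
}


def find_problems(character, style, gesture=None, mode="batch"):
    """Return a list of human-readable problems with a planned request."""
    problems = []
    key = (character.lower(), style)
    if mode == "realtime":
        if f"{key[0]}-{style}" in BATCH_ONLY_STYLES:
            problems.append("style isn't available via the real-time API")
        if gesture is not None:
            problems.append("gestures are only supported with batch synthesis")
    # Pairs missing from this subset are not flagged; extend GESTURES as needed.
    if gesture is not None and key in GESTURES and gesture not in GESTURES[key]:
        problems.append(f"gesture {gesture!r} isn't listed for {key[0]}/{style}")
    return problems


if __name__ == "__main__":
    print(find_problems("Lisa", "graceful-sitting", "wave-left-1"))
    print(find_problems("Lisa", "graceful-sitting", "wave-left-1", mode="realtime"))
    print(find_problems("Lori", "casual", "nod"))
```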
66
66
67
67
## Next steps
68
68
69
69
* [What is text to speech avatar](what-is-text-to-speech-avatar.md)