Commit 83ef39a

Merge pull request #6429 from eric-urban/eur/speech-freshness-1
speech freshness pass for August - 1
2 parents e8a90b8 + 13df5ec commit 83ef39a

12 files changed, +175 -174 lines changed

articles/ai-services/speech-service/includes/how-to/custom-avatar/create-avatar/ai-foundry.md

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -3,7 +3,7 @@ author: eric-urban
33
ms.author: eur
44
ms.service: azure-ai-speech
55
ms.topic: include
6-
ms.date: 5/19/2025
6+
ms.date: 08/07/2025
77
---
88

99
Getting started with a custom text to speech avatar is a straightforward process. All it takes are a few video clips of your actor. If you'd like to train a [custom voice](../../../../custom-neural-voice.md) for the same actor, you can do so separately.
@@ -55,7 +55,7 @@ To add an avatar talent profile and upload their consent statement in your proje
5555
1. Select **Set up avatar talent** > **Upload consent video**.
5656

5757
1. On the **Upload consent video** page, follow the instructions to upload the avatar talent consent video you recorded beforehand.
58-
- Select the avatar type to build. Build a voice sync for avatar which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
58+
- Select the avatar type to build. Build a voice sync for avatar, which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
5959
- Select the speaking language of the verbal consent statement recorded by the avatar talent.
6060
- Enter the avatar talent name and your company name in the same language as the recorded statement.
6161
- The avatar talent name must be the name of the person who recorded the consent statement.
@@ -90,7 +90,7 @@ To upload training data, follow these steps:
9090

9191
Data files are automatically validated when you select **Upload**. Data validation includes series of checks on the video files to verify their file format, size, and total volume. If there are any errors, fix them and submit again.
9292

93-
After you upload the data, you can check the data overview which indicates whether you provided enough data to start training.
93+
After you upload the data, you can check the data overview, which indicates whether you provided enough data to start training.
9494

9595
## Step 4: Train your avatar model
9696

@@ -102,7 +102,7 @@ To create a custom avatar in the Azure AI Foundry portal, follow these steps for
102102
1. Select **Fine-tuning** from the left pane and then select **AI Service fine-tuning**.
103103
1. Select the custom avatar fine-tuning task (by model name) that you [started as described in the previous section](#step-1-start-fine-tuning).
104104
1. Select **Train model** > **+ Train model**.
105-
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and SSML input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
105+
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and speech synthesis markup language (SSML) input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
106106

107107
> [!IMPORTANT]
108108
> The avatar model name must be unique within the same Speech or AI Services resource.
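
The naming rule in the step above (only letters, numbers, hyphens, and underscores, with a unique name per model) can be checked before you submit the training form. The following is a minimal sketch, not part of the portal or any SDK. The function names are hypothetical, and it assumes "letters" means ASCII letters and that uniqueness is compared case-insensitively; the source states neither.

```python
import re

# Assumption: "letters" means ASCII letters. The source does not say whether
# non-ASCII letters are accepted or whether a length limit applies.
_MODEL_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if name uses only letters, numbers, hyphens, and underscores."""
    return _MODEL_NAME_PATTERN.fullmatch(name) is not None


def is_unique_model_name(name: str, existing_names: list[str]) -> bool:
    """Return True if name is not already used in the same resource.

    Assumption: names are compared case-insensitively (hypothetical rule).
    """
    taken = {existing.lower() for existing in existing_names}
    return name.lower() not in taken
```

For example, `is_valid_model_name("my-avatar_01")` passes, while a name that contains a space does not.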

articles/ai-services/speech-service/includes/how-to/custom-avatar/create-avatar/speech-studio.md

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -3,7 +3,7 @@ author: eric-urban
33
ms.author: eur
44
ms.service: azure-ai-speech
55
ms.topic: include
6-
ms.date: 5/19/2025
6+
ms.date: 08/07/2025
77
---
88

99
Getting started with a custom text to speech avatar is a straightforward process. All it takes are a few video clips of your actor. If you'd like to train a [custom voice](../../../../custom-neural-voice.md) for the same actor, you can do so separately.
@@ -54,7 +54,7 @@ To add an avatar talent profile and upload their consent statement in your proje
5454
1. Sign in to the [Speech Studio](https://speech.microsoft.com).
5555
1. Select **Custom avatar** > Your project name > **Set up avatar talent** > **Upload consent video**.
5656
1. On the **Upload consent video** page, follow the instructions to upload the avatar talent consent video you recorded beforehand.
57-
- Select the avatar type to build. Build a voice sync for avatar which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
57+
- Select the avatar type to build. Build a voice sync for avatar, which sounds like your avatar talent together with the avatar model, or build avatar without the voice sync for avatar. The option to build a voice sync for avatar is only available in the Southeast Asia, West Europe, and West US 2 regions.
5858
- Select the speaking language of the verbal consent statement recorded by the avatar talent.
5959
- Enter the avatar talent name and your company name in the same language as the recorded statement.
6060
- The avatar talent name must be the name of the person who recorded the consent statement.
@@ -90,7 +90,7 @@ To upload training data, follow these steps:
9090

9191
Data files are automatically validated when you select **Submit**. Data validation includes series of checks on the video files to verify their file format, size, and total volume. If there are any errors, fix them and submit again.
9292

93-
After you upload the data, you can check the data overview which indicates whether you provided enough data to start training. This screenshot shows an example of enough data added for training an avatar without other gestures.
93+
After you upload the data, you can check the data overview, which indicates whether you provided enough data to start training. This screenshot shows an example of enough data added for training an avatar without other gestures.
9494

9595
:::image type="content" source="../../../../media/custom-avatar/speech-studio/review-training-data.png" alt-text="Screenshot of enough data added for training an avatar without other gestures." lightbox="../../../../media/custom-avatar/speech-studio/review-training-data.png":::
9696

@@ -102,7 +102,7 @@ After you upload the data, you can check the data overview which indicates wheth
102102
To create a custom avatar in Speech Studio, follow these steps for one of the following methods:
103103
1. Sign in to the [Speech Studio](https://speech.microsoft.com).
104104
1. Select **Custom avatar** > Your project name > **Train model** > **Train model**.
105-
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and SSML input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
105+
1. Enter a Name to help you identify the model. Choose a name carefully. The model name is used as the avatar name in your synthesis request by the SDK and speech synthesis markup language (SSML) input. Only letters, numbers, hyphens, and underscores are allowed. Use a unique name for each model.
106106

107107
> [!IMPORTANT]
108108
> The avatar model name must be unique within the same Speech or AI Services resource.

articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -500,7 +500,7 @@ Personal voice is now generally available. With personal voice, you can get AI g
500500

501501
#### Text to speech avatar
502502

503-
- You can now set a static background image for your avatars. To utilize this feature, simply use the `avatarConfig.backgroundImage` property and specify a URL pointing to the desired image. For details, refer to [How to edit the background](../../text-to-speech-avatar/batch-synthesis-avatar-properties.md#how-to-edit-the-background).
503+
- You can now set a static background image for your avatars. To utilize this feature, simply use the `avatarConfig.backgroundImage` property and specify a URL pointing to the desired image. For details, refer to [batch synthesis avatar properties](../../text-to-speech-avatar/batch-synthesis-avatar-properties.md#edit-the-background).
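
As a rough illustration of the `avatarConfig.backgroundImage` property described in this release note, the following sketch builds a batch synthesis request body as a Python dictionary. Only `avatarConfig.backgroundImage` comes from the release note. The other field names (`inputKind`, `inputs`, `talkingAvatarCharacter`, `talkingAvatarStyle`), the function name, and the example URL are assumptions for illustration; check the batch synthesis avatar properties article for the authoritative schema.

```python
import json


def build_avatar_job(ssml: str, background_image_url: str) -> dict:
    """Build a request body that sets a static background image.

    Only avatarConfig.backgroundImage is taken from the release note.
    Every other key here is an assumed field name for illustration.
    """
    return {
        "inputKind": "SSML",               # assumed field name
        "inputs": [{"content": ssml}],     # assumed field name
        "avatarConfig": {
            "talkingAvatarCharacter": "lisa",          # assumed field name
            "talkingAvatarStyle": "graceful-sitting",  # assumed field name
            "backgroundImage": background_image_url,   # from the release note
        },
    }


# Hypothetical image URL used only to show the shape of the payload.
job = build_avatar_job("<speak></speak>", "https://example.com/background.png")
payload = json.dumps(job, indent=2)
```

The sketch stops at building the JSON payload and doesn't send a request, because the endpoint and authentication details aren't covered here.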
504504

505505
### March 2024 release
506506

articles/ai-services/speech-service/text-to-speech-avatar/avatar-gestures-with-ssml.md

Lines changed: 13 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -5,15 +5,15 @@ description: Learn how to edit text to speech avatar gestures with SSML.
55
manager: nitinme
66
ms.service: azure-ai-speech
77
ms.topic: how-to
8-
ms.date: 4/28/2025
8+
ms.date: 08/07/2025
99
ms.reviewer: eur
1010
ms.author: eur
1111
author: eric-urban
1212
---
1313

1414
# Customize text to speech avatar gestures with SSML
1515

16-
The [Speech Synthesis Markup Language (SSML)](../speech-synthesis-markup-structure.md) with input text determines the structure, content, and other characteristics of the text to speech output. Most SSML tags can also work in text to speech avatar. Furthermore, text to speech avatar batch mode provides avatar gestures insertion ability by using the SSML bookmark element with the format `<bookmark mark='gesture.*'/>`.
16+
The [Speech Synthesis Markup Language (SSML)](../speech-synthesis-markup-structure.md) with input text determines the structure, content, and other characteristics of the text to speech output. Most SSML tags also work in text to speech avatar. Furthermore, text to speech avatar batch mode provides avatar gesture insertion by using the SSML bookmark element with the format `<bookmark mark='gesture.*'/>`.
1717

1818
A gesture starts at the insertion point in time. If the gesture takes more time than the audio, the gesture is cut at the point in time when the audio is finished.
1919

@@ -29,29 +29,29 @@ Hello <bookmark mark='gesture.wave-left-1'/>, my name is Ava, nice to meet you!
2929
</speak>
3030
```
3131

32-
In this example, the avatar will start waving their hand at the left after the word "Hello".
32+
In this example, the avatar starts waving their hand at the left after the word "Hello".
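
The bookmark format `<bookmark mark='gesture.*'/>` can also be generated programmatically. The following is a minimal sketch that uses only the Python standard library. The helper names are hypothetical, and the output uses double quotes around the attribute value, which is equivalent XML to the single-quoted form shown in the article.

```python
from xml.sax.saxutils import escape, quoteattr


def gesture_bookmark(gesture: str) -> str:
    """Return an SSML bookmark element for the given gesture name."""
    return f"<bookmark mark={quoteattr('gesture.' + gesture)}/>"


def say_with_gesture(before: str, gesture: str, after: str) -> str:
    """Insert a gesture bookmark between two pieces of spoken text.

    The text is XML-escaped so characters such as & or < stay valid in SSML.
    """
    return f"{escape(before)} {gesture_bookmark(gesture)}{escape(after)}"


fragment = say_with_gesture(
    "Hello", "wave-left-1", ", my name is Ava, nice to meet you!"
)
```

The resulting fragment still needs to be wrapped in the `<speak>` and `<voice>` elements, as in the article's example.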
3333

34-
:::image type="content" source="./media/gesture.png" alt-text="Screenshot of displaying the standard avatar waving their hand at the left." lightbox="./media/gesture.png":::
34+
:::image type="content" source="./media/gesture.png" alt-text="Screenshot that shows the standard avatar waving their hand at the left." lightbox="./media/gesture.png":::
3535

3636
> [!NOTE]
37-
> Gesture feature is currently not supported when a voice sync for avatar is selected in a custom text to speech avatar.
37+
> Gesture feature isn't currently supported when a voice sync for avatar is selected in a custom text to speech avatar.
3838
3939
## Supported standard avatar characters, styles, and gestures
4040

4141
The full list of standard avatar supported gestures provided here can also be found in the text to speech avatar portal.
4242

43-
| Characters | Styles | Gestures |
44-
|------------|-------------------|-----------------------------|
43+
| Characters | Styles | Gestures |
44+
|------------|-------------------|------------------------------|
4545
| Harry | business | 123<br>calm-down<br>come-on<br>five-star-reviews<br>good<br>hello<br>introduce<br>invite<br>thanks<br>welcome |
4646
| Harry | casual | 123<br>come-on<br>five-star-reviews<br>gong-xi-fa-cai<br>good<br>happy-new-year<br>hello<br>please<br>welcome |
4747
| Harry | youthful | 123<br>come-on<br>down<br>five-star<br>good<br>hello<br>invite<br>show-right-up-down<br>welcome |
4848
| Jeff | business | 123<br>come-on<br>five-star-reviews<br>hands-up<br>here<br>meddle<br>please2<br>show<br>silence<br>thanks |
4949
| Jeff | formal | 123<br>come-on<br>five-star-reviews<br>lift<br>please<br>silence<br>thanks<br>very-good |
50-
| Lisa| casual-sitting | numeric1-left-1<br>numeric2-left-1<br>numeric3-left-1<br>thumbsup-left-1<br>show-front-1<br>show-front-2<br>show-front-3<br>show-front-4<br>show-front-5<br>think-twice-1<br>show-front-6<br>show-front-7<br>show-front-8<br>show-front-9 |
51-
| Lisa | graceful-sitting | wave-left-1<br>wave-left-2<br>thumbsup-left<br>show-left-1<br>show-left-2<br>show-left-3<br>show-left-4<br>show-left-5<br>show-right-1<br>show-right-2<br>show-right-3<br>show-right-4<br>show-right-5 |
52-
| Lisa | graceful-standing | |
53-
| Lisa | technical-sitting | wave-left-1<br>wave-left-2<br>show-left-1<br>show-left-2<br>point-left-1<br>point-left-2<br>point-left-3<br>point-left-4<br>point-left-5<br>point-left-6<br>show-right-1<br>show-right-2<br>show-right-3<br>point-right-1<br>point-right-2<br>point-right-3<br>point-right-4<br>point-right-5<br>point-right-6 |
54-
| Lisa | technical-standing |
50+
| Lisa | casual-sitting | numeric1-left-1<br>numeric2-left-1<br>numeric3-left-1<br>thumbsup-left-1<br>show-front-1<br>show-front-2<br>show-front-3<br>show-front-4<br>show-front-5<br>think-twice-1<br>show-front-6<br>show-front-7<br>show-front-8<br>show-front-9 |
51+
| Lisa | graceful-sitting | wave-left-1<br>wave-left-2<br>thumbsup-left<br>show-left-1<br>show-left-2<br>show-left-3<br>show-left-4<br>show-left-5<br>show-right-1<br>show-right-2<br>show-right-3<br>show-right-4<br>show-right-5 |
52+
| Lisa | graceful-standing | |
53+
| Lisa | technical-sitting | wave-left-1<br>wave-left-2<br>show-left-1<br>show-left-2<br>point-left-1<br>point-left-2<br>point-left-3<br>point-left-4<br>point-left-5<br>point-left-6<br>show-right-1<br>show-right-2<br>show-right-3<br>point-right-1<br>point-right-2<br>point-right-3<br>point-right-4<br>point-right-5<br>point-right-6 |
54+
| Lisa | technical-standing | |
5555
| Lori | casual | 123-left<br>a-little<br>beg<br>calm-down<br>come-on<br>five-star-reviews<br>good<br>hello<br>open<br>please<br>thanks |
5656
| Lori | graceful | 123-left<br>applaud<br>come-on<br>introduce<br>nod<br>please<br>show-left<br>show-right<br>thanks<br>welcome |
5757
| Lori | formal | 123<br>come-on<br>come-on-left<br>down<br>five-star<br>good<br>hands-triangle<br>hands-up<br>hi<br>hopeful<br>thanks |
@@ -62,11 +62,10 @@ The full list of standard avatar supported gestures provided here can also be fo
6262
| Meg | casual | a-little-bit<br>click-the-link<br>cross-hand<br>display-number<br>encourage-1<br>encourage-2<br>five-star-praise<br>front-left<br>front-right<br>good-1<br>good-2<br>handclap<br>introduction-to-products-1<br>introduction-to-products-2<br>introduction-to-products-3<br>left<br>length<br>lower-left<br>lower-right<br>number-one<br>press-both-hands-down<br>right<br>say-hi<br>shrug-ones-shoulders<br>slide-from-right-to-left<br>slide-to-the-left<br>spread-hands<br>the-front<br>top-middle-and-bottom-left<br>top-middle-and-bottom-right<br>upper-left<br>upper-right |
6363
| Meg | business | a-little-bit<br>encourage-1<br>encourage-2<br>five-star-praise<br>front-left<br>front-right<br>good-1<br>good-2<br>introduction-to-products-1<br>introduction-to-products-2<br>introduction-to-products-3<br>left<br>length<br>number-one<br>press-both-hands-down-1<br>press-both-hands-down-2<br>raise-ones-hand<br>right<br>say-hi<br>shrug-ones-shoulders<br>slide-from-left-to-right<br>slide-to-the-left<br>spread-hands<br>thanks<br>the-front<br>upper-left |
6464

65-
All styles except `lisa-graceful-sitting`, `lisa-graceful-standing`, `lisa-technical-sitting`, and `lisa-technical-standing` are supported via the real-time text to speech API. Gestures are only supported with the batch synthesis API and aren't supported via the real-time API.
65+
All styles except `lisa-graceful-sitting`, `lisa-graceful-standing`, `lisa-technical-sitting`, and `lisa-technical-standing` are supported via the real-time text to speech API. Gestures are only supported with the batch synthesis API and aren't supported via the real-time API.
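
The support rules in the preceding paragraph can be captured in a small lookup. This is a sketch with hypothetical helper names. It encodes only the four exceptions listed above and doesn't check whether a character and style combination exists in the table.

```python
# Styles that the article lists as not supported via the real-time API.
BATCH_ONLY_STYLES = frozenset({
    "lisa-graceful-sitting",
    "lisa-graceful-standing",
    "lisa-technical-sitting",
    "lisa-technical-standing",
})


def supports_real_time(character: str, style: str) -> bool:
    """Return True if the character and style pair works with the real-time API."""
    return f"{character.lower()}-{style.lower()}" not in BATCH_ONLY_STYLES


def supports_gestures(api: str) -> bool:
    """Gestures are supported only with the batch synthesis API.

    Assumption: the caller passes "batch" or "real-time" (hypothetical labels).
    """
    return api == "batch"
```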
6666

6767
## Next steps
6868

6969
* [What is text to speech avatar](what-is-text-to-speech-avatar.md)
7070
* [Real-time synthesis](./real-time-synthesis-avatar.md)
7171
* [Use batch synthesis for text to speech avatar](./batch-synthesis-avatar.md)
72-
