Commit 941b9e5

Merge pull request #1838 from MicrosoftDocs/main
Merge main to live, 4 AM
2 parents b9cb3f8 + 7247c86 commit 941b9e5

16 files changed: 375 additions, 58 deletions

articles/ai-services/speech-service/text-to-speech-avatar/custom-avatar-create.md

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,8 @@ You must provide a video file with a recorded statement from your avatar talent,
 
 You can find the verbal consent statement in multiple languages on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/customavatar/verbal-statement-all-locales.txt). The language of the verbal statement must be the same as your recording. See also the disclosure for voice talent.
 
+For more information about recording the consent video, see [How to record video samples](custom-avatar-record-video-samples.md).
+
 ## Prepare training data for custom text to speech avatar
 
 You're required to provide video recordings of the avatar talent speaking in a language of your choice. The video recordings should contain high signal-to-noise ratio voice. The voice in the video recording isn't used as training data for a custom neural voice; its purpose is to train the custom text to speech avatar model.

articles/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples.md

Lines changed: 8 additions & 1 deletion
@@ -60,7 +60,14 @@ The custom text to speech avatar doesn't support customization of clothes or loo
 
 ## What video clips to record
 
-You need three types of basic video clips:
+You need four types of basic video clips:
+
+**Consent Video:**
+- The consent video must represent the same avatar talent speaking, following the requirement of the consent statement. Make sure the statement is correctly recorded, and each word is clearly spoken. [Get consent file from the avatar talent](custom-avatar-create.md#get-consent-file-from-the-avatar-talent). You can select any one of the languages supported.
+- The avatar talent should always face the front of the camera, without large movements.
+- The video should be taken in a quiet environment, and the voice should be recorded at a reasonable volume. Try to keep the signal-to-noise ratio higher than 20. For voice recording guidance, see the [Recording custom voice samples](../record-custom-voice-samples.md#recording-your-script) guide.
+- Ensure that the head part will not be occluded in each frame of the video.
+- Make sure no other objects appear in the camera, including filming equipment, mobile phone, etc.
 
 **Status 0 speaking:**
 - Status 0 represents the posture you can naturally maintain most of the time while speaking. For example, arms crossed in front of the body or hanging down naturally at the sides.
Binary files (contents not shown; file names not captured), reported sizes: 1.85 KB, 69.2 KB, 75.6 KB, -107 KB, -73.2 KB, 80.5 KB
