File: articles/cognitive-services/Speech-Service/record-custom-voice-samples.md
Lines changed: 11 additions & 3 deletions
@@ -103,13 +103,21 @@ Below are some general guidelines that you can follow to create a good corpus (r
103
103
104
104
With that, make sure your voice talent pronounces these words in the expected way. Keep your script and recordings consistently matched during the training process.
105
105
106
-
> [!NOTE]
107
-
> The scripts prepared for your voice talent need to follow the native reading conventions, such as 50% and $45, while the scripts used for training need to be normalized to make sure that the scripts match the audio content, such as *fifty percent* and *forty-five dollars*. Check the scripts used for training against the recordings of your voice talent, to make sure they match.
108
-
109
106
- Your script should include many different words and sentences with different kinds of sentence lengths, structures, and moods.
110
107
111
108
- Check the script carefully for errors. If possible, have someone else check it too. When you run through the script with your talent, you'll probably catch a few more mistakes.
112
109
110
+
### Difference between script for voice talent and script for training
111
+
112
+
The sample scripts we provide on [GitHub](https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/CustomVoice/script) are written for the voice talent only. If you upload the sample scripts for training, you must first normalize them to their spoken form. The scripts prepared for voice talent need to follow the native reading conventions, such as 50% and $45, while the scripts used for training need to be normalized so that they match the audio content, such as *fifty percent* and *forty-five dollars*. Make sure the scripts used for training match the recordings of your voice talent, especially scripts containing digits, symbols, abbreviations, dates, and times. The table below gives a few examples of text normalization rules and shows the difference between the script for voice talent and the script for training.
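As a rough illustration of what normalizing to the spoken form means, here is a minimal Python sketch that rewrites whole-number percentages and dollar amounts from 0 to 999. The names `number_to_words` and `normalize` are hypothetical helpers written for this example; they are not part of the Speech service or the sample repository, and real scripts need many more rules (dates, times, abbreviations, decimals). Always check the result against what the voice talent actually said.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0-999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def normalize(text: str) -> str:
    """Rewrite simple '$N' and 'N%' tokens in their spoken form."""
    def dollars(match: re.Match) -> str:
        n = int(match.group(1))
        unit = "dollar" if n == 1 else "dollars"
        return f"{number_to_words(n)} {unit}"

    # "$45" -> "forty-five dollars"; amounts with more digits or a
    # decimal part (such as "$45.50" or "$4500") are left untouched.
    text = re.sub(r"\$(\d{1,3})(?![.,]?\d)", dollars, text)
    # "50%" -> "fifty percent"; values such as "12.5%" are left untouched.
    text = re.sub(
        r"(?<![\d.,])(\d{1,3})%",
        lambda m: number_to_words(int(m.group(1))) + " percent",
        text,
    )
    return text


if __name__ == "__main__":
    print(normalize("Save 50% on orders over $45."))
```

Anything the sketch does not recognize is passed through unchanged, so unnormalized tokens remain visible for manual review.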
113
+
114
+
| Category | Script for voice talent<br>(non-normalized) | Script for training<br>(normalized) |