articles/ai-services/speech-service/record-custom-voice-samples.md
6 additions & 5 deletions
@@ -13,7 +13,7 @@ ms.author: eur

 # Recording voice samples for custom neural voice

-This article provides you with instructions on preparing high-quality voice samples for creating a professional voice model using the custom neural voice Pro project.
+This article provides you with instructions on preparing high-quality voice samples for creating a professional voice model using the custom neural voice Pro project. To understand how the data is processed and the minimum requirements for data acceptance, please refer to [upload your data](professional-voice-create-training-set.md#upload-your-data).

 Creating a high-quality production custom neural voice from scratch isn't a casual undertaking. The central component of a custom neural voice is a large collection of audio samples of human speech. It's vital that these audio recordings be of high quality. Choose a voice talent who has experience making these kinds of recordings, and have them recorded by a recording engineer using professional equipment.

@@ -74,9 +74,10 @@ We provide [sample scripts in the 'General', 'Chat' and 'Customer Service' domai

 Below are some general guidelines that you can follow to create a good corpus (recorded audio samples) for custom neural voice training.

-- Balance your script to cover different sentence types in your domain including statements, questions, exclamations, long sentences, and short sentences.
-
-Each sentence is recommended to be between 2 and 15 seconds long, with 5 to 30 words for Latin-based languages or 4 to 80 words for non-Latin languages. Ensure your script does not include any duplicate sentences.<br>
+- For most use cases, sentences are recommended to be between 2 and 15 seconds long, containing 5 to 30 words for Latin-based languages or 4 to 80 words for non-Latin languages. Aim to balance your script to include a variety of sentence types and lengths. Ensure your script does not include any duplicate sentences.<br>
+
+If your use case requires a high emphasis on questions, exclamations, or a mix of particularly long and short sentences, it is recommended to include a good portion of sentences as questions or exclamations, along with very short phrases and longer phrases up to 20 seconds in length.
+
 For how to balance the different sentence types, refer to the following table:

 | Sentence types | Coverage |
@@ -89,7 +90,7 @@ Below are some general guidelines that you can follow to create a good corpus (r
 > [!NOTE]
 > Short words/phrases should be separated with commas. The commas help remind your voice talent to pause briefly when reading them.
 >
-> You can estimate the number of words in a sentence by assuming a speech rate in words per second based on your language. And the range can be extended to 1 to 100 words to better accommodate short or long sentence scenarios.
+> You can estimate the number of words in a sentence by assuming a speech rate in words per second based on your language.
 >
 > Questions and exclamations are required if you want the generated voice to accurately convey questions or exclamations.
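
The length and duplicate guidance in the changed lines could be pre-checked on a script before recording. The following is only a rough sketch and not part of any Speech service tooling: the names `check_script` and `estimate_seconds`, the whitespace-based word count, and the 2.5 words-per-second rate are all assumptions made for illustration. The default bounds of 5 and 30 words come from the recommendation for Latin-based languages.

```python
from collections import Counter

# Assumed defaults, taken from the guidance for Latin-based languages.
MIN_WORDS = 5
MAX_WORDS = 30


def check_script(sentences, min_words=MIN_WORDS, max_words=MAX_WORDS):
    """Return (index, sentence, problem) tuples for sentences that break the guidance."""
    problems = []
    # Normalize case and whitespace so trivially different duplicates are caught.
    normalized = [" ".join(s.lower().split()) for s in sentences]
    counts = Counter(normalized)
    for i, (raw, norm) in enumerate(zip(sentences, normalized)):
        n_words = len(norm.split())  # naive whitespace word count (assumption)
        if n_words < min_words:
            problems.append((i, raw, f"too short ({n_words} words)"))
        elif n_words > max_words:
            problems.append((i, raw, f"too long ({n_words} words)"))
        if counts[norm] > 1:
            problems.append((i, raw, "duplicate sentence"))
    return problems


def estimate_seconds(sentence, words_per_second=2.5):
    """Estimate spoken duration; the default rate is a placeholder, not a documented value."""
    return len(sentence.split()) / words_per_second
```

Because the bounds are parameters, a script that deliberately emphasizes very short or very long phrases can be checked with wider limits, for example `check_script(lines, min_words=1, max_words=60)`.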