articles/cognitive-services/Speech-Service/how-to-custom-voice-prepare-data.md
+4 -4 (4 additions & 4 deletions)
@@ -64,14 +64,14 @@ To produce a good voice model, create the recordings in a quiet room with a high

 ### Audio files

-Each audio file should contain a single utterance (a single sentence or a single turn of a dialog system), less than 15 seconds long. All files must be in the same spoken language. Multi-language custom Text-to-Speech voices aren't supported, with the exception of the Chinese-English bi-lingual. Each audio file must have a unique numeric filename with the filename extension .wav.
+Each audio file should contain a single utterance (a single sentence or a single turn of a dialog system), less than 15 seconds long. All files must be in the same spoken language. Multi-language custom Text-to-Speech voices aren't supported, with the exception of the Chinese-English bi-lingual. Each audio file must have a unique filename with the filename extension .wav.

 Follow these guidelines when preparing audio.

 | Property | Value |
 | -------- | ----- |
 | File format | RIFF (.wav), grouped into a .zip file |
-| File name |Numeric, with .wav extension.No duplicate file names allowed. |
+| File name |File name characters supported by Windows OS, with .wav extension.<br>The characters \ / : * ? " < > \| aren't allowed. <br>It can't start or end with a space, and can't start with a dot. <br>No duplicate file names allowed. |
 | Sampling rate | For creating a custom neural voice, 24,000 Hz is required. |
 | Sample format | PCM, at least 16-bit |
 | Audio length | Shorter than 15 seconds |
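
The audio properties in the table above (24,000 Hz sampling rate for a custom neural voice, PCM with at least 16 bits, shorter than 15 seconds) can be checked locally before upload. The following is a minimal sketch, not part of this change or of the Speech service tooling. It uses only Python's standard `wave` module, and the names `audio_problems` and `check_wav` are hypothetical.

```python
import wave

# Values taken from the table above.
REQUIRED_SAMPLE_RATE_HZ = 24_000   # required for a custom neural voice
MIN_SAMPLE_WIDTH_BYTES = 2         # "at least 16-bit"
MAX_DURATION_SECONDS = 15.0        # "shorter than 15 seconds"


def audio_problems(sample_rate_hz, sample_width_bytes, duration_seconds):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if sample_rate_hz != REQUIRED_SAMPLE_RATE_HZ:
        problems.append(
            f"sampling rate is {sample_rate_hz} Hz, expected {REQUIRED_SAMPLE_RATE_HZ} Hz"
        )
    if sample_width_bytes < MIN_SAMPLE_WIDTH_BYTES:
        problems.append(
            f"sample width is {sample_width_bytes * 8}-bit, expected at least 16-bit"
        )
    if duration_seconds >= MAX_DURATION_SECONDS:
        problems.append(
            f"duration is {duration_seconds:.2f} s, expected shorter than 15 s"
        )
    return problems


def check_wav(path):
    """Read the header of a PCM .wav file and report problems.

    wave.open raises wave.Error for files it cannot parse as PCM WAV.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
        return audio_problems(rate, wav.getsampwidth(), duration)
```

Calling `check_wav("0001.wav")` on a recording returns an empty list when all three properties match the table. The file name here is only an illustration.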
@@ -115,7 +115,7 @@ Follow these guidelines when preparing audio for segmentation.
 | Property | Value |
 | -------- | ----- |
 | File format | RIFF (.wav) or .mp3, grouped into a .zip file |
-| File name |ASCII and Unicode characters supported. No duplicate names allowed. |
+| File name | File name characters supported by Windows OS, with .wav extension. <br>The characters \ / : * ? " < > \| aren't allowed. <br>It can't start or end with a space, and can't start with a dot. <br>No duplicate file names allowed. |
 | Sampling rate | For creating a custom neural voice, 24,000 Hz is required. |
 | Sample format |RIFF(.wav): PCM, at least 16-bit<br>mp3: at least 256 KBps bit rate|
 | Audio length | Longer than 20 seconds |
@@ -155,7 +155,7 @@ Follow these guidelines when preparing audio.
 | Property | Value |
 | -------- | ----- |
 | File format | RIFF (.wav) or .mp3, grouped into a .zip file |
-| File name |ASCII and Unicode characters supported. No duplicate name allowed. |
+| File name | File name characters supported by Windows OS, with .wav extension. <br>The characters \ / : * ? " < > \| aren't allowed. <br>It can't start or end with a space, and can't start with a dot. <br>No duplicate file names allowed. |
 | Sampling rate | For creating a custom neural voice, 24,000 Hz is required. |
 | Sample format |RIFF(.wav): PCM, at least 16-bit<br>mp3: at least 256 KBps bit rate|
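
The new "File name" rule that this change adds to each table can be expressed as a small pre-upload check. The sketch below is illustrative only and is not the service's actual validation. The function names are hypothetical, and the case-insensitive matching of extensions and duplicates is an assumption based on Windows file-name behavior, not something the diff states. The `extension` parameter exists because the tables in the later hunks also list .mp3 as a file format.

```python
# Characters the updated "File name" row lists as not allowed: \ / : * ? " < > |
FORBIDDEN_CHARS = set('\\/:*?"<>|')


def is_valid_audio_file_name(name, extension=".wav"):
    """Return True if `name` satisfies the rules quoted in the table row."""
    # Must carry the expected extension (case-insensitive: an assumption).
    if not name.lower().endswith(extension.lower()):
        return False
    # Must not contain any of the listed characters.
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    # Can't start or end with a space.
    if name.startswith(" ") or name.endswith(" "):
        return False
    # Can't start with a dot.
    if name.startswith("."):
        return False
    return True


def find_duplicates(names):
    """Return the names that repeat an earlier one, ignoring case (an assumption)."""
    seen = set()
    duplicates = []
    for name in names:
        key = name.lower()
        if key in seen:
            duplicates.append(name)
        seen.add(key)
    return duplicates
```

With this sketch, a non-numeric name such as `intro clip.wav` passes, which the old "Numeric" rule would have rejected, while `a:b.wav` and `.hidden.wav` fail.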