articles/ai-services/speech-service/includes/how-to/professional-voice/train-voice/bilingual-training.md
Lines changed: 5 additions & 7 deletions
@@ -9,19 +9,17 @@ ms.date: 2/18/2024
 ms.custom: include
 ---
 
-If you selected the [Neural](?tabs=neural) training type, you can train a voice to speak in multiple languages. The `zh-CN` and `zh-TW` locales both support bilingual training for the voice to speak both Chinese and English. Depending in part on your training data, the synthesized voice can speak English with an English accent or English with the same native accent as the training data.
+If you select the [Neural](?tabs=neural) training type, you can train a voice to speak in multiple languages. The `zh-CN` and `zh-TW` locales both support bilingual training for the voice to speak both Chinese and English. Depending in part on your training data, the synthesized voice can speak English with an English native accent or English with the same accent as the training data.
 
 > [!NOTE]
-> However, for a voice in the `zh-CN` locale to speak English with the native accent, you must select `Chinese (Mandarin, Simplified), English bilingual` (or `zh-CN (English bilingual)` via REST API) when creating a project.
-
-If you want the voice to speak English with native accent, then at least 10% of the training dataset must be in English. Moreover, the 10% threshold is calculated based on the data accepted after successful uploading, not the data before uploading. If some uploaded English data is rejected due to defects and doesn't meet the 10% threshold, the synthesized voice will default to an English native accent.
+> To enable a voice in the `zh-CN` locale to speak English with the same accent as the sample data, you should choose `Chinese (Mandarin, Simplified), English bilingual` when creating a project or specify the `zh-CN (English bilingual)` locale for the training set data via REST API.
 
 The following table shows the differences between the two locales:
 
 | Speech Studio locale | REST API locale | Bilingual support |
-|`Chinese (Mandarin, Simplified)`|`zh-CN`| English with English accent is the default.<br/><br/>English with native accent isn't available, regardless of your training data. |
-|`Chinese (Mandarin, Simplified), English bilingual`|`zh-CN (English bilingual)`| English with English accent is the default.<br/><br/>If you want the voice to speak English with native accent, then at least 10% of the training dataset must be in English. |
-|`Chinese (Taiwanese Mandarin, Traditional)`|`zh-TW`| English with English accent is the default.<br/><br/>If you want the voice to speak English with native accent, then at least 10% of the training dataset must be in English. |
+|`Chinese (Mandarin, Simplified)`|`zh-CN`|If your sample data includes English, the synthesized voice will speak English with an English native accent, instead of the same accent as the sample data, regardless of the amount of English data. |
+|`Chinese (Mandarin, Simplified), English bilingual`|`zh-CN (English bilingual)`|If you want the synthesized voice to speak English with the same accent as the sample data, we recommend including over 10% English data in your training set. Otherwise, the English speaking accent may not be ideal.|
+|`Chinese (Taiwanese Mandarin, Traditional)`|`zh-TW`|If you want to train a synthesized voice capable of speaking English with the same accent as your sample data, make sure to provide over 10% English data in your training set. Otherwise, it will default to an English native accent. The 10% threshold is calculated based on the data accepted after successful uploading, not the data before uploading. If some uploaded English data is rejected due to defects and doesn't meet the 10% threshold, the synthesized voice will default to an English native accent. |
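
The per-locale rules in the added table rows can be sketched as a small decision function, which may help when reviewing whether the new wording is unambiguous. This is only an illustrative sketch, not part of the Speech service or its REST API: the function names, the result labels, and the choice to measure the English share by accepted utterance count (rather than, say, audio duration) are all assumptions, since the text only says the threshold applies to data accepted after upload.

```python
# Hypothetical helper: none of these names come from the Speech SDK or REST API.
ZH_CN = "zh-CN"
ZH_CN_BILINGUAL = "zh-CN (English bilingual)"
ZH_TW = "zh-TW"

# The table says "over 10%", so the comparison below is strict.
ENGLISH_SHARE_THRESHOLD = 0.10


def english_share(accepted_english: int, accepted_total: int) -> float:
    """Share of English data among ACCEPTED data (rejected uploads excluded).

    Assumption: the share is measured in utterance counts.
    """
    if accepted_total <= 0:
        raise ValueError("accepted_total must be positive")
    if not 0 <= accepted_english <= accepted_total:
        raise ValueError("accepted_english must be between 0 and accepted_total")
    return accepted_english / accepted_total


def expected_english_accent(locale: str, accepted_english: int, accepted_total: int) -> str:
    """Return an illustrative label for the English accent described in the table."""
    share = english_share(accepted_english, accepted_total)
    over_threshold = share > ENGLISH_SHARE_THRESHOLD

    if locale == ZH_CN:
        # English native accent regardless of the amount of English data.
        return "english-native"
    if locale == ZH_CN_BILINGUAL:
        # Over 10% is recommended; below that the accent "may not be ideal".
        return "same-as-sample" if over_threshold else "same-as-sample (may not be ideal)"
    if locale == ZH_TW:
        # Falls back to an English native accent when the threshold is not met.
        return "same-as-sample" if over_threshold else "english-native"
    raise ValueError(f"unsupported locale: {locale!r}")
```

For example, under these assumptions `expected_english_accent("zh-TW", 10, 100)` yields `"english-native"`, because exactly 10% is not "over 10%", whereas the removed wording ("at least 10%") would have accepted it.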