Commit 1d3271c

Merge pull request #266627 from eric-urban/eur/cnv-updates
cnv training updates
2 parents 0e5c712 + 5559d16 commit 1d3271c

File tree

13 files changed

+121
-84
lines changed

articles/ai-services/speech-service/includes/how-to/professional-voice/create-consent/rest.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
With the professional voice feature, it's required that every voice be created with explicit consent from the user. A recorded statement from the user is required acknowledging that the customer (Azure AI Speech resource owner) will create and use their voice.

articles/ai-services/speech-service/includes/how-to/professional-voice/create-consent/speech-studio.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
A voice talent is an individual or target speaker whose voices are recorded and used to create neural voice models.

articles/ai-services/speech-service/includes/how-to/professional-voice/create-project/rest.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
Professional voice projects contain the voice talent consent statement, training datasets, voice models, and endpoints.

articles/ai-services/speech-service/includes/how-to/professional-voice/create-project/speech-studio.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
Content for [Custom neural voice](https://aka.ms/customvoice) like data, models, tests, and endpoints are organized into projects in Speech Studio. Each project is specific to a country/region and language, and the gender of the voice you want to create. For example, you might create a project for a female voice for your call center's chat bots that use English in the United States.

articles/ai-services/speech-service/includes/how-to/professional-voice/create-training-set/rest.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
You need a training dataset to create a professional voice. A training dataset includes audio and script files. The audio files are recordings of the voice talent reading the script files. The script files are the text of the audio files.

articles/ai-services/speech-service/includes/how-to/professional-voice/create-training-set/speech-studio.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
When you're ready to create a custom text to speech voice for your application, the first step is to gather audio recordings and associated scripts to start training the voice model. For details on recording voice samples, see [the tutorial](../../../../record-custom-voice-samples.md). The Speech service uses this data to create a unique voice tuned to match the voice in the recordings. After you've trained the voice, you can start synthesizing speech in your applications.

articles/ai-services/speech-service/includes/how-to/professional-voice/deploy-endpoint/rest.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
After you've successfully created and [trained](../../../../professional-voice-train-voice.md) your voice model, you deploy it to a custom neural voice endpoint.

articles/ai-services/speech-service/includes/how-to/professional-voice/deploy-endpoint/speech-studio.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 12/1/2023
9+
ms.custom: include
1010
---
1111

1212
After you've successfully created and [trained](../../../../professional-voice-train-voice.md) your voice model, you deploy it to a custom neural voice endpoint.

articles/ai-services/speech-service/includes/how-to/professional-voice/train-voice/bilingual-training.md

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
---
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 2/18/2024
9+
ms.custom: include
10+
---
11+
12+
If you select the [Neural](?tabs=neural) training type, you can train a voice to speak in multiple languages. The `zh-CN` and `zh-TW` locales both support bilingual training for the voice to speak both Chinese and English. Depending in part on your training data, the synthesized voice can speak English with an English native accent or English with the same accent as the training data.
13+
14+
> [!NOTE]
15+
> To enable a voice in the `zh-CN` locale to speak English with the same accent as the sample data, you should choose `Chinese (Mandarin, Simplified), English bilingual` when creating a project or specify the `zh-CN (English bilingual)` locale for the training set data via REST API.
16+
17+
The following table shows the differences between the two locales:
18+
19+
| Speech Studio locale | REST API locale | Bilingual support |
20+
|:------------- |:------- |:-------------------------- |
21+
| `Chinese (Mandarin, Simplified)` | `zh-CN` |If your sample data includes English, the synthesized voice speaks English with an English native accent, instead of the same accent as the sample data, regardless of the amount of English data. |
22+
| `Chinese (Mandarin, Simplified), English bilingual` | `zh-CN (English bilingual)` |If you want the synthesized voice to speak English with the same accent as the sample data, we recommend including over 10% English data in your training set. Otherwise, the English speaking accent might not be ideal. |
23+
| `Chinese (Taiwanese Mandarin, Traditional)` | `zh-TW` | If you want to train a synthesized voice capable of speaking English with the same accent as your sample data, make sure to provide over 10% English data in your training set. Otherwise, it defaults to an English native accent. The 10% threshold is calculated based on the data accepted after successful uploading, not the data before uploading. If some uploaded English data is rejected due to defects and doesn't meet the 10% threshold, the synthesized voice defaults to an English native accent. |
24+
25+
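
The table above says the 10% English threshold is evaluated on the data accepted after upload, not on what was submitted. A minimal Python sketch of that check follows. The `Utterance` type, its `language` and `accepted` fields, and counting by utterance (rather than by audio duration) are assumptions made for illustration; the diff doesn't say how the service measures the share.

```python
from dataclasses import dataclass

# "Over 10%" per the table above, so the comparison below is strict.
ENGLISH_SHARE_THRESHOLD = 0.10


@dataclass(frozen=True)
class Utterance:
    """Hypothetical record for one training utterance (not a service type)."""
    language: str   # assumed tag, e.g. "zh" or "en"
    accepted: bool  # True if the utterance survived upload validation


def english_share(utterances):
    """Return the fraction of accepted utterances that are English."""
    accepted = [u for u in utterances if u.accepted]
    if not accepted:
        return 0.0
    return sum(u.language == "en" for u in accepted) / len(accepted)


def meets_bilingual_threshold(utterances):
    """True if accepted English data exceeds the 10% threshold."""
    return english_share(utterances) > ENGLISH_SHARE_THRESHOLD


# 100 utterances uploaded, 12 of them English, but 3 English ones rejected.
uploaded = (
    [Utterance("zh", True)] * 88
    + [Utterance("en", True)] * 9
    + [Utterance("en", False)] * 3
)
# Before rejection the share would be 12/100; after it is 9/97, about 0.093.
print(meets_bilingual_threshold(uploaded))
```

In this example the submitted data would clear the threshold, but the accepted data doesn't, which is the case the `zh-TW` row warns about.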

articles/ai-services/speech-service/includes/how-to/professional-voice/train-voice/rest.md

Lines changed: 15 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
---
2-
title: include file
3-
description: include file
4-
author: eric-urban
5-
ms.author: eur
6-
ms.service: azure-ai-speech
7-
ms.topic: include
8-
ms.date: 12/1/2023
9-
ms.custom: include
2+
title: include file
3+
description: include file
4+
author: eric-urban
5+
ms.author: eur
6+
ms.service: azure-ai-speech
7+
ms.topic: include
8+
ms.date: 2/18/2024
9+
ms.custom: include
1010
---
1111

1212

@@ -45,7 +45,7 @@ To create a neural voice, use the [Models_Create](/rest/api/speechapi/models/cre
4545
- Set the required `projectId` property. See [create a project](../../../../professional-voice-create-project.md).
4646
- Set the required `consentId` property. See [add voice talent consent](../../../../professional-voice-create-consent.md).
4747
- Set the required `trainingSetId` property. See [create a training set](../../../../professional-voice-create-training-set.md).
48-
- Set the required recipe `kind` property to `Default` for neural voice training. The recipe kind indicates the training method and can't be changed later. To use a different training method, see [Neural - cross lingual](?tabs=crosslingual#create-a-voice-model) or [Neural - multi style](?tabs=multistyle#create-a-voice-model).
48+
- Set the required recipe `kind` property to `Default` for neural voice training. The recipe kind indicates the training method and can't be changed later. To use a different training method, see [Neural - cross lingual](?tabs=crosslingual#create-a-voice-model) or [Neural - multi style](?tabs=multistyle#create-a-voice-model). See [Bilingual training](#bilingual-training) for more information about bilingual training and differences between locales.
4949
- Set the required `voiceName` property. The voice name must end with "Neural" and can't be changed later. Choose a name carefully. The voice name is used in your [speech synthesis request](../../../../professional-voice-deploy-endpoint.md#use-your-custom-voice) by the SDK and SSML input. Only letters, numbers, and a few punctuation characters are allowed. Use different names for different neural voice models.
5050
- Optionally, set the `description` property for the voice description. The voice description can be changed later.
5151
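
The property list above can be sketched as a small Python helper that assembles a request body before calling Models_Create. This is an illustration only: the function name `build_model_request`, the placeholder IDs, and the nested `recipe` object shape are assumptions, and no HTTP call is made. Check the Models_Create reference for the authoritative schema.

```python
import json


def build_model_request(project_id, consent_id, training_set_id, voice_name,
                        description=None, recipe_kind="Default"):
    """Assemble a request body from the properties listed above.

    Hypothetical helper: the JSON layout (notably the nested "recipe"
    object) is an assumption, not taken from the diff.
    """
    required = {
        "projectId": project_id,
        "consentId": consent_id,
        "trainingSetId": training_set_id,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError(f"Missing required properties: {', '.join(missing)}")

    # The voice name must end with "Neural" and can't be changed later.
    if not voice_name.endswith("Neural"):
        raise ValueError('voiceName must end with "Neural"')

    body = dict(required)
    body["voiceName"] = voice_name
    # The recipe kind selects the training method and can't be changed later.
    body["recipe"] = {"kind": recipe_kind}
    if description is not None:
        body["description"] = description  # optional, can be changed later
    return body


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    request_body = build_model_request(
        project_id="MyProjectId",
        consent_id="MyConsentId",
        training_set_id="MyTrainingSetId",
        voice_name="ContosoNeural",
        description="Sample voice",
    )
    print(json.dumps(request_body, indent=2))
```

Validating the name locally is a convenience, since both the voice name and the recipe kind are fixed once the model is created.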

@@ -227,6 +227,12 @@ You should receive a response body in the following format:
227227
}
228228
```
229229

230+
---
231+
232+
### Bilingual training
233+
234+
[!INCLUDE [Bilingual training](./bilingual-training.md)]
235+
230236
## Available preset styles across different languages
231237

232238
The following table summarizes the different preset styles according to different languages.
