**articles/cognitive-services/Speech-Service/how-to-migrate-to-custom-neural-voice.md** (6 additions, 3 deletions)
```diff
@@ -19,10 +19,12 @@ ms.author: v-baolianzou
 The custom neural voice lets you build higher-quality voice models while requiring less data. You can develop more realistic, natural, and conversational voices. Your customers and end users will benefit from the latest Text-to-Speech technology, in a responsible way.
 
-|Custom voice |Custom neural voice |
+|Custom voice |Custom neural voice |
 |--|--|
 | The standard, or "traditional," method of custom voice breaks down spoken language into phonetic snippets that can be remixed and matched using classical programming or statistical methods. | Custom neural voice synthesizes speech using deep neural networks that have "learned" the way phonetics are combined in natural human speech rather than using classical programming or statistical methods.|
-| Custom voice requires a large volume of voice data to produce a more human-like voice model. With fewer recorded lines, a standard custom voice model will tend to sound more obviously robotic. |The custom neural voice capability enables you to create a unique brand voice in multiple languages and styles by using a small set of recordings.|
+| Custom voice<sup>1</sup> requires a large volume of voice data to produce a more human-like voice model. With fewer recorded lines, a standard custom voice model will tend to sound more obviously robotic. |The custom neural voice capability enables you to create a unique brand voice in multiple languages and styles by using a small set of recordings.|
+
+<sup>1</sup> When creating a custom voice model, the maximum number of data files allowed to be imported per subscription is 10 .zip files for free subscription (F0) users, and 500 for standard subscription (S0) users.
 
 ## Action required
```
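The F0/S0 import limits stated in footnote 1 amount to a simple pre-flight check. A minimal sketch (the function name is illustrative; only the tier names and limits come from the footnote):

```python
# Maximum number of .zip data files that can be imported per subscription,
# per the footnote: 10 for free (F0), 500 for standard (S0).
MAX_ZIP_IMPORTS = {"F0": 10, "S0": 500}

def can_import(tier: str, already_imported: int, new_files: int) -> bool:
    """Return True if importing `new_files` more .zip files stays within the tier limit."""
    return already_imported + new_files <= MAX_ZIP_IMPORTS[tier]

print(can_import("F0", 8, 2))  # True: exactly at the 10-file limit
print(can_import("F0", 8, 3))  # False: would exceed 10
```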
```diff
@@ -41,7 +43,7 @@ Before you can migrate to custom neural voice, your [application](https://aka.ms
 3. After the custom neural voice model is created, deploy the voice model to a new endpoint. To create a new custom voice endpoint with your neural voice model, go to **Text-to-Speech > Custom Voice > Deploy model**. Select **Deploy models** and enter a **Name** and **Description** for your custom endpoint. Then select the custom neural voice model you would like to associate with this endpoint and confirm the deployment.
 4. Update your code in your apps if you have created a new endpoint with a new model.
 
-## Custom voice details (retired)
+## Custom voice details (deprecated)
 
 Read the following sections for details on custom voice.
```
```diff
@@ -90,6 +92,7 @@ If you've created a custom voice font, use the endpoint that you've created. You
 | West US |`https://westus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId}`|
 | West US 2 |`https://westus2.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId}`|
```
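The regional endpoints in the table above accept an HTTP POST with an SSML body. A sketch of constructing (not sending) such a request with the Python standard library; the deployment ID, voice name, and bearer token are placeholders, not values from this diff:

```python
import urllib.request

deployment_id = "{deploymentId}"  # placeholder: your custom voice deployment ID
endpoint = (
    "https://westus.voice.speech.microsoft.com/cognitiveservices/v1"
    f"?deploymentId={deployment_id}"
)

# SSML body; the voice name must match your deployed custom voice model.
ssml = (
    '<speak version="1.0" xml:lang="en-US">'
    '<voice name="MyCustomVoice">Hello from a custom voice endpoint.</voice>'
    "</speak>"
)

request = urllib.request.Request(
    endpoint,
    data=ssml.encode("utf-8"),
    headers={
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        "Authorization": "Bearer {accessToken}",  # placeholder token
    },
    method="POST",
)
# With real credentials, urllib.request.urlopen(request) would return the audio.
print(request.get_method())  # POST
```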
**articles/cognitive-services/Speech-Service/how-to-migrate-to-prebuilt-neural-voice.md** (1 addition, 1 deletion)
```diff
@@ -35,7 +35,7 @@ The prebuilt neural voice provides more natural sounding speech output, and thus
 1. Review the [price](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/) structure and listen to the neural voice [samples](https://azure.microsoft.com/services/cognitive-services/text-to-speech/#overview) at the bottom of that page to determine the right voice for your business needs.
 2. To make the change, [follow the sample code](speech-synthesis-markup.md#choose-a-voice-for-text-to-speech) to update the voice name in your speech synthesis request to the supported neural voice names in chosen languages. Please use neural voices for your speech synthesis request, on cloud or on prem. For on-prem container, please use the [neural voice containers](../containers/container-image-tags.md) and follow the [instructions](speech-container-howto.md).
 
-## Standard voice details (retired)
+## Standard voice details (deprecated)
 
 Read the following sections for details on standard voice.
```
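Updating the voice name in a speech synthesis request is typically a one-line change to the SSML. A sketch assuming a retired standard voice name (`en-US-JessaRUS`, illustrative) being swapped for a prebuilt neural voice:

```python
# SSML that targets a retired standard voice (the old name here is illustrative).
old_ssml = (
    '<speak version="1.0" xml:lang="en-US">'
    '<voice name="en-US-JessaRUS">Welcome back.</voice>'
    "</speak>"
)

# Point the request at a supported prebuilt neural voice instead.
new_ssml = old_ssml.replace("en-US-JessaRUS", "en-US-AriaNeural")

print("en-US-AriaNeural" in new_ssml)  # True
```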
**articles/cognitive-services/Speech-Service/includes/text-to-speech-container-query-endpoint.md** (1 addition, 1 deletion)
```diff
@@ -12,7 +12,7 @@ ms.author: eur
 The container provides [REST-based endpoint APIs](../rest-text-to-speech.md). Many [sample source code projects](https://github.com/Azure-Samples/Cognitive-Speech-TTS) for platform, framework, and language variations are available.
 
-With the standard or neural text-to-speech containers, you should rely on the locale and voice of the image tag you downloaded. For example, if you downloaded the `latest` tag, the default locale is `en-US` and the `AriaNeural` voice. The `{VOICE_NAME}` argument would then be [`en-US-AriaNeural`](../language-support.md#prebuilt-neural-voices). See the following example SSML:
+With the neural Text-to-Speech containers, you should rely on the locale and voice of the image tag you downloaded. For example, if you downloaded the `latest` tag, the default locale is `en-US` and the `AriaNeural` voice. The `{VOICE_NAME}` argument would then be [`en-US-AriaNeural`](../language-support.md#prebuilt-neural-voices). See the following example SSML:
```
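The example SSML that this paragraph refers to is not included in the diff. A minimal sketch of what such a request body looks like, using the `en-US-AriaNeural` voice named above:

```xml
<speak version="1.0" xml:lang="en-US">
  <voice name="en-US-AriaNeural">
    This text will be spoken by the container's default neural voice.
  </voice>
</speak>
```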
**articles/cognitive-services/Speech-Service/speech-container-howto.md** (2 additions, 63 deletions)
```diff
@@ -27,7 +27,6 @@ With Speech containers, you can build a speech application architecture that's o
 |--|--|--|--|
 | Speech-to-text | Analyzes sentiment and transcribes continuous real-time speech or batch audio recordings with intermediate results. | 3.0.0 | Generally available |
 | Custom speech-to-text | Using a custom model from the [Custom Speech portal](https://speech.microsoft.com/customspeech), transcribes continuous real-time speech or batch audio recordings into text with intermediate results. | 3.0.0 | Generally available |
-| Text-to-speech | Converts text to natural-sounding speech with plain text input or Speech Synthesis Markup Language (SSML). | 1.15.0 | Generally available |
 | Speech language identification | Detects the language spoken in audio files. | 1.5.0 | Preview |
 | Neural text-to-speech | Converts text to natural-sounding speech by using deep neural network technology, which allows for more natural synthesized speech. | 2.0.0 | Generally available |
```
```diff
@@ -58,7 +57,6 @@ The following table describes the minimum and recommended allocation of resource
 > The `locale` and `voice` for custom Speech containers is determined by the custom model ingested by the container.
 
-# [Text-to-speech](#tab/tts)
-
-#### Docker pull for the text-to-speech container
-
-Use the [docker pull](https://docs.docker.com/engine/reference/commandline/pull/) command to download a container image from Microsoft Container Registry:
-For all the supported locales and corresponding voices of the text-to-speech container, see [Text-to-speech image tags](../containers/container-image-tags.md#text-to-speech).
-
-> [!IMPORTANT]
-> When you construct a text-to-speech HTTP POST, the [SSML](speech-synthesis-markup.md) message requires a `voice` element with a `name` attribute. The value is the corresponding container locale and voice, which is also known as the [short name](how-to-migrate-to-prebuilt-neural-voice.md). For example, the `latest` tag would have a voice name of `en-US-AriaRUS`.
-
 # [Neural text-to-speech](#tab/ntts)
 
 #### Docker pull for the neural text-to-speech container
```
```diff
@@ -456,25 +416,6 @@ Checking available base model for en-us
 Starting in v2.5.0 of the custom-speech-to-text container, you can get custom pronunciation results in the output. All you need to do is have your own custom pronunciation rules set up in your custom model and mount the model to a custom-speech-to-text container.
 
-# [Text-to-speech](#tab/tts)
-
-To run the standard text-to-speech container, execute the following `docker run` command:
-* Runs a standard text-to-speech container from the container image.
-* Allocates 1 CPU core and 2 GB of memory.
-* Exposes TCP port 5000 and allocates a pseudo-TTY for the container.
-* Automatically removes the container after it exits. The container image is still available on the host computer.
 
 # [Neural text-to-speech](#tab/ntts)
 
 To run the neural text-to-speech container, execute the following `docker run` command:
```
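The `docker run` command itself is not captured in this diff. For orientation, a sketch of the general shape of such a command; the image path, resource sizes, and the `{ENDPOINT_URI}`/`{API_KEY}` billing placeholders are illustrative assumptions, not values from this diff:

```shell
# Run the neural text-to-speech container, publishing the REST endpoint on port 5000.
docker run --rm -it -p 5000:5000 --memory 12g --cpus 6 \
  mcr.microsoft.com/azure-cognitive-services/speechservices/neural-text-to-speech \
  Eula=accept \
  Billing={ENDPOINT_URI} \
  ApiKey={API_KEY}
```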
```diff
@@ -534,7 +475,7 @@ Increasing the number of concurrent calls can affect reliability and latency. Fo
 | Containers | SDK Host URL | Protocol |
 |--|--|--|
 | Standard speech-to-text and custom speech-to-text |`ws://localhost:5000`| WS |
-| Text-to-speech (including standard and neural), Speech language identification |`http://localhost:5000`| HTTP |
+|Neural Text-to-speech, Speech language identification |`http://localhost:5000`| HTTP |
 
 For more information on using WSS and HTTPS protocols, see [Container security](../cognitive-services-container-support.md#azure-cognitive-services-container-security).
```
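The host-URL table above maps each container type to a protocol. A small sketch of that mapping as a lookup (container keys are illustrative shorthand; the URLs and protocols come from the table):

```python
# SDK host URLs from the table above; localhost:5000 assumes the default
# published container port.
sdk_hosts = {
    "speech-to-text": "ws://localhost:5000",
    "custom-speech-to-text": "ws://localhost:5000",
    "neural-text-to-speech": "http://localhost:5000",
    "speech-language-identification": "http://localhost:5000",
}

def protocol(container: str) -> str:
    """Return the URL scheme (protocol) the SDK should use for a container."""
    return sdk_hosts[container].split("://", 1)[0]

print(protocol("speech-to-text"))          # ws
print(protocol("neural-text-to-speech"))   # http
```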