articles/cognitive-services/Speech-Service/quickstarts/text-to-speech/async-synthesis-long-form-audio.md
46 additions & 25 deletions
Original file line number
Diff line number
Diff line change
@@ -14,7 +14,7 @@ ms.author: erhopf
14
14
15
15
# Quickstart: Asynchronous synthesis for long-form audio in Python (Preview)
16
16
17
-
In this quickstart, you'll use the Long Audio API to asynchronously convert text to speech, and retrieve the audio output from a URI provided by the service. This REST API is ideal for content providers that need to convert text files greater than 10,000 characters or 50 paragraphs into synthesized speech. For more information, see [Long Audio API](../../long-audio-api.md).
17
+
In this quickstart, you'll use the Long Audio API to asynchronously convert text to speech, and retrieve the audio output from a URI provided by the service. This REST API is ideal for content providers that need to synthesize audio from text longer than 5,000 characters (or more than 10 minutes in length). For more information, see [Long Audio API](../../long-audio-api.md).
18
18
19
19
> [!NOTE]
20
20
> Asynchronous synthesis for long-form audio can only be used with [Custom Neural Voices](../../how-to-custom-voice.md#custom-neural-voices).
> If you haven't used these modules you'll need to install them before running your program. To install these packages, run: `pip install requests urllib3`.
48
+
> If you haven't used these modules, you'll need to install them before running your program. To install these packages, run: `pip install requests urllib3`.
49
49
50
50
These modules are used to parse arguments, construct the HTTP request, and call the text-to-speech long audio REST API.
51
51
52
52
## Get a list of supported voices
53
53
54
-
This code gets a list of available voices that you can use to convert text-to-speech. Add this code `voice_synthesis_client.py`:
54
+
This code gets a list of available voices that you can use to convert text to speech. Add the code to `voice_synthesis_client.py`:
55
55
56
56
```python
57
57
parser = argparse.ArgumentParser(description='Cris client tool to submit voice synthesis requests.')
@@ -75,13 +75,18 @@ if args.voices:
75
75
76
76
### Test your code
77
77
78
-
Let's test what you've done so far. Run this command, replacing `<your_key>` with your Speech subscription key, and `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
78
+
Let's test what you've done so far. You'll need to update a few things in the request below:
79
+
80
+
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
81
+
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
@@ -90,14 +95,17 @@ Name: Microsoft Server Speech Text to Speech Voice (en-US, xxx), Description: xx
90
95
Name: Microsoft Server Speech Text to Speech Voice (zh-CN, xxx), Description: xxx , Id: xxx, Locale: zh-CN, Gender: Female, PublicVoice: xxx, Created: 2019-08-26T04:55:39Z
91
96
```
92
97
93
-
## Convert text to speech
98
+
## Prepare input files
99
+
100
+
Prepare an input text file. It can be either plain text or SSML text. For the input file requirements, see how to [prepare content for synthesis](https://docs.microsoft.com/azure/cognitive-services/speech-service/long-audio-api#prepare-content-for-synthesis).
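
As a quick sanity check before submitting, the sketch below summarizes an input document: its character count, its number of non-empty lines, and whether it appears to be SSML. The helper name `describe_input` is hypothetical and is not part of `voice_synthesis_client.py`. The sketch assumes that paragraphs are separated by line breaks and that an SSML document contains a `<speak>` root element near its start. The authoritative requirements are in the linked article.

```python
def describe_input(text):
    """Summarize an input document before submitting it for synthesis."""
    # Assumption: each non-empty line counts as one paragraph.
    paragraphs = [line for line in text.splitlines() if line.strip()]
    return {
        "characters": len(text),
        "paragraphs": len(paragraphs),
        # Assumption: SSML input has a <speak> root element near the top.
        "looks_like_ssml": "<speak" in text[:500].lower(),
    }
```

For example, you could read your file with `open(path, encoding="utf-8").read()` and pass the result to `describe_input` to confirm that the text is longer than the 5,000 characters mentioned in the introduction.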
94
101
95
-
The next step is to prepare an input text file. It can be either plain text or SSML, but must be more than 10,000 character or 50 paragraphs. For a complete list of requirements, see [Long Audio API](../../long-audio-api.md).
102
+
## Convert text to speech
96
103
97
-
After you've prepared the text file. The next step is to add code for speech synthesis to your project. Add this code to `voice_synthesis_client.py`:
104
+
After preparing the input text file, add this code for speech synthesis to `voice_synthesis_client.py`:
98
105
99
106
> [!NOTE]
100
-
> By default, the audio output is set to riff-16khz-16bit-mono-pcm. For more information about supported audio outputs, see [Long Audio API](../../long-audio-api.md#audio-output-formats).
107
+
> `concatenateResult` is an optional parameter. If it isn't set, the audio output is generated per paragraph. Set the parameter to concatenate the audio into a single output.
108
+
> By default, the audio output is set to riff-16khz-16bit-mono-pcm. For more information about supported audio outputs, see [Audio output formats](https://docs.microsoft.com/azure/cognitive-services/speech-service/long-audio-api#audio-output-formats).
101
109
102
110
```python
103
111
parser.add_argument('--submit', action="store_true", default=False, help='submit a synthesis request')
Let's try making a request to synthesize text using your input file as a source. You'll need to update a few things in the request below:
171
+
Let's make a request to synthesize text using your input file as the source. You'll need to update a few things in the request below:
164
172
165
173
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
166
174
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
167
-
* Replace `<input>` with the path to the text file you're looking to convert from text-to-speech.
175
+
* Replace `<input>` with the path to the text file you've prepared for text-to-speech.
168
176
* Replace `<locale>` with the desired output locale. For more information, see [language support](../../language-support.md#neural-voices).
169
-
* Replace `<voice_guid>` with the desired voice for the audio output. Use one of the voices returned by [Get a list of supported voices](#get-a-list-of-supported-voices) or use the list of neural voices provided in [language support](../../language-support.md#neural-voices).
177
+
* Replace `<voice_guid>` with the desired output voice. Use one of the voices returned by [Get a list of supported voices](#get-a-list-of-supported-voices).
> `concatenateResult` is an optional parameter. If it isn't provided, the output is delivered as multiple wave files, one for each line.
186
+
> If you have more than one input file, you'll need to submit multiple requests. Be aware of the following limitations:
187
+
> * The client can submit up to **5** requests per second to the server for each Azure subscription account. If it exceeds this limit, the client receives a 429 error code (too many requests). Reduce the number of requests per second.
188
+
> * The server can run and queue up to **120** requests for each Azure subscription account. If this limit is exceeded, the server returns a 429 error code (too many requests). Wait until some requests complete before submitting new ones.
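
One way to cope with a 429 response when submitting several requests is to wait and retry. The following is a minimal sketch, not the approach used by `voice_synthesis_client.py`: the helper `submit_with_retry`, its parameters, and the exponential backoff schedule are all assumptions chosen for illustration. It accepts any zero-argument callable that performs the request and returns an object with a `status_code` attribute, such as a `requests.Response`.

```python
import time


def submit_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out.

    `send` is a hypothetical zero-argument callable that performs the
    request and returns an object with a `status_code` attribute.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Assumed backoff schedule: 1s, 2s, 4s, ... between attempts.
            sleep(base_delay * (2 ** attempt))
    # Still throttled after the last attempt; hand the 429 back to the caller.
    return response
```

You could wrap your own submission call in a `lambda` and pass it as `send`. The `sleep` parameter exists only so that the delay can be replaced in tests.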
179
189
180
-
You should get an output that looks like this:
190
+
You'll see an output that looks like this:
181
191
182
192
```console
183
193
Submit synthesis request successful
@@ -195,13 +205,13 @@ Checking status
195
205
Succeeded... Result file downloaded : xxxx.zip
196
206
```
197
207
198
-
The result provided contains the input text and the audio output files generated by the service. These are downloaded as a zip.
208
+
The result contains the input text and the audio output files generated by the service. You can download these files as a zip file.
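
If you want to unpack the downloaded archive from Python, a sketch along these lines should work with the standard `zipfile` module. The function name `extract_results` and both of its arguments are hypothetical. The names of the files inside the archive depend on the service and aren't assumed here.

```python
import zipfile


def extract_results(zip_path, output_dir):
    """Extract a downloaded result archive and return the entry names it contains."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(output_dir)
        return archive.namelist()
```

The returned list lets you see which entries hold the input text and which hold audio without hard-coding any file names.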
199
209
200
210
## Remove previous requests
201
211
202
-
There is a limit of 2,000 requests for each subscription. As such, there will be times that you need to remove previously submitted requests before you can make new ones. If you don't remove existing requests, you'll receive an error when you exceed 2,000.
212
+
The server keeps up to **20,000** requests for each Azure subscription account. If you exceed this limit, remove previous requests before making new ones. If you don't remove existing requests, you'll receive an error notification.
Run this command, replacing `<your_key>` with your Speech subscription key, and `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
247
+
Now, let's check to see what requests you've previously submitted. Before you continue, you'll need to update a few things in this request:
248
+
249
+
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
250
+
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
This will return a list of syntheses you've requested. You should get an output that looks like this:
258
+
This will return a list of synthesis requests that you've made. You'll see an output like this:
244
259
245
260
```console
246
261
There are <number> synthesis requests submitted:
@@ -249,16 +264,22 @@ ID : xxx , Name : xxx, Status : Running
249
264
ID : xxx , Name : xxx : Succeeded
250
265
```
251
266
252
-
Now let's use some of these values to remove/delete previously submitted requests. Run this command, replacing `<your_key>` with your Speech subscription key, and `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal). The `<synthesis_id>` should be one of the values returned in the previous request.
267
+
Now, let's remove a previously submitted request. You'll need to update a few things in the code below:
268
+
269
+
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
270
+
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
271
+
* Replace `<synthesis_id>` with one of the ID values returned in the previous request.
253
272
254
273
> [!NOTE]
255
274
> Requests with a status of 'Running' or 'Waiting' cannot be removed or deleted.
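
When cleaning up many requests, it can help to filter out the ones that can't be removed yet. The sketch below assumes you have already collected your requests as a list of dictionaries with `id` and `status` keys. That shape and the helper name `removable_requests` are hypothetical and may not match what the service or `voice_synthesis_client.py` returns.

```python
def removable_requests(requests_list):
    """Return the requests that are not in a 'Running' or 'Waiting' state."""
    # Statuses that, per the note above, block removal.
    blocked = {"Running", "Waiting"}
    return [r for r in requests_list if r["status"] not in blocked]
```

You could then pass each remaining `id` as the `<synthesis_id>` when you run the delete command.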
The complete`voice_synthesis_client.py` is available for download on [GitHub](https://github.com/Azure-Samples/Cognitive-Speech-TTS/blob/master/CustomVoice-API-Samples/Python/voiceclient.py).
291
+
The completed `voice_synthesis_client.py` is available for download on [GitHub](https://github.com/Azure-Samples/Cognitive-Speech-TTS/blob/master/CustomVoice-API-Samples/Python/voiceclient.py).