Commit a9a541e

Merge pull request #100583 from Yueying-Liu/patch-12
[Cog Svcs] Update async-synthesis-long-form-audio.md
2 parents 6f1cace + e43f40f commit a9a541e

1 file changed: 46 additions and 25 deletions

articles/cognitive-services/Speech-Service/quickstarts/text-to-speech/async-synthesis-long-form-audio.md

Lines changed: 46 additions & 25 deletions
@@ -14,7 +14,7 @@ ms.author: erhopf
1414

1515
# Quickstart: Asynchronous synthesis for long-form audio in Python (Preview)
1616

17-
In this quickstart, you'll use the Long Audio API to asynchronously convert text to speech, and retrieve the audio output from a URI provided by the service. This REST API is ideal for content providers that need to convert text files greater than 10,000 characters or 50 paragraphs into synthesized speech. For more information, see [Long Audio API](../../long-audio-api.md).
17+
In this quickstart, you'll use the Long Audio API to asynchronously convert text to speech, and retrieve the audio output from a URI provided by the service. This REST API is ideal for content providers that need to synthesize audio from text longer than 5,000 characters (or more than 10 minutes in length). For more information, see [Long Audio API](../../long-audio-api.md).
1818

1919
> [!NOTE]
2020
> Asynchronous synthesis for long-form audio can only be used with [Custom Neural Voices](../../how-to-custom-voice.md#custom-neural-voices).
@@ -45,13 +45,13 @@ urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
4545
```
4646

4747
> [!NOTE]
48-
> If you haven't used these modules you'll need to install them before running your program. To install these packages, run: `pip install requests urllib3`.
48+
> If you haven't used these modules, you'll need to install them before running your program. To install these packages, run: `pip install requests urllib3`.
4949
5050
These modules are used to parse arguments, construct the HTTP request, and call the text-to-speech long audio REST API.
5151

5252
## Get a list of supported voices
5353

54-
This code gets a list of available voices that you can use to convert text-to-speech. Add this code `voice_synthesis_client.py`:
54+
This code gets a list of available voices that you can use to convert text-to-speech. Add the code to `voice_synthesis_client.py`:
5555

5656
```python
5757
parser = argparse.ArgumentParser(description='Cris client tool to submit voice synthesis requests.')
@@ -75,13 +75,18 @@ if args.voices:
7575

7676
### Test your code
7777

78-
Let's test what you've done so far. Run this command, replacing `<your_key>` with your Speech subscription key, and `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
78+
Let's test what you've done so far. You'll need to update a few things in the request below:
79+
80+
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
81+
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
82+
83+
Run this command:
7984

8085
```console
8186
python voice_synthesis_client.py --voices -key <your_key> -region <Region>
8287
```
8388

84-
You should get an output that looks like this:
89+
You'll see an output that looks like this:
8590

8691
```console
8792
There are xx voices available:
@@ -90,14 +95,17 @@ Name: Microsoft Server Speech Text to Speech Voice (en-US, xxx), Description: xx
9095
Name: Microsoft Server Speech Text to Speech Voice (zh-CN, xxx), Description: xxx , Id: xxx, Locale: zh-CN, Gender: Female, PublicVoice: xxx, Created: 2019-08-26T04:55:39Z
9196
```
9297

93-
## Convert text to speech
98+
## Prepare input files
99+
100+
Prepare an input text file. It can be either plain text or SSML text. For the input file requirements, see how to [prepare content for synthesis](https://docs.microsoft.com/azure/cognitive-services/speech-service/long-audio-api#prepare-content-for-synthesis).
94101

95-
The next step is to prepare an input text file. It can be either plain text or SSML, but must be more than 10,000 character or 50 paragraphs. For a complete list of requirements, see [Long Audio API](../../long-audio-api.md).
102+
## Convert text to speech
96103

97-
After you've prepared the text file. The next step is to add code for speech synthesis to your project. Add this code to `voice_synthesis_client.py`:
104+
After preparing the input text file, add this code for speech synthesis to `voice_synthesis_client.py`:
98105

99106
> [!NOTE]
100-
> By default, the audio output is set to riff-16khz-16bit-mono-pcm. For more information about supported audio outputs, see [Long Audio API](../../long-audio-api.md#audio-output-formats).
107+
> `concatenateResult` is an optional parameter. If this parameter isn't set, one audio output is generated per paragraph. Setting the parameter concatenates the audio into a single output.
108+
> By default, the audio output is set to riff-16khz-16bit-mono-pcm. For more information about supported audio outputs, see [Audio output formats](https://docs.microsoft.com/azure/cognitive-services/speech-service/long-audio-api#audio-output-formats).
101109
102110
```python
103111
parser.add_argument('--submit', action="store_true", default=False, help='submit a synthesis request')
@@ -118,7 +126,7 @@ def submitSynthesis():
118126
files = {'script': (scriptfilename, open(args.file, 'rb'), 'text/plain')}
119127
response = requests.post(baseAddress+"voicesynthesis", data, headers={"Ocp-Apim-Subscription-Key":args.key}, files=files, verify=False)
120128
if response.status_code == 202:
121-
location = response.headers['Operation-Location']
129+
location = response.headers['Location']
122130
id = location.split("/")[-1]
123131
print("Submit synthesis request successful")
124132
return id
@@ -160,13 +168,13 @@ if args.submit:
160168

161169
### Test your code
162170

163-
Let's try making a request to synthesize text using your input file as a source. You'll need to update a few things in the request below:
171+
Let's make a request to synthesize text using your input file as the source. You'll need to update a few things in the request below:
164172

165173
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
166174
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
167-
* Replace `<input>` with the path to the text file you're looking to convert from text-to-speech.
175+
* Replace `<input>` with the path to the text file you've prepared for text-to-speech.
168176
* Replace `<locale>` with the desired output locale. For more information, see [language support](../../language-support.md#neural-voices).
169-
* Replace `<voice_guid>` with the desired voice for the audio output. Use one of the voices returned by [Get a list of supported voices](#get-a-list-of-supported-voices) or use the list of neural voices provided in [language support](../../language-support.md#neural-voices).
177+
* Replace `<voice_guid>` with the desired output voice. Use one of the voices returned by [Get a list of supported voices](#get-a-list-of-supported-voices).
170178

171179
Convert text to speech with this command:
172180

@@ -175,9 +183,11 @@ python voice_synthesis_client.py --submit -key <your_key> -region <Region> -file
175183
```
176184

177185
> [!NOTE]
178-
> 'concatenateResult' is an optional parameter, if this parameter isn't provided, the output will be provided as multiple wave files, one for each line.
186+
> If you have more than one input file, you'll need to submit multiple requests. Be aware of the following limitations:
187+
> * The client can submit up to **5** requests per second to the server for each Azure subscription account. If it exceeds this limit, the client gets a 429 error code (too many requests). Reduce the number of requests per second.
188+
> * The server can run and queue up to **120** requests for each Azure subscription account. If this limit is exceeded, the server returns a 429 error code (too many requests). Wait until some requests have completed before submitting new ones.
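
One way to cope with the 429 responses described in the note is to retry with an increasing delay. The following is a minimal sketch, not part of `voice_synthesis_client.py`: `submit_with_backoff`, its parameters, and the `send` callable are hypothetical names introduced for illustration, and the delay values are assumptions rather than documented service guidance.

```python
import time


def submit_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send is a hypothetical zero-argument callable that returns an object
    with a status_code attribute, for example a small wrapper around the
    requests.post call made when submitting a synthesis request.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            # Not throttled: hand the response back to the caller as-is.
            return response
        # Throttled: wait base_delay, 2*base_delay, 4*base_delay, ...
        # but don't sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    # Still throttled after max_attempts; the caller decides what to do.
    return response
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting. Doubling the delay is a common choice for throttled APIs, but the right values depend on how many requests are already queued.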
179189
180-
You should get an output that looks like this:
190+
You'll see an output that looks like this:
181191

182192
```console
183193
Submit synthesis request successful
@@ -195,13 +205,13 @@ Checking status
195205
Succeeded... Result file downloaded : xxxx.zip
196206
```
197207

198-
The result provided contains the input text and the audio output files generated by the service. These are downloaded as a zip.
208+
The result contains the input text and the audio output files that are generated by the service. You can download these files as a zip file.
199209

200210
## Remove previous requests
201211

202-
There is a limit of 2,000 requests for each subscription. As such, there will be times that you need to remove previously submitted requests before you can make new ones. If you don't remove existing requests, you'll receive an error when you exceed 2,000.
212+
The server keeps up to **20,000** requests for each Azure subscription account. If you exceed this limit, remove previous requests before making new ones. If you don't remove existing requests, you'll receive an error notification.
203213

204-
Add this code to `voice_synthesis_client.py`:
214+
Add the code to `voice_synthesis_client.py`:
205215

206216
```python
207217
parser.add_argument('--syntheses', action="store_true", default=False, help='print synthesis list')
@@ -234,13 +244,18 @@ if args.delete:
234244

235245
### Test your code
236246

237-
Run this command, replacing `<your_key>` with your Speech subscription key, and `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
247+
Now, let's check to see what requests you've previously submitted. Before you continue, you'll need to update a few things in this request:
248+
249+
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
250+
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
251+
252+
Run this command:
238253

239254
```console
240-
python voice_synthesis_client.py syntheses -key <your_key> -region <Region>
255+
python voice_synthesis_client.py --syntheses -key <your_key> -region <Region>
241256
```
242257

243-
This will return a list of syntheses you've requested. You should get an output that looks like this:
258+
This will return a list of synthesis requests that you've made. You'll see an output like this:
244259

245260
```console
246261
There are <number> synthesis requests submitted:
@@ -249,16 +264,22 @@ ID : xxx , Name : xxx, Status : Running
249264
ID : xxx , Name : xxx : Succeeded
250265
```
251266

252-
Now let's use some of these values to remove/delete previously submitted requests. Run this command, replacing `<your_key>` with your Speech subscription key, and `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal). The `<synthesis_id>` should be one of the values returned in the previous request.
267+
Now, let's remove a previously submitted request. You'll need to update a few things in the code below:
268+
269+
* Replace `<your_key>` with your Speech service subscription key. This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
270+
* Replace `<region>` with the region where your Speech resource was created (for example: `eastus` or `westus`). This information is available in the **Overview** tab for your resource in the [Azure portal](https://aka.ms/azureportal).
271+
* Replace `<synthesis_id>` with the value returned in the previous request.
253272

254273
> [!NOTE]
255274
> Requests with a status of 'Running' or 'Waiting' cannot be removed or deleted.
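
Because requests that are still 'Running' or 'Waiting' can't be deleted, it can help to filter the list before calling delete. This is a minimal sketch under stated assumptions: `deletable_ids` is a hypothetical helper, and the `id` and `status` keys are assumed field names modeled on the listing output, not a confirmed response schema.

```python
# Statuses that the note above says cannot be removed or deleted.
NON_DELETABLE = {"Running", "Waiting"}


def deletable_ids(syntheses):
    """Return the IDs of synthesis requests that are safe to delete.

    syntheses is assumed to be a list of dicts with "id" and "status"
    keys (hypothetical field names based on the listing output).
    """
    return [s["id"] for s in syntheses if s.get("status") not in NON_DELETABLE]


example = [
    {"id": "a1", "status": "Running"},
    {"id": "b2", "status": "Succeeded"},
    {"id": "c3", "status": "Waiting"},
    {"id": "d4", "status": "Failed"},
]
print(deletable_ids(example))
```

Each returned ID could then be passed as `<synthesis_id>` to the delete command.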
256275
276+
Run this command:
277+
257278
```console
258-
python voice_synthesis_client.py delete -key <your_key> -region <Region> -synthesisId <synthesis_id>
279+
python voice_synthesis_client.py --delete -key <your_key> -region <Region> -synthesisId <synthesis_id>
259280
```
260281

261-
You should get an output that looks like this:
282+
You'll see an output like this:
262283

263284
```console
264285
delete voice synthesis xxx
@@ -267,7 +288,7 @@ delete successful
267288

268289
## Get the full client
269290

270-
The complete `voice_synthesis_client.py` is available for download on [GitHub](https://github.com/Azure-Samples/Cognitive-Speech-TTS/blob/master/CustomVoice-API-Samples/Python/voiceclient.py).
291+
The completed `voice_synthesis_client.py` is available for download on [GitHub](https://github.com/Azure-Samples/Cognitive-Speech-TTS/blob/master/CustomVoice-API-Samples/Python/voiceclient.py).
271292

272293
## Next steps
273294
