Commit 5ac4cec

Merge pull request #211210 from eric-urban/patch-1
Update rest-text-to-speech.md
2 parents d6a1b4c + be59bb0 commit 5ac4cec

1 file changed: +48 -31 lines changed

articles/cognitive-services/Speech-Service/rest-text-to-speech.md

Lines changed: 48 additions & 31 deletions
@@ -272,37 +272,54 @@ If the HTTP status is `200 OK`, the body of the response contains an audio file
 
 ## Audio outputs
 
-This is a list of supported audio formats that are sent in each request as the `X-Microsoft-OutputFormat` header. Each format incorporates a bit rate and encoding type. The Speech service supports 48-kHz, 24-kHz, 16-kHz, and 8-kHz audio outputs. Prebuilt neural voices are created from samples that use a 24-khz sample rate. All voices can upsample or downsample to other sample rates when synthesizing.
-
-| Streaming | Non-Streaming |
-| ---------------------------------- | --------------------------- |
-| audio-16khz-16bit-32kbps-mono-opus | riff-8khz-8bit-mono-alaw |
-| audio-16khz-32kbitrate-mono-mp3 | riff-8khz-8bit-mono-mulaw |
-| audio-16khz-64kbitrate-mono-mp3 | riff-8khz-16bit-mono-pcm |
-| audio-16khz-128kbitrate-mono-mp3 | riff-22050hz-16bit-mono-pcm |
-| audio-24khz-16bit-24kbps-mono-opus | riff-24khz-16bit-mono-pcm |
-| audio-24khz-16bit-48kbps-mono-opus | riff-44100hz-16bit-mono-pcm |
-| audio-24khz-48kbitrate-mono-mp3 | riff-48khz-16bit-mono-pcm |
-| audio-24khz-96kbitrate-mono-mp3 | |
-| audio-24khz-160kbitrate-mono-mp3 | |
-| audio-48khz-96kbitrate-mono-mp3 | |
-| audio-48khz-192kbitrate-mono-mp3 | |
-| ogg-16khz-16bit-mono-opus | |
-| ogg-24khz-16bit-mono-opus | |
-| ogg-48khz-16bit-mono-opus | |
-| raw-8khz-8bit-mono-alaw | |
-| raw-8khz-8bit-mono-mulaw | |
-| raw-8khz-16bit-mono-pcm | |
-| raw-16khz-16bit-mono-pcm | |
-| raw-16khz-16bit-mono-truesilk | |
-| raw-22050hz-16bit-mono-pcm | |
-| raw-24khz-16bit-mono-pcm | |
-| raw-24khz-16bit-mono-truesilk | |
-| raw-44100hz-16bit-mono-pcm | |
-| raw-48khz-16bit-mono-pcm | |
-| webm-16khz-16bit-mono-opus | |
-| webm-24khz-16bit-24kbps-mono-opus | |
-| webm-24khz-16bit-mono-opus | |
+The supported streaming and non-streaming audio formats are sent in each request as the `X-Microsoft-OutputFormat` header. Each format incorporates a bit rate and encoding type. The Speech service supports 48-kHz, 24-kHz, 16-kHz, and 8-kHz audio outputs. Prebuilt neural voices are created from samples that use a 24-khz sample rate. All voices can upsample or downsample to other sample rates when synthesizing.
+
+#### [Streaming](#tab/streaming)
+
+```
+amr-wb-16000hz
+audio-16khz-16bit-32kbps-mono-opus
+audio-16khz-32kbitrate-mono-mp3
+audio-16khz-64kbitrate-mono-mp3
+audio-16khz-128kbitrate-mono-mp3
+audio-24khz-16bit-24kbps-mono-opus
+audio-24khz-16bit-48kbps-mono-opus
+audio-24khz-48kbitrate-mono-mp3
+audio-24khz-96kbitrate-mono-mp3
+audio-24khz-160kbitrate-mono-mp3
+audio-48khz-96kbitrate-mono-mp3
+audio-48khz-192kbitrate-mono-mp3
+ogg-16khz-16bit-mono-opus
+ogg-24khz-16bit-mono-opus
+ogg-48khz-16bit-mono-opus
+raw-8khz-8bit-mono-alaw
+raw-8khz-8bit-mono-mulaw
+raw-8khz-16bit-mono-pcm
+raw-16khz-16bit-mono-pcm
+raw-16khz-16bit-mono-truesilk
+raw-22050hz-16bit-mono-pcm
+raw-24khz-16bit-mono-pcm
+raw-24khz-16bit-mono-truesilk
+raw-44100hz-16bit-mono-pcm
+raw-48khz-16bit-mono-pcm
+webm-16khz-16bit-mono-opus
+webm-24khz-16bit-24kbps-mono-opus
+webm-24khz-16bit-mono-opus
+```
+
+#### [NonStreaming](#tab/nonstreaming)
+
+```
+riff-8khz-8bit-mono-alaw
+riff-8khz-8bit-mono-mulaw
+riff-8khz-16bit-mono-pcm
+riff-22050hz-16bit-mono-pcm
+riff-24khz-16bit-mono-pcm
+riff-44100hz-16bit-mono-pcm
+riff-48khz-16bit-mono-pcm
+```
+
+***
 
 > [!NOTE]
 > en-US-AriaNeural, en-US-JennyNeural and zh-CN-XiaoxiaoNeural are available in public preview in 48Khz output. Other voices support 24khz upsampled to 48khz output.