
Commit 3258896

Merge pull request #273173 from eric-urban/eur/openai-voices
openai voices differences
2 parents 114dac7 + 2ac4d84 commit 3258896

File tree

3 files changed: +10 -5 lines changed


articles/ai-services/speech-service/includes/language-support/multilingual-voices.md

Lines changed: 1 addition & 1 deletion
@@ -16,4 +16,4 @@ ms.author: v-baolianzou
 
 <sup>2</sup> The neural voice is a multilingual voice in Azure AI Speech. All multilingual voices can speak in the language in default locale of the input text without [using SSML](../../speech-synthesis-markup-voice.md#adjust-speaking-languages). However, you can still use the `<lang xml:lang>` element to adjust the speaking accent of each language to set preferred accent such as British accent (`en-GB`) for English. Check the [full list](https://speech.microsoft.com/portal/voicegallery) of supported locales through SSML.
 
-<sup>3</sup> The OpenAI text to speech voices in Azure AI Speech are in public preview and only available in North Central US (`northcentralus`) and Sweden Central (`swedencentral`). Locales not listed for OpenAI voices aren't supported by design.
+<sup>3</sup> The OpenAI text to speech voices in Azure AI Speech are in public preview and only available in North Central US (`northcentralus`) and Sweden Central (`swedencentral`). Locales not listed for OpenAI voices aren't supported. For information about additional differences between OpenAI text to speech voices and Azure AI Speech text to speech voices, see [OpenAI text to speech voices](../../openai-voices.md#openai-text-to-speech-voices-via-azure-openai-service-or-via-azure-ai-speech).

articles/ai-services/speech-service/includes/language-support/tts.md

Lines changed: 1 addition & 1 deletion
@@ -165,5 +165,5 @@ ms.custom: references_regions
 
 <sup>3</sup> The neural voice is a multilingual voice in Azure AI Speech.
 
-<sup>4</sup> The OpenAI text to speech voices in Azure AI Speech are in public preview and only available in North Central US (`northcentralus`) and Sweden Central (`swedencentral`).
+<sup>4</sup> The OpenAI text to speech voices in Azure AI Speech are in public preview and only available in North Central US (`northcentralus`) and Sweden Central (`swedencentral`). Locales not listed for OpenAI voices aren't supported. For information about additional differences between OpenAI text to speech voices and Azure AI Speech text to speech voices, see [OpenAI text to speech voices](../../openai-voices.md#openai-text-to-speech-voices-via-azure-openai-service-or-via-azure-ai-speech).
 
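As a rough illustration of the region restriction described in both updated footnotes, a minimal Speech SDK for Python sketch might look like the following. This isn't part of the commit; the voice name is a placeholder rather than an official OpenAI voice identifier, and the exact names should be taken from the voice gallery.

```python
# Minimal sketch (assumptions: placeholder key and voice name) showing that the Speech
# resource region must be one of the two listed regions for OpenAI voices to be available.
import azure.cognitiveservices.speech as speechsdk

# Per the updated notes, OpenAI voices require `northcentralus` or `swedencentral`.
speech_config = speechsdk.SpeechConfig(subscription="YOUR_SPEECH_KEY", region="swedencentral")
speech_config.speech_synthesis_voice_name = "en-US-NovaMultilingualNeural"  # placeholder name

synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
result = synthesizer.speak_text_async("OpenAI voices in Azure AI Speech are region limited.").get()

if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print(f"Synthesized {len(result.audio_data)} bytes of audio.")
```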

articles/ai-services/speech-service/openai-voices.md

Lines changed: 8 additions & 3 deletions
@@ -7,7 +7,7 @@ ms.author: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 2/1/2024
+ms.date: 4/23/2024
 ms.reviewer: v-baolianzou
 ms.custom: references_regions
 #customer intent: As a user who implements text to speech, I want to understand the options and differences between available OpenAI text to speech voices in Azure AI services.
@@ -55,13 +55,18 @@ Here's a comparison of features between OpenAI text to speech voices in Azure Op
 | **Real-time or batch synthesis** | Real-time | Real-time and batch synthesis | Real-time and batch synthesis |
 | **Latency** | greater than 500 ms | greater than 500 ms | less than 300 ms |
 | **Sample rate of synthesized audio** | 24 kHz | 8, 16, 24, and 48 kHz | 8, 16, 24, and 48 kHz |
-| **Speech output audio format** | opus, mp3, aac, flac | opus, mp3, pcm, truesilk | opus, mp3, pcm, truesilk |
+| **Speech output audio format** | opus, mp3, aac, flac | opus, mp3, pcm, truesilk | opus, mp3, pcm, truesilk |
+
+There are additional features and capabilities available in Azure AI Speech that aren't available with OpenAI voices. For example:
+- OpenAI text to speech voices in Azure AI Speech [only support a subset of SSML elements](#ssml-elements-supported-by-openai-text-to-speech-voices-in-azure-ai-speech). Azure AI Speech voices support the full set of SSML elements.
+- Azure AI Speech supports [word boundary events](./how-to-speech-synthesis.md#subscribe-to-synthesizer-events). OpenAI voices don't support word boundary events.
+
 
 ## SSML elements supported by OpenAI text to speech voices in Azure AI Speech
 
 The [Speech Synthesis Markup Language (SSML)](./speech-synthesis-markup.md) with input text determines the structure, content, and other characteristics of the text to speech output. For example, you can use SSML to define a paragraph, a sentence, a break or a pause, or silence. You can wrap text with event tags such as bookmark or viseme that can be processed later by your application.
 
-The following table outlines the Speech Synthesis Markup Language (SSML) elements supported by OpenAI text to speech voices in Azure AI speech. Only a subset of SSML tags are supported for OpenAI voices. See [SSML document structure and events](speech-synthesis-markup-structure.md) for more information.
+The following table outlines the Speech Synthesis Markup Language (SSML) elements supported by OpenAI text to speech voices in Azure AI speech. Only the following subset of SSML tags are supported for OpenAI voices. See [SSML document structure and events](speech-synthesis-markup-structure.md) for more information.
 
 | SSML element name | Description |
 | --- | --- |
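The differences the added lines call out (SSML subset, word boundary events, and the listed output formats) could be exercised with a minimal Speech SDK for Python sketch like the one below. This is not from the changed docs: the SSML uses only the basic `speak` and `voice` elements, the voice name is a stand-in, and the authoritative list of SSML elements supported by OpenAI voices is the table that this hunk introduces in openai-voices.md.

```python
# Minimal sketch (assumptions: placeholder key, stand-in voice name) exercising an output
# format from the comparison table, a basic SSML document, and the word boundary event.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="YOUR_SPEECH_KEY", region="swedencentral")

# Request one of the sample rates/formats listed in the comparison table (24 kHz MP3 here).
speech_config.set_speech_synthesis_output_format(
    speechsdk.SpeechSynthesisOutputFormat.Audio24Khz96KBitRateMonoMp3
)

synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)

# Per the added bullet, word boundary events fire for Azure AI Speech voices
# but aren't supported for OpenAI voices.
def on_word_boundary(evt):
    print(f"Word boundary at audio offset {evt.audio_offset}")

synthesizer.synthesis_word_boundary.connect(on_word_boundary)

# SSML limited to basic elements; richer elements may fall outside the supported subset
# for OpenAI voices. "en-US-JennyNeural" is a stand-in Azure AI Speech voice name.
ssml = """
<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
  <voice name='en-US-JennyNeural'>Text to speech voices can differ in SSML support.</voice>
</speak>
"""
result = synthesizer.speak_ssml_async(ssml).get()
print(result.reason)
```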
