Commit b32a199

Merge pull request #3993 from eric-urban/eur/voices-locales
update voices and locales
2 parents ec7fe80 + 49e4d2a commit b32a199

9 files changed: +155 -35 lines changed

articles/ai-services/speech-service/high-definition-voices.md

Lines changed: 37 additions & 20 deletions
@@ -8,14 +8,12 @@ ms.reviewer: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 10/9/2024
+ms.date: 4/8/2025
 ms.custom: references_regions
 #customer intent: As a user who implements text to speech, I want to understand the options and differences between available neural text to speech HD voices in Azure AI Speech.
 ---
 
-# What are high definition voices? (Preview)
-
-[!INCLUDE [Feature preview](../includes/preview-feature.md)]
+# What are high definition voices?
 
 Azure AI Speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.
 
@@ -29,7 +27,6 @@ The following are the key features of Azure AI Speech HD voices:
 | **Conversational** | Neural text to speech HD voices can replicate natural speech patterns, including spontaneous pauses and emphasis. When given conversational text, the model can reproduce common phonemes like pauses and filler words. The generated voice sounds as if someone is conversing directly with you. |
 | **Prosody variations** | Neural text to speech HD voices introduce slight variations in each output to enhance realism. These variations make the speech sound more natural, as human voices naturally exhibit variation. |
 | **High fidelity** | The primary objective of neural text to speech HD voices is to generate high-fidelity audio. The synthetic speech produced by our system can closely mimic human speech in both quality and naturalness. |
-| **Version control** | With neural text to speech HD voices, we release different versions of the same voice, each with a unique base model size and recipe. This offers you the opportunity to experience new voice variations or continue using a specific version of a voice. |
 
 ## Comparison of Azure AI Speech HD voices to other Azure text to speech voices
 
@@ -61,21 +58,41 @@ For example, for the persona `en-US-Ava` you can specify the following HD voice
 
 The following table lists the Azure AI Speech HD voices that are currently available.
 
-| Neural voice persona | HD voices |
-|----------------------|-----------|
-| de-DE-Seraphina | de-DE-Seraphina:DragonHDLatestNeural|
-| en-US-Andrew | en-US-Andrew:DragonHDLatestNeural|
-| en-US-Andrew2 | en-US-Andrew2:DragonHDLatestNeural|
-| en-US-Aria | en-US-Aria:DragonHDLatestNeural |
-| en-US-Ava | en-US-Ava:DragonHDLatestNeural|
-| en-US-Brian | en-US-Brian:DragonHDLatestNeural|
-| en-US-Davis | en-US-Davis:DragonHDLatestNeural|
-| en-US-Emma | en-US-Emma:DragonHDLatestNeural |
-| en-US-Emma2 | en-US-Emma2:DragonHDLatestNeural |
-| en-US-Jenny | en-US-Jenny:DragonHDLatestNeural |
-| en-US-Steffan | en-US-Steffan:DragonHDLatestNeural |
-| ja-JP-Masaru | ja-JP-Masaru:DragonHDLatestNeural|
-| zh-CN-Xiaochen | zh-CN-Xiaochen:DragonHDLatestNeural |
+| Persona | Full Name | Status |
+|-----------|-----------|--------|
+| de-DE-Florian | de-DE-Florian:DragonHDLatestNeural | GA |
+| de-DE-Seraphina | de-DE-Seraphina:DragonHDLatestNeural | GA |
+| en-US-Adam | en-US-Adam:DragonHDLatestNeural | GA |
+| en-US-Andrew | en-US-Andrew:DragonHDLatestNeural | GA |
+| en-US-Andrew2 | en-US-Andrew2:DragonHDLatestNeural | GA |
+| en-US-Ava | en-US-Ava:DragonHDLatestNeural | GA |
+| en-US-Brian | en-US-Brian:DragonHDLatestNeural | GA |
+| en-US-Davis | en-US-Davis:DragonHDLatestNeural | GA |
+| en-US-Emma | en-US-Emma:DragonHDLatestNeural | GA |
+| en-US-Emma | en-US-Emma2:DragonHDLatestNeural | GA |
+| en-US-Steffan | en-US-Steffan:DragonHDLatestNeural | GA |
+| en-US-Alloy | en-US-Alloy:DragonHDLatestNeural | Preview |
+| en-US-Andrew | en-US-Andrew3:DragonHDLatestNeural | Preview |
+| en-US-Aria | en-US-Aria:DragonHDLatestNeural | Preview |
+| en-US-Ava | en-US-Ava3:DragonHDLatestNeural | Preview |
+| en-US-Jenny | en-US-Jenny:DragonHDLatestNeural | Preview |
+| en-US-MultiTalker-Ava-Andrew | en-US-MultiTalker-Ava-Andrew:DragonHDLatestNeural | Preview |
+| en-US-Nova | en-US-Nova:DragonHDLatestNeural | Preview |
+| en-US-Phoebe | en-US-Phoebe:DragonHDLatestNeural | Preview |
+| en-US-Serena | en-US-Serena:DragonHDLatestNeural | Preview |
+| es-ES-Tristan | es-ES-Tristan:DragonHDLatestNeural | GA |
+| es-ES-Ximena | es-ES-Ximena:DragonHDLatestNeural | GA |
+| fr-FR-Remy | fr-FR-Remy:DragonHDLatestNeural | GA |
+| fr-FR-Vivienne | fr-FR-Vivienne:DragonHDLatestNeural | GA |
+| ja-JP-Masaru | ja-JP-Masaru:DragonHDLatestNeural | GA |
+| ja-JP-Nanami | ja-JP-Nanami:DragonHDLatestNeural | GA |
+| zh-CN-Xiaochen | zh-CN-Xiaochen:DragonHDFlashLatestNeural | Preview |
+| zh-CN-Xiaoxiao | zh-CN-Xiaoxiao:DragonHDFlashLatestNeural | Preview |
+| zh-CN-Xiaoxiao2 | zh-CN-Xiaoxiao2:DragonHDFlashLatestNeural | Preview |
+| zh-CN-Yunxiao | zh-CN-Yunxiao:DragonHDFlashLatestNeural | Preview |
+| zh-CN-Yunyi | zh-CN-Yunyi:DragonHDFlashLatestNeural | Preview |
+| zh-CN-Xiaochen | zh-CN-Xiaochen:DragonHDLatestNeural | GA |
+| zh-CN-Yunfan | zh-CN-Yunfan:DragonHDLatestNeural | GA |
 
 ## How to use Azure AI Speech HD voices
 

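Reviewer note: the "Full Name" values in the table added above all have the shape `<voice>:<model>`, for example `en-US-Ava:DragonHDLatestNeural` or `zh-CN-Xiaochen:DragonHDFlashLatestNeural`. The following is a minimal Python sketch of how such a name might be split and placed into an SSML `voice` element. It uses only the standard library. The helper names `split_hd_voice` and `build_ssml` are hypothetical and are not part of any Azure SDK, and deriving the locale from the first two name segments is an assumption based on the names in this diff.

```python
from xml.sax.saxutils import escape, quoteattr


def split_hd_voice(full_name: str) -> tuple[str, str]:
    """Split 'en-US-Ava:DragonHDLatestNeural' into (voice, model).

    Hypothetical helper: assumes exactly the '<voice>:<model>' shape
    seen in the table in this diff.
    """
    voice, sep, model = full_name.partition(":")
    if not sep or not voice or not model:
        raise ValueError(f"not an HD voice name: {full_name!r}")
    return voice, model


def build_ssml(full_name: str, text: str) -> str:
    """Wrap text in a minimal SSML document that selects the given voice."""
    voice, _model = split_hd_voice(full_name)
    # Assumption: the locale is the first two dash-separated segments.
    locale = "-".join(voice.split("-")[:2])
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(full_name)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("en-US-Ava:DragonHDLatestNeural", "Hello from an HD voice."))
```

The resulting string would be passed to whatever SSML-based synthesis call the application already uses; that call is outside the scope of this sketch.
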
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+author: eric-urban
+ms.service: azure-ai-speech
+ms.date: 4/8/2025
+ms.topic: include
+ms.author: eur
+---
+
+| Locale (BCP-47) | Language | Text to speech voices |
+| ----- | ----- | ----- |
+| `en-US` | English (United States) |
+| `en-US-MultiTalker-Ava-Andrew:DragonHDLatestNeural`<sup>1</sup> (Neutral) |
+
+<sup>1</sup> The neural voice is available in public preview. Voices and styles in public preview are only available in these service [regions](../../regions.md): East US, West Europe, and Southeast Asia.
+

articles/ai-services/speech-service/includes/language-support/multilingual-voices.md

Lines changed: 7 additions & 2 deletions
Large diffs are not rendered by default.

articles/ai-services/speech-service/includes/language-support/stt.md

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ ms.author: eur
 | `so-SO` | Somali (Somalia) | No | Plain text |
 | `sq-AL` | Albanian (Albania) | No | Plain text |
 | `sr-RS` | Serbian (Cyrillic, Serbia) | No | Plain text |
-| `sv-SE` | Swedish (Sweden) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
+| `sv-SE` | Swedish (Sweden) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
 | `sw-KE` | Kiswahili (Kenya) | No | Plain text |
 | `sw-TZ` | Kiswahili (Tanzania) | No | Plain text |
 | `ta-IN` | Tamil (India) | No | Plain text |
