articles/ai-services/speech-service/high-definition-voices.md
37 additions & 20 deletions
@@ -8,14 +8,12 @@ ms.reviewer: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 10/9/2024
+ms.date: 4/8/2025
 ms.custom: references_regions
 #customer intent: As a user who implements text to speech, I want to understand the options and differences between available neural text to speech HD voices in Azure AI Speech.
Azure AI Speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.
@@ -29,7 +27,6 @@ The following are the key features of Azure AI Speech HD voices:
 |**Conversational**| Neural text to speech HD voices can replicate natural speech patterns, including spontaneous pauses and emphasis. When given conversational text, the model can reproduce common phonemes like pauses and filler words. The generated voice sounds as if someone is conversing directly with you. |
 |**Prosody variations**| Neural text to speech HD voices introduce slight variations in each output to enhance realism. These variations make the speech sound more natural, as human voices naturally exhibit variation. |
 |**High fidelity**| The primary objective of neural text to speech HD voices is to generate high-fidelity audio. The synthetic speech produced by our system can closely mimic human speech in both quality and naturalness. |
-|**Version control**| With neural text to speech HD voices, we release different versions of the same voice, each with a unique base model size and recipe. This offers you the opportunity to experience new voice variations or continue using a specific version of a voice. |

 ## Comparison of Azure AI Speech HD voices to other Azure text to speech voices
@@ -61,21 +58,41 @@ For example, for the persona `en-US-Ava` you can specify the following HD voice
 The following table lists the Azure AI Speech HD voices that are currently available.
<sup>1</sup> The neural voice is available in public preview. Voices and styles in public preview are only available in these service [regions](../../regions.md): East US, West Europe, and Southeast Asia.
articles/ai-services/speech-service/includes/language-support/stt.md
1 addition & 1 deletion
@@ -134,7 +134,7 @@ ms.author: eur
 |`so-SO`| Somali (Somalia) | No | Plain text |
 |`sq-AL`| Albanian (Albania) | No | Plain text |
 |`sr-RS`| Serbian (Cyrillic, Serbia) | No | Plain text |
-|`sv-SE`| Swedish (Sweden) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
+|`sv-SE`| Swedish (Sweden) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
 |`sw-KE`| Kiswahili (Kenya) | No | Plain text |
 |`sw-TZ`| Kiswahili (Tanzania) | No | Plain text |