-Azure AI speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.
+Azure AI Speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.
 
 ## Key features of neural text to speech HD voices
 
@@ -52,22 +52,30 @@ Here's a comparison of features between Azure AI Speech HD voices, Azure OpenAI
 
 ## Supported Azure AI Speech HD voices
 
+The Azure AI Speech HD voice values are in the format `voicename:basemodel:version`. The name before the colon, such as `en-US-Ava`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
+
+Currently, `DragonHD` is the only base model available for Azure AI Speech HD voices. One version of the base model (`v1Neural`) is available for each voice persona. To ensure that you're using the latest version of the base model that we provide without having to make a code change, use the `LatestNeural` version.
+
+For example, for the persona `en-US-Ava` you can specify two HD voice values:
+- `en-US-Ava:DragonHDLatestNeural`: Always uses the latest version of the base model that we provide later.
+- `en-US-Ava:DragonHDv1Neural`: Always uses the `v1Neural` version of the base model. When we release a new version of the base model, you need to update your code to use the new version.
+
 The following table lists the Azure AI Speech HD voices that are currently available.
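
Note on the hunk above: the two `en-US-Ava` values can be composed programmatically. The Python sketch below is illustrative only. It assumes the base model (`DragonHD`) and version (`LatestNeural` or `v1Neural`) are concatenated directly after the single colon, as in the two example values; `hd_voice_value` is a hypothetical helper, not part of the Speech SDK.

```python
def hd_voice_value(persona: str,
                   base_model: str = "DragonHD",
                   version: str = "LatestNeural") -> str:
    """Compose an HD voice value such as 'en-US-Ava:DragonHDLatestNeural'.

    Assumption: base model and version are joined without a separator,
    matching the two example values in the diff.
    """
    if not persona or ":" in persona:
        raise ValueError("persona must be a plain voice name such as 'en-US-Ava'")
    return f"{persona}:{base_model}{version}"


# Track the latest base model version without code changes.
latest = hd_voice_value("en-US-Ava")

# Pin to a specific base model version; needs a code change to upgrade.
pinned = hd_voice_value("en-US-Ava", version="v1Neural")
```

Pinning trades automatic upgrades for reproducible output, which may matter when synthesized audio is cached or reviewed.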
@@ -79,14 +87,10 @@ Here are some key points to consider when using Azure AI Speech HD voices:
 - **Voice locale**: The locale in the voice name indicates its original language and region.
 - **Base models**:
     - HD voices come with a base model that understands the input text and predicts the speaking pattern accordingly. You can specify the desired model (such as DragonHDLatestNeural) according to the availability of each voice.
-- **SSML usage**: To reference a voice in SSML, use the format `voicename:basemodel`. The name before the colon, such as `en-US-Andrew`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
+- **SSML usage**: To reference a voice in SSML, use the format `voicename:basemodel:version`. The name before the colon, such as `de-DE-Seraphina`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
 - **Temperature parameter**:
-    - The temperature value is a float ranging from 0 to 1, influencing the randomness of the output.
-    - You can also adjust the temperature parameter to control the variation of outputs.
-    - **Lower temperature**: Results in less randomness, leading to more predictable outputs.
-    - **Higher temperature**: Increases randomness, allowing for more diverse outputs.
-    - The default temperature is set at 1.0.
-    - Less randomness yields more stable results, while more randomness offers variety but less consistency.
+    - The temperature value is a float ranging from 0 to 1, influencing the randomness of the output. You can also adjust the temperature parameter to control the variation of outputs. Less randomness yields more stable results, while more randomness offers variety but less consistency.
+    - Lower temperature results in less randomness, leading to more predictable outputs. Higher temperature increases randomness, allowing for more diverse outputs. The default temperature is set at 1.0.
 
 Here's an example of how to use Azure AI Speech HD voices in SSML:
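
Note on the hunk above: the article's own SSML example lies outside this hunk, so the Python sketch below only illustrates how an HD voice value and a temperature might be combined into an SSML string. It assumes the temperature is passed through a `parameters='temperature=…'` attribute on the `<voice>` element; that attribute name is an assumption to verify against the article. `build_hd_ssml` is a hypothetical helper, not a Speech SDK function.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_hd_ssml(text: str,
                  voice: str = "en-US-Ava:DragonHDLatestNeural",
                  temperature: Optional[float] = None,
                  lang: str = "en-US") -> str:
    """Return an SSML document that references an HD voice.

    Assumption: temperature is supplied via a `parameters` attribute on
    <voice>; the diff only states that the value ranges from 0 to 1.
    """
    voice_attrs = f"name={quoteattr(voice)}"
    if temperature is not None:
        # Enforce the documented range of 0 to 1.
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
        param = f"temperature={temperature}"
        voice_attrs += f" parameters={quoteattr(param)}"
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice {voice_attrs}>{escape(text)}</voice>"
        "</speak>"
    )


# A lower temperature asks for more predictable output.
ssml = build_hd_ssml("Hello & welcome to the demo.", temperature=0.8)
```

Escaping the text and quoting the attribute values keeps the SSML well-formed when the input contains characters such as `&` or `<`.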
articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md
+7 -1 (7 additions & 1 deletion)
@@ -2,11 +2,17 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 9/30/2024
+ms.date: 10/9/2024
 ms.author: eur
 ms.custom: references_regions
 ---
 
+### October 2024 release
+
+#### Prebuilt high definition (HD) neural voice
+
+Azure AI speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features. For more information, see [What are Azure AI Speech high definition (HD) voices?](../../high-definition-voices.md).
articles/ai-services/speech-service/releasenotes.md
+3 -3 (3 additions & 3 deletions)
@@ -7,7 +7,7 @@ author: eric-urban
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: release-notes
-ms.date: 9/30/2024
+ms.date: 10/9/2024
 ms.custom: references_regions
 # Customer intent: As a developer, I want to learn about new releases and features for Azure AI Speech.
 ---
@@ -18,9 +18,9 @@ Azure AI Speech is updated on an ongoing basis. To stay up-to-date with recent d
 
 ## Recent highlights
 
-* Fast transcription is now available in public preview. Fast transcription allows you to transcribe audio file to text accurately and synchronously, and supports diarization to recognize and separate multiple speakers on mono channel audio. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
+* Azure AI speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. For more information, see [What are Azure AI Speech high definition (HD) voices?](../../high-definition-voices.md).
+* Fast transcription is now available in public preview. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
 * Video translation is now available in the Azure AI Speech service. For more information, see [What is video translation?](./video-translation-overview.md).
-* Personal voice is now generally available. For more information, see [What is personal voice?](./personal-voice-overview.md).
 * The Azure AI Speech service supports OpenAI text to speech voices. For more information, see [What are OpenAI text to speech voices?](./openai-voices.md).
 * The custom voice API is available for creating and managing [professional](./professional-voice-create-project.md) and [personal](./personal-voice-create-project.md) custom neural voice models.