Commit 0a90aac

hd voices
1 parent d354631 commit 0a90aac

3 files changed

+36
-26
lines changed

articles/ai-services/speech-service/high-definition-voices.md

Lines changed: 26 additions & 22 deletions
@@ -17,7 +17,7 @@ ms.custom: references_regions

 [!INCLUDE [Feature preview](../includes/preview-feature.md)]

-Azure AI speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.
+Azure AI Speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.

 ## Key features of neural text to speech HD voices

@@ -52,22 +52,30 @@ Here's a comparison of features between Azure AI Speech HD voices, Azure OpenAI

 ## Supported Azure AI Speech HD voices

+The Azure AI Speech HD voice values are in the format `voicename:basemodel:version`. The name before the colon, such as `en-US-Ava`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
+
+Currently, `DragonHD` is the only base model available for Azure AI Speech HD voices. One version of the base model (`v1Neural`) is available for each voice persona. To ensure that you're using the latest version of the base model that we provide without having to make a code change, use the `LatestNeural` version.
+
+For example, for the persona `en-US-Ava` you can specify two HD voice values:
+- `en-US-Ava:DragonHDLatestNeural`: Always uses the latest version of the base model that we provide later.
+- `en-US-Ava:DragonHDv1Neural`: Always uses the `v1Neural` version of the base model. When we release a new version of the base model, you need to update your code to use the new version.
+
 The following table lists the Azure AI Speech HD voices that are currently available.

-| HD voice name | Neural voice persona | Locale |
-|---------------|----------------------|--------|
-| de-DE-Seraphina:DragonHDLatestNeural | de-DE-Seraphina | de-DE |
-| en-US-Andrew:DragonHDLatestNeural | en-US-Andrew | en-US |
-| en-US-Andrew2:DragonHDLatestNeural | en-US-Andrew2 | en-US |
-| en-US-Aria:DragonHDLatestNeural | en-US-Aria | en-US |
-| en-US-Ava:DragonHDLatestNeural | en-US-Ava | en-US |
-| en-US-Davis:DragonHDLatestNeural | en-US-Davis | en-US |
-| en-US-Emma:DragonHDLatestNeural | en-US-Emma | en-US |
-| en-US-Emma2:DragonHDLatestNeural | en-US-Emma2 | en-US |
-| en-US-Jenny:DragonHDLatestNeural | en-US-Jenny | en-US |
-| en-US-Steffan:DragonHDLatestNeural | en-US-Steffan | en-US |
-| ja-JP-Masaru:DragonHDLatestNeural | ja-JP-Masaru | ja-JP |
-| zh-CN-Xiaochen:DragonHDLatestNeural | zh-CN-Xiaochen | zh-CN |
+| Neural voice persona | HD voices |
+|----------------------|-----------|
+| de-DE-Seraphina | de-DE-Seraphina:DragonHDLatestNeural<br/>de-DE-Seraphina:DragonHDv1Neural |
+| en-US-Andrew | en-US-Andrew:DragonHDLatestNeural<br/>en-US-Andrew:DragonHDv1Neural |
+| en-US-Andrew2 | en-US-Andrew2:DragonHDLatestNeural<br/>en-US-Andrew2:DragonHDv1Neural |
+| en-US-Aria | en-US-Aria:DragonHDLatestNeural<br/>en-US-Aria:DragonHDv1Neural |
+| en-US-Ava | en-US-Ava:DragonHDLatestNeural<br/>en-US-Ava:DragonHDv1Neural |
+| en-US-Davis | en-US-Davis:DragonHDLatestNeural<br/>en-US-Davis:DragonHDv1Neural |
+| en-US-Emma | en-US-Emma:DragonHDLatestNeural<br/>en-US-Emma:DragonHDv1Neural |
+| en-US-Emma2 | en-US-Emma2:DragonHDLatestNeural<br/>en-US-Emma2:DragonHDv1Neural |
+| en-US-Jenny | en-US-Jenny:DragonHDLatestNeural<br/>en-US-Jenny:DragonHDv1Neural |
+| en-US-Steffan | en-US-Steffan:DragonHDLatestNeural<br/>en-US-Steffan:DragonHDv1Neural |
+| ja-JP-Masaru | ja-JP-Masaru:DragonHDLatestNeural<br/>ja-JP-Masaru:DragonHDv1Neural |
+| zh-CN-Xiaochen | zh-CN-Xiaochen:DragonHDLatestNeural<br/>zh-CN-Xiaochen:DragonHDv1Neural |


 ## How to use Azure AI Speech HD voices
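
The naming scheme added in the hunk above (persona, then a colon, then base model plus version) can be sketched as a small helper. This is a minimal illustration, not part of the commit: `build_hd_voice`, `parse_hd_voice`, and `HdVoice` are hypothetical names, and the single-colon layout follows the listed values such as `en-US-Ava:DragonHDLatestNeural` rather than the three-part `voicename:basemodel:version` wording in the prose.

```python
from typing import NamedTuple


class HdVoice(NamedTuple):
    """Parts of an HD voice value (hypothetical helper type)."""
    persona: str        # e.g. "en-US-Ava"
    locale: str         # e.g. "en-US"
    model_version: str  # e.g. "DragonHDLatestNeural"


def build_hd_voice(persona: str,
                   base_model: str = "DragonHD",
                   version: str = "LatestNeural") -> str:
    """Join a persona with a base model and version, as in the table above."""
    return f"{persona}:{base_model}{version}"


def parse_hd_voice(value: str) -> HdVoice:
    """Split an HD voice value at its colon and derive the locale."""
    persona, sep, model_version = value.partition(":")
    if not sep or not persona or not model_version:
        raise ValueError(f"not an HD voice value: {value!r}")
    parts = persona.split("-")
    if len(parts) < 3:
        raise ValueError(f"persona lacks a locale prefix: {persona!r}")
    # The first two dash-separated parts are language and region.
    return HdVoice(persona, f"{parts[0]}-{parts[1]}", model_version)
```

Pinning to `v1Neural` would then be `build_hd_voice("en-US-Ava", version="v1Neural")`, while the default arguments track `LatestNeural`.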
@@ -79,14 +87,10 @@ Here are some key points to consider when using Azure AI Speech HD voices:
 - **Voice locale**: The locale in the voice name indicates its original language and region.
 - **Base models**:
 - HD voices come with a base model that understands the input text and predicts the speaking pattern accordingly. You can specify the desired model (such as DragonHDLatestNeural) according to the availability of each voice.
-- **SSML usage**: To reference a voice in SSML, use the format `voicename:basemodel`. The name before the colon, such as `en-US-Andrew`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
+- **SSML usage**: To reference a voice in SSML, use the format `voicename:basemodel:version`. The name before the colon, such as `de-DE-Seraphina`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
 - **Temperature parameter**:
-- The temperature value is a float ranging from 0 to 1, influencing the randomness of the output.
-- You can also adjust the temperature parameter to control the variation of outputs.
-- **Lower temperature**: Results in less randomness, leading to more predictable outputs.
-- **Higher temperature**: Increases randomness, allowing for more diverse outputs.
-- The default temperature is set at 1.0.
-- Less randomness yields more stable results, while more randomness offers variety but less consistency.
+- The temperature value is a float ranging from 0 to 1, influencing the randomness of the output. You can also adjust the temperature parameter to control the variation of outputs. Less randomness yields more stable results, while more randomness offers variety but less consistency.
+- Lower temperature results in less randomness, leading to more predictable outputs. Higher temperature increases randomness, allowing for more diverse outputs. The default temperature is set at 1.0.

 Here's an example of how to use Azure AI Speech HD voices in SSML:

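The hunk's context stops before the article's own SSML sample, so the following is only a rough sketch of how the SSML and temperature points above might fit together. `build_hd_ssml` is a hypothetical helper, and passing temperature through a `parameters` attribute on `<voice>` is an assumption that this diff does not show; check the published article before relying on it.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_hd_ssml(text: str,
                  voice: str = "en-US-Ava:DragonHDLatestNeural",
                  temperature: Optional[float] = None,
                  lang: str = "en-US") -> str:
    """Build an SSML string that references an HD voice value."""
    attrs = f"name={quoteattr(voice)}"
    if temperature is not None:
        # The article describes temperature as a float from 0 to 1.
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
        # Assumption: temperature travels in a `parameters` attribute.
        setting = f"temperature={temperature}"
        attrs += f" parameters={quoteattr(setting)}"
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice {attrs}>{escape(text)}</voice>"
        "</speak>"
    )
```

Leaving `temperature` as `None` omits the attribute, so the service default applies; a lower value such as `0.3` would favor the more stable output the bullets describe.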
articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md

Lines changed: 7 additions & 1 deletion
@@ -2,11 +2,17 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 9/30/2024
+ms.date: 10/9/2024
 ms.author: eur
 ms.custom: references_regions
 ---

+### October 2024 release
+
+#### Prebuilt high definition (HD) neural voice
+
+Azure AI speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features. For more information, see [What are Azure AI Speech high definition (HD) voices?](../../high-definition-voices.md).
+
 ### September 2024 release

 #### Prebuilt neural voice

articles/ai-services/speech-service/releasenotes.md

Lines changed: 3 additions & 3 deletions
@@ -7,7 +7,7 @@ author: eric-urban
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: release-notes
-ms.date: 9/30/2024
+ms.date: 10/9/2024
 ms.custom: references_regions
 # Customer intent: As a developer, I want to learn about new releases and features for Azure AI Speech.
 ---
@@ -18,9 +18,9 @@ Azure AI Speech is updated on an ongoing basis. To stay up-to-date with recent d

 ## Recent highlights

-* Fast transcription is now available in public preview. Fast transcription allows you to transcribe audio file to text accurately and synchronously, and supports diarization to recognize and separate multiple speakers on mono channel audio. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
+* Azure AI speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. For more information, see [What are Azure AI Speech high definition (HD) voices?](../../high-definition-voices.md).
+* Fast transcription is now available in public preview. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
 * Video translation is now available in the Azure AI Speech service. For more information, see [What is video translation?](./video-translation-overview.md).
-* Personal voice is now generally available. For more information, see [What is personal voice?](./personal-voice-overview.md).
 * The Azure AI Speech service supports OpenAI text to speech voices. For more information, see [What are OpenAI text to speech voices?](./openai-voices.md).
 * The custom voice API is available for creating and managing [professional](./professional-voice-create-project.md) and [personal](./personal-voice-create-project.md) custom neural voice models.
