hd voices

eric-urban · eric-urban · commit 0a90aac90af8 · 2024-10-09T09:05:52.000-07:00
diff --git a/articles/ai-services/speech-service/high-definition-voices.md b/articles/ai-services/speech-service/high-definition-voices.md
@@ -17,7 +17,7 @@ ms.custom: references_regions
 
 [!INCLUDE [Feature preview](../includes/preview-feature.md)]
 
-Azure AI speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. HD voices maintain a consistent voice persona from their neural (and non HD) counterparts, and deliver even more value through enhanced features.
+Azure AI Speech continues to advance in the field of text to speech technology with the introduction of neural text to speech high definition (HD) voices. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real time to match the sentiment. HD voices maintain a consistent voice persona with their neural (non-HD) counterparts, and deliver even more value through enhanced features.
 
 ## Key features of neural text to speech HD voices
 
@@ -52,22 +52,30 @@ Here's a comparison of features between Azure AI Speech HD voices, Azure OpenAI
 
 ## Supported Azure AI Speech HD voices
 
+The Azure AI Speech HD voice values are in the format `voicename:basemodel:version`. The name before the colon, such as `en-US-Ava`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
+
+Currently, `DragonHD` is the only base model available for Azure AI Speech HD voices. One version of the base model (`v1Neural`) is available for each voice persona. To ensure that you're using the latest version of the base model that we provide without having to make a code change, use the `LatestNeural` version.
+
+For example, for the persona `en-US-Ava` you can specify two HD voice values:
+- `en-US-Ava:DragonHDLatestNeural`: Always uses the latest version of the base model that we provide.
+- `en-US-Ava:DragonHDv1Neural`: Always uses the `v1Neural` version of the base model. When we release a new version of the base model, you need to update your code to use the new version.
+
 The following table lists the Azure AI Speech HD voices that are currently available.
 
-| HD voice name | Neural voice persona | Locale |
-|---------------|----------------------|--------|
-| de-DE-Seraphina:DragonHDLatestNeural | de-DE-Seraphina | de-DE |
-| en-US-Andrew:DragonHDLatestNeural | en-US-Andrew | en-US |
-| en-US-Andrew2:DragonHDLatestNeural | en-US-Andrew2 | en-US |
-| en-US-Aria:DragonHDLatestNeural | en-US-Aria | en-US |
-| en-US-Ava:DragonHDLatestNeural | en-US-Ava | en-US |
-| en-US-Davis:DragonHDLatestNeural | en-US-Davis | en-US |
-| en-US-Emma:DragonHDLatestNeural | en-US-Emma | en-US |
-| en-US-Emma2:DragonHDLatestNeural | en-US-Emma2 | en-US |
-| en-US-Jenny:DragonHDLatestNeural | en-US-Jenny | en-US |
-| en-US-Steffan:DragonHDLatestNeural | en-US-Steffan | en-US |
-| ja-JP-Masaru:DragonHDLatestNeural | ja-JP-Masaru | ja-JP |
-| zh-CN-Xiaochen:DragonHDLatestNeural | zh-CN-Xiaochen | zh-CN |
+| Neural voice persona | HD voices |
+|----------------------|-----------|
+| de-DE-Seraphina | de-DE-Seraphina:DragonHDLatestNeural<br/>de-DE-Seraphina:DragonHDv1Neural |
+| en-US-Andrew | en-US-Andrew:DragonHDLatestNeural<br/>en-US-Andrew:DragonHDv1Neural |
+| en-US-Andrew2 | en-US-Andrew2:DragonHDLatestNeural<br/>en-US-Andrew2:DragonHDv1Neural |
+| en-US-Aria | en-US-Aria:DragonHDLatestNeural<br/>en-US-Aria:DragonHDv1Neural |
+| en-US-Ava | en-US-Ava:DragonHDLatestNeural<br/>en-US-Ava:DragonHDv1Neural |
+| en-US-Davis | en-US-Davis:DragonHDLatestNeural<br/>en-US-Davis:DragonHDv1Neural |
+| en-US-Emma | en-US-Emma:DragonHDLatestNeural<br/>en-US-Emma:DragonHDv1Neural |
+| en-US-Emma2 | en-US-Emma2:DragonHDLatestNeural<br/>en-US-Emma2:DragonHDv1Neural |
+| en-US-Jenny | en-US-Jenny:DragonHDLatestNeural<br/>en-US-Jenny:DragonHDv1Neural |
+| en-US-Steffan | en-US-Steffan:DragonHDLatestNeural<br/>en-US-Steffan:DragonHDv1Neural |
+| ja-JP-Masaru | ja-JP-Masaru:DragonHDLatestNeural<br/>ja-JP-Masaru:DragonHDv1Neural |
+| zh-CN-Xiaochen | zh-CN-Xiaochen:DragonHDLatestNeural<br/>zh-CN-Xiaochen:DragonHDv1Neural |
 
 
 ## How to use Azure AI Speech HD voices
@@ -79,14 +87,10 @@ Here are some key points to consider when using Azure AI Speech HD voices:
 - **Voice locale**: The locale in the voice name indicates its original language and region.
 - **Base models**:
   - HD voices come with a base model that understands the input text and predicts the speaking pattern accordingly. You can specify the desired model (such as DragonHDLatestNeural) according to the availability of each voice.
-- **SSML usage**: To reference a voice in SSML, use the format `voicename:basemodel`. The name before the colon, such as `en-US-Andrew`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
+- **SSML usage**: To reference a voice in SSML, use the format `voicename:basemodel:version`. The name before the colon, such as `de-DE-Seraphina`, is the voice persona name and its original locale. The base model is tracked by versions in subsequent updates.
 - **Temperature parameter**:
-  - The temperature value is a float ranging from 0 to 1, influencing the randomness of the output.
-  - You can also adjust the temperature parameter to control the variation of outputs.
-  - **Lower temperature**: Results in less randomness, leading to more predictable outputs.
-  - **Higher temperature**: Increases randomness, allowing for more diverse outputs.
-  - The default temperature is set at 1.0.
-  - Less randomness yields more stable results, while more randomness offers variety but less consistency.
+  - The temperature value is a float ranging from 0 to 1 that controls the randomness, and therefore the variation, of the output. The default temperature is 1.0.
+  - A lower temperature reduces randomness, which yields more stable and predictable outputs. A higher temperature increases randomness, which yields more diverse but less consistent outputs.
 
 Here's an example of how to use Azure AI Speech HD voices in SSML:
 
diff --git a/articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md b/articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md
@@ -2,11 +2,17 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 9/30/2024
+ms.date: 10/9/2024
 ms.author: eur
 ms.custom: references_regions
 ---
 
+### October 2024 release
+
+#### Prebuilt high definition (HD) neural voice
+
+Azure AI Speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real time to match the sentiment. HD voices maintain a consistent voice persona with their neural (non-HD) counterparts, and deliver even more value through enhanced features. For more information, see [What are Azure AI Speech high definition (HD) voices?](../../high-definition-voices.md).
+
 ### September 2024 release
 
 #### Prebuilt neural voice
diff --git a/articles/ai-services/speech-service/releasenotes.md b/articles/ai-services/speech-service/releasenotes.md
@@ -7,7 +7,7 @@ author: eric-urban
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: release-notes
-ms.date: 9/30/2024
+ms.date: 10/9/2024
 ms.custom: references_regions
 # Customer intent: As a developer, I want to learn about new releases and features for Azure AI Speech.
 ---
@@ -18,9 +18,9 @@ Azure AI Speech is updated on an ongoing basis. To stay up-to-date with recent d
 
 ## Recent highlights
 
-* Fast transcription is now available in public preview. Fast transcription allows you to transcribe audio file to text accurately and synchronously, and supports diarization to recognize and separate multiple speakers on mono channel audio. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
+* Azure AI Speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real time to match the sentiment. For more information, see [What are Azure AI Speech high definition (HD) voices?](./high-definition-voices.md).
+* Fast transcription is now available in public preview. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
 * Video translation is now available in the Azure AI Speech service. For more information, see [What is video translation?](./video-translation-overview.md).
-* Personal voice is now generally available. For more information, see [What is personal voice?](./personal-voice-overview.md).
 * The Azure AI Speech service supports OpenAI text to speech voices. For more information, see [What are OpenAI text to speech voices?](./openai-voices.md). 
 * The custom voice API is available for creating and managing [professional](./professional-voice-create-project.md) and [personal](./personal-voice-create-project.md) custom neural voice models.
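
The HD voice value naming introduced in the `high-definition-voices.md` change above can be illustrated with a minimal Python sketch. It is not part of the commit or of any Azure SDK. It assumes that values look like the ones listed in the table, that is, `persona:DragonHD<version>` with the base model and version joined without a second colon, and the helper names `build_hd_voice` and `parse_hd_voice` are hypothetical.

```python
from typing import NamedTuple

# The only base model named in the documentation change.
BASE_MODEL = "DragonHD"


class HDVoice(NamedTuple):
    persona: str     # e.g. "en-US-Ava" (voice persona name and original locale)
    base_model: str  # e.g. "DragonHD"
    version: str     # e.g. "LatestNeural" or "v1Neural"


def build_hd_voice(persona: str, version: str = "LatestNeural") -> str:
    """Compose an HD voice value such as 'en-US-Ava:DragonHDLatestNeural'.

    Assumption: base model and version are concatenated, as in the table.
    """
    return f"{persona}:{BASE_MODEL}{version}"


def parse_hd_voice(value: str) -> HDVoice:
    """Split an HD voice value into persona, base model, and version."""
    persona, sep, model_and_version = value.partition(":")
    if not sep or not persona or not model_and_version.startswith(BASE_MODEL):
        raise ValueError(f"not a recognized HD voice value: {value!r}")
    version = model_and_version[len(BASE_MODEL):]
    if not version:
        raise ValueError(f"missing version in HD voice value: {value!r}")
    return HDVoice(persona, BASE_MODEL, version)


if __name__ == "__main__":
    print(build_hd_voice("en-US-Ava"))              # floating "latest" value
    print(build_hd_voice("en-US-Ava", "v1Neural"))  # pinned version
    print(parse_hd_voice("ja-JP-Masaru:DragonHDv1Neural"))
```

Defaulting to `LatestNeural` mirrors the guidance in the change: callers pick up new base model versions without a code change, while passing `v1Neural` pins the version explicitly.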