Merge pull request #217676 from eric-urban/eur/tts-cnv-updates

v-regandowner · web-flow · commit 1e554da6a7c0 · 2022-11-10T10:51:28.000-05:00
promote learn more about voice styles
diff --git a/articles/cognitive-services/Speech-Service/how-to-custom-voice-create-voice.md b/articles/cognitive-services/Speech-Service/how-to-custom-voice-create-voice.md
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: how-to
-ms.date: 10/27/2022
+ms.date: 11/10/2022
 ms.author: eur
 ms.custom: references_regions
 ---
@@ -49,7 +49,7 @@ To create a custom neural voice in Speech Studio, follow these steps for one of
 1. Select the data that you want to use for training. Duplicate audio names will be removed from the training. Make sure the data you select don't contain the same audio names across multiple .zip files. Only successfully processed datasets can be selected for training. Check your data processing status if you do not see your training set in the list.
 1. Select a speaker file with the voice talent statement that corresponds to the speaker in your training data.
 1. Select **Next**.
-1. Optionally, you can check the box next to **Add my own test script** and select test scripts to upload. Each training generates 100 sample audio files automatically, to help you test the model with a default script. You can also provide your own test script with up to 100 utterances. The generated audio files are a combination of the automatic test scripts and custom test scripts. For more information, see [test script requirements](#test-script-requirements).
+1. Optionally, you can check the box next to **Add my own test script** and select test scripts to upload. Each training generates 100 sample audio files automatically, to help you test the model with a default script. You can also provide your own test script with up to 100 utterances for the default style. The generated audio files are a combination of the automatic test scripts and custom test scripts. For more information, see [test script requirements](#test-script-requirements).
 1. Enter a **Name** and **Description** to help you identify the model. Choose a name carefully. The model name will be used as the voice name in your [speech synthesis request](how-to-deploy-and-use-endpoint.md#use-your-custom-voice) via the SDK and SSML input. Only letters, numbers, and a few punctuation characters are allowed. Use different names for different neural voice models.
 1. Optionally, enter the **Description** to help you identify the model. A common use of the description  is to record the names of the data that you used to create the model.
 1. Select **Next**.
@@ -82,7 +82,9 @@ To create a custom neural voice in Speech Studio, follow these steps for one of
 1. Select one or more preset speaking styles to train. 
 1. Select the data that you want to use for training. Duplicate audio names will be removed from the training. Make sure the data you select don't contain the same audio names across multiple .zip files. Only successfully processed datasets can be selected for training. Check your data processing status if you do not see your training set in the list.
 1. Select **Next**.
-1. Optionally, you can add up to 10 custom speaking styles. Select **Add a custom style** and enter a custom style name of your choice. Select style samples as training data. 
+1. Optionally, you can add up to 10 custom speaking styles:
+    1. Select **Add a custom style** and enter a custom style name of your choice. Choose the name carefully: your application uses it as the value of the `style` attribute of the `mstts:express-as` element in [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md#adjust-speaking-styles). You can also use the custom style name in SSML via the [Audio Content Creation](how-to-audio-content-creation.md) tool in [Speech Studio](https://speech.microsoft.com/portal/audiocontentcreation).
+    1. Select style samples as training data.
 1. Select **Next**.
 1. Select a speaker file with the voice talent statement that corresponds to the speaker in your training data.
 1. Select **Next**.
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/cli.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/cli.md
@@ -27,7 +27,7 @@ ms.author: eur
 Run the following command for speech synthesis to the default speaker output. You can modify the text to be synthesized and the voice.
 
 ```console
- spx synthesize --text "I'm excited to try text to speech" --voice "en-US-JennyNeural"
+spx synthesize --text "I'm excited to try text to speech" --voice "en-US-JennyNeural"
 ```
 
 > [!div class="nextstepaction"]
@@ -37,7 +37,19 @@ Run the following command for speech synthesis to the default speaker output. Yo
 > There is a known issue on Windows 11 that might affect some types of Secure Sockets Layer (SSL) and Transport Layer Security (TLS) connections. For more information, see the [troubleshooting guide](/azure/cognitive-services/speech-service/troubleshooting#connection-closed-or-timeout).
 
 If you don't set a voice name, the default voice for `en-US` will speak. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is "I'm excited to try text to speech" and you set `--voice "es-ES-ElviraNeural"`, the text is spoken in English with a Spanish accent. If the voice does not speak the language of the input text, the Speech service won't output synthesized audio.
-            
+
+## Remarks
+
+Now that you've completed the quickstart, here are some additional considerations:
+
+You can have finer control over voice styles, prosody, and other settings by using [Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+
+In the following example, the voice and the `excited` style are provided in the SSML block.
+
+```console
+spx synthesize --ssml "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xmlns:mstts='https://www.w3.org/2001/mstts' xml:lang='en-US'><voice name='en-US-JennyNeural'><mstts:express-as style='excited'>I'm excited to try text to speech</mstts:express-as></voice></speak>"
+```
+
 Run this command for information about additional speech synthesis options such as file input and output:
 ```console
 spx help synthesize
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/cpp.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/cpp.md
@@ -138,7 +138,7 @@ I'm excited to try text to speech
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/csharp.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/csharp.md
@@ -128,7 +128,7 @@ I'm excited to try text to speech
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/java.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/java.md
@@ -154,7 +154,7 @@ I'm excited to try text to speech
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/javascript.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/javascript.md
@@ -119,7 +119,7 @@ synthesis finished.
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/objectivec.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/objectivec.md
@@ -93,7 +93,7 @@ After you input some text and select the button in the app, you should hear the
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `SpeakText` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/python.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/python.md
@@ -101,7 +101,7 @@ I'm excited to try text to speech
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `speak_text_async` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/swift.md b/articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/swift.md
@@ -141,7 +141,7 @@ After you input some text and select the button in the app, you should hear the
 Now that you've completed the quickstart, here are some additional considerations:
 
 This quickstart uses the `SpeakText` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
-- For information about speech synthesis from a file, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
+- For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
 - For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md). 
 
 ## Clean up resources
diff --git a/articles/cognitive-services/Speech-Service/speech-synthesis-markup.md b/articles/cognitive-services/Speech-Service/speech-synthesis-markup.md
@@ -125,15 +125,17 @@ Styles, style degree, and roles are supported for a subset of neural voices. If
 
 | Attribute | Description | Required or optional |
 | ---------- | ---------- | -------------------- |
-| `style` | Specifies the speaking style. Speaking styles are voice specific. | Required if adjusting the speaking style for a neural voice. If you're using `mstts:express-as`, the style must be provided. If an invalid value is provided, this element is ignored.       |
+| `style` | Specifies the [prebuilt](language-support.md?tabs=stt-tts#voice-styles-and-roles) or [custom](how-to-custom-voice-create-voice.md?tabs=multistyle#train-your-custom-neural-voice-model) speaking style. Speaking styles are voice specific. | Required if adjusting the speaking style for a neural voice. If you're using `mstts:express-as`, the style must be provided. If an invalid value is provided, this element is ignored.|
 | `styledegree` | Specifies the intensity of the speaking style. **Accepted values**: 0.01 to 2 inclusive. The default value is 1, which means the predefined style intensity. The minimum unit is 0.01, which results in a slight tendency for the target style. A value of 2 results in a doubling of the default style intensity. | Optional. If you don't set the `style` attribute, the `styledegree` attribute is ignored. Speaking style degree adjustments are supported for Chinese (Mandarin, Simplified) neural voices.|
 | `role`| Specifies the speaking role-play. The voice acts as a different age and gender, but the voice name isn't changed. | Optional. Role adjustments are supported for these Chinese (Mandarin, Simplified) neural voices: `zh-CN-XiaomoNeural`, `zh-CN-XiaoxuanNeural`, `zh-CN-YunxiNeural`, and `zh-CN-YunyeNeural`. |
 
 ### Style
 
 You use the `mstts:express-as` element to express emotions like cheerfulness, empathy, and calm. You can also optimize the voice for different scenarios like customer service, newscast, and voice assistant.
 
-For a list of supported styles per neural voice, see [supported voice styles and roles](language-support.md?tabs=stt-tts#voice-styles-and-roles).
+For a list of supported styles for prebuilt neural voices, see [supported voice styles and roles](language-support.md?tabs=stt-tts#voice-styles-and-roles).
+
+To use your [custom style](how-to-custom-voice-create-voice.md?tabs=multistyle#train-your-custom-neural-voice-model), specify the style name that you entered in Speech Studio.
 
 **Syntax**
 
@@ -928,7 +930,7 @@ All elements from the [MathML 2.0](https://www.w3.org/TR/MathML2/) and [MathML 3
 > [!NOTE]
 > If an element is not recognized, it will be ignored, and the child elements within it will still be processed.
 
-The MathML entities are not supported by XML syntax, so you must use the their corresponding [unicode characters](https://www.w3.org/2003/entities/2007/htmlmathml.json) to represent the entities, for example, the entity `&copy;` should be represented by its unicode characters `&#x00A9;`, otherwise an error will occur.
+The MathML entities are not supported by XML syntax, so you must use the corresponding [Unicode characters](https://www.w3.org/2003/entities/2007/htmlmathml.json) to represent the entities. For example, the entity `&copy;` should be represented by its Unicode character `&#x00A9;`; otherwise, an error will occur.
 
 ## Viseme element
 
diff --git a/articles/cognitive-services/Speech-Service/toc.yml b/articles/cognitive-services/Speech-Service/toc.yml
@@ -134,12 +134,14 @@ items:
       items:
         - name: What is Custom Neural Voice?
           href: custom-neural-voice.md
+          displayName: cnv
         - name: Custom Neural Voice Lite
           href: custom-neural-voice-lite.md
+          displayName: cnv
         - name: Create a Custom Neural Voice 
           items:
           - name: Create a project
-            displayName: 'custom voice, neural voice'
+            displayName: custom voice, neural voice, cnv
             href: how-to-custom-voice.md
           - name: Set up voice talent
             href: how-to-custom-voice-talent.md
@@ -154,7 +156,8 @@ items:
         - name: How to record voice samples
           href: record-custom-voice-samples.md
     - name: Audio Content Creation
-      href: how-to-audio-content-creation.md 
+      href: how-to-audio-content-creation.md
+      displayName: acc
 - name: Speech translation
   items:
     - name: Speech translation overview