 ---
 title: Speech Synthesis Markup Language (SSML) overview - Speech service
 titleSuffix: Azure AI services
-description: Use the Speech Synthesis Markup Language to control pronunciation and prosody in text to speech.
+description: Learn how to use the Speech Synthesis Markup Language to control pronunciation and prosody in text to speech.
 services: cognitive-services
 author: eric-urban
 manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: how-to
-ms.date: 11/30/2022
+ms.date: 8/16/2023
 ms.author: eur
 ---
 
 # Speech Synthesis Markup Language (SSML) overview
 
-Speech Synthesis Markup Language (SSML) is an XML-based markup language that can be used to fine-tune the text to speech output attributes such as pitch, pronunciation, speaking rate, volume, and more. You have more control and flexibility compared to plain text input.
+Speech Synthesis Markup Language (SSML) is an XML-based markup language that you can use to fine-tune your text to speech output attributes such as pitch, pronunciation, speaking rate, volume, and more. It gives you more control and flexibility than plain text input.
 
 > [!TIP]
-> You can hear voices in different styles and pitches reading example text via the [Voice Gallery](https://speech.microsoft.com/portal/voicegallery).
+> You can hear voices in different styles and pitches reading example text by using the [Voice Gallery](https://speech.microsoft.com/portal/voicegallery).
 
-## Scenarios
+## Use case scenarios
 
-You can use SSML to:
+SSML is designed to give you flexibility in how you want your speech output to sound, and it provides different properties for how you can customize that output. You can use SSML to:
 
-- [Define the input text structure](speech-synthesis-markup-structure.md) that determines the structure, content, and other characteristics of the text to speech output. For example, you can use SSML to define a paragraph, a sentence, a break or a pause, or silence. You can wrap text with event tags such as bookmark or viseme that can be processed later by your application.
-- [Choose the voice](speech-synthesis-markup-voice.md), language, name, style, and role. You can use multiple voices in a single SSML document. Adjust the emphasis, speaking rate, pitch, and volume. You can also use SSML to insert pre-recorded audio, such as a sound effect or a musical note.
+- [Define the input text structure](speech-synthesis-markup-structure.md) that determines the structure, content, and other characteristics of your text to speech output. For example, you can use SSML to define a paragraph, a sentence, a break or a pause, or silence. You can wrap text with event tags, like a bookmark or viseme, that your application can process later. A viseme is the visual description of a phoneme, the individual speech sounds, in spoken language.
+- [Choose the voice](speech-synthesis-markup-voice.md), language, name, style, and role. You can use multiple voices in a single SSML document. You can also adjust the emphasis, speaking rate, pitch, and volume. SSML can also insert prerecorded audio, such as a sound effect or a musical note.
 - [Control pronunciation](speech-synthesis-markup-pronunciation.md) of the output audio. For example, you can use SSML with phonemes and a custom lexicon to improve pronunciation. You can also use SSML to define how a word or mathematical expression is pronounced.
 
-## Use SSML
+## Ways to work with SSML
+
+SSML functionality is available in various tools that might fit your use case.
 
 > [!IMPORTANT]
-> You're billed for each character that's converted to speech, including punctuation. Although the SSML document itself is not billable, optional elements that are used to adjust how the text is converted to speech, like phonemes and pitch, are counted as billable characters. For more information, see [text to speech pricing notes](text-to-speech.md#pricing-note).
+> You're billed for each character that's converted to speech, including punctuation. Although the SSML document itself isn't billable, the service counts optional elements that you use to adjust how the text is converted to speech, like phonemes and pitch, as billable characters. For more information, see [Pricing note](text-to-speech.md#pricing-note).
 
 You can use SSML in the following ways:
 
-- [Audio Content Creation](https://aka.ms/audiocontentcreation) tool: Author plain text and SSML in Speech Studio: You can listen to the output audio and adjust the SSML to improve speech synthesis. For more information, see [Speech synthesis with the Audio Content Creation tool](how-to-audio-content-creation.md).
-- [Batch synthesis API](batch-synthesis.md): Provide SSML via the `inputs` property.
-- [Speech CLI](get-started-text-to-speech.md?pivots=programming-language-cli): Provide SSML via the `spx synthesize --ssml SSML` command line argument.
-- [Speech SDK](how-to-speech-synthesis.md#use-ssml-to-customize-speech-characteristics): Provide SSML via the "speak" SSML method.
+- [The Audio Content Creation](https://aka.ms/audiocontentcreation) tool lets you author plain text and SSML in Speech Studio. You can listen to the output audio and adjust the SSML to improve speech synthesis. For more information, see [Speech synthesis with the Audio Content Creation tool](how-to-audio-content-creation.md).
+- [The Batch synthesis API](batch-synthesis.md) accepts SSML via the `inputs` property.
+- [The Speech CLI](get-started-text-to-speech.md?pivots=programming-language-cli) accepts SSML via the `spx synthesize --ssml SSML` command line argument.
+- [The Speech SDK](how-to-speech-synthesis.md#use-ssml-to-customize-speech-characteristics) accepts SSML via the "speak" SSML method across the different supported languages.
 
 ## Next steps
 
 - [SSML document structure and events](speech-synthesis-markup-structure.md)
 - [Voice and sound with SSML](speech-synthesis-markup-voice.md)
 - [Pronunciation with SSML](speech-synthesis-markup-pronunciation.md)
-- [Language support: Voices, locales, languages](language-support.md?tabs=tts)
+- [Language and voice support for the Speech service](language-support.md?tabs=tts)
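
The article in this diff says SSML lets you choose a voice, adjust speaking rate and pitch, and insert a break, but it shows no SSML document. The following is a minimal sketch of what such a document could look like, built with Python's standard library so that the text is XML-escaped correctly. The helper name `build_ssml`, its parameters, and the default voice `en-US-JennyNeural` are illustrative assumptions and are not taken from the article.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespaces for SSML and for the xml:lang attribute.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US",
               rate="medium", pitch="medium", pause_ms=0):
    """Return an SSML string (hypothetical helper, not part of any SDK).

    The voice name is an assumed example; substitute one your resource supports.
    """
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)

    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    voice_el = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice})
    prosody = ET.SubElement(
        voice_el, f"{{{SSML_NS}}}prosody", {"rate": rate, "pitch": pitch}
    )
    prosody.text = text  # ElementTree escapes &, <, and > on output

    # Optionally append a pause after the spoken text.
    if pause_ms > 0:
        ET.SubElement(prosody, f"{{{SSML_NS}}}break", {"time": f"{pause_ms}ms"})

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Tom & Jerry", rate="+10%", pause_ms=500))
```

A string produced this way could then be handed to any of the tools the article lists, for example as the value of the `--ssml` argument of `spx synthesize`. Building the document with an XML library rather than string concatenation is a design choice that keeps characters such as `&` from producing invalid markup.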