Commit a5e7228

Update speech-synthesis-markup.md
1 parent dd0f188 commit a5e7228

1 file changed

+14 -14 lines changed

articles/cognitive-services/Speech-Service/speech-synthesis-markup.md

Lines changed: 14 additions & 14 deletions
@@ -53,7 +53,7 @@ The `speak` element is the root element. It's *required* for all SSML documents.

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `version` | Indicates the version of the SSML specification used to interpret the document markup. The current version is 1.0. | Required |
| `xml:lang` | Specifies the language of the root document. The value can contain a lowercase, two-letter language code, for example, `en`. Or the value can contain the language code and uppercase country/region, for example, `en-US`. | Required |
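
The `speak` attributes in the table above can be illustrated with a minimal Python sketch. It is not taken from the changed file: the helper name `build_speak`, the placeholder voice name `en-US-JennyNeural`, and the simplified `xml:lang` pattern (only the two forms the table describes) are assumptions made for illustration. The sketch only builds the markup and checks that it is well-formed XML; it doesn't call the Speech service.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Assumption: accept only the two forms named in the table, "en" or "en-US".
LANG_PATTERN = re.compile(r"^[a-z]{2}(-[A-Z]{2})?$")


def build_speak(text, lang="en-US", voice="en-US-JennyNeural"):
    """Return an SSML document with the required `version` and `xml:lang`."""
    if not LANG_PATTERN.match(lang):
        raise ValueError(f"unexpected xml:lang value: {lang!r}")
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    ssml = build_speak("Hello & welcome")
    root = ET.fromstring(ssml)  # raises ParseError if the markup is malformed
    print(root.get("version"), root.get(f"{{{XML_NS}}}lang"))
```

Escaping the text with `escape` keeps characters such as `&` from breaking the document, which matters because the whole SSML payload must parse as XML.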
@@ -73,7 +73,7 @@ The `voice` element is required. It's used to specify the voice that's used for

**Attribute**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `name` | Identifies the voice used for text-to-speech output. For a complete list of supported voices, see [Language support](language-support.md#text-to-speech). | Required |

@@ -96,7 +96,7 @@ Within the `speak` element, you can specify multiple voices for text-to-speech o

**Attribute**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `name` | Identifies the voice used for text-to-speech output. For a complete list of supported voices, see [Language support](language-support.md#text-to-speech). | Required |

@@ -123,7 +123,7 @@ Styles, style degree, and roles are supported for a subset of neural voices. If
- The [Voice List API](rest-text-to-speech.md#get-a-list-of-voices).
- The code-free [Audio Content Creation](https://aka.ms/audiocontentcreation) portal.

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `style` | Specifies the speaking style. Speaking styles are voice specific. | Required if adjusting the speaking style for a neural voice. If you're using `mstts:express-as`, the style must be provided. If an invalid value is provided, this element is ignored. |
| `styledegree` | Specifies the intensity of the speaking style. **Accepted values**: 0.01 to 2 inclusive. The default value is 1, which means the predefined style intensity. The minimum unit is 0.01, which results in a slight tendency for the target style. A value of 2 results in a doubling of the default style intensity. | Optional. If you don't set the `style` attribute, the `styledegree` attribute is ignored. Speaking style degree adjustments are supported for Chinese (Mandarin, Simplified) neural voices.|
@@ -271,7 +271,7 @@ Speaking language adjustments are only supported for the `en-US-JennyMultilingua

**Attribute**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `lang` | Specifies the speaking languages. Speaking different languages are voice specific. | Required if adjusting the speaking language for a neural voice. If you're using `lang xml:lang`, the locale must be provided. |

@@ -329,7 +329,7 @@ Use the `break` element to insert pauses or breaks between words. You can also u

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `strength` | Specifies the relative duration of a pause by using one of the following values:<ul><li>none</li><li>x-weak</li><li>weak</li><li>medium (default)</li><li>strong</li><li>x-strong</li></ul> | Optional |
| `time` | Specifies the absolute duration of a pause in seconds or milliseconds (ms). This value should be set less than 5,000 ms. Examples of valid values are `2s` and `500ms`. | Optional |
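
As a rough illustration of the `break` attributes listed above, the following Python sketch validates `strength` and `time` before emitting the element. The helper names `duration_ms` and `break_tag` are hypothetical, and treating "less than 5,000 ms" as a hard limit is an assumption: the table only says the value "should be" below it.

```python
import re
from xml.sax.saxutils import quoteattr

# Values copied from the `strength` row of the table.
STRENGTHS = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}

# Matches durations such as "2s" or "500ms".
TIME_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(ms|s)$")


def duration_ms(value):
    """Convert a duration string such as "2s" or "500ms" to milliseconds."""
    match = TIME_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    number, unit = match.groups()
    return float(number) * (1000 if unit == "s" else 1)


def break_tag(strength=None, time=None):
    """Build a <break/> element from the optional attributes."""
    attrs = []
    if strength is not None:
        if strength not in STRENGTHS:
            raise ValueError(f"unknown strength: {strength!r}")
        attrs.append(f"strength={quoteattr(strength)}")
    if time is not None:
        # Assumption: reject anything at or above the 5,000 ms guideline.
        if duration_ms(time) >= 5000:
            raise ValueError("time should be less than 5,000 ms")
        attrs.append(f"time={quoteattr(time)}")
    return "<break " + " ".join(attrs) + "/>"


if __name__ == "__main__":
    print(break_tag(time="500ms"))
    print(break_tag(strength="weak", time="2s"))
```

Both attributes are optional, so calling `break_tag()` with no arguments still yields a valid empty `break` element that falls back to the default strength.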
@@ -369,7 +369,7 @@ Use the `mstts:silence` element to insert pauses before or after text, or betwee

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `type` | Specifies the location of silence to be added: <ul><li>`Leading` – At the beginning of text </li><li>`Tailing` – At the end of text </li><li>`Sentenceboundary` – Between adjacent sentences </li></ul> | Required |
| `Value` | Specifies the absolute duration of a pause in seconds or milliseconds. This value should be set less than 5,000 ms. Examples of valid values are `2s` and `500ms`. | Required |
@@ -437,7 +437,7 @@ Phonetic alphabets are composed of phones, which are made up of letters, numbers

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `alphabet` | Specifies the phonetic alphabet to use when you synthesize the pronunciation of the string in the `ph` attribute. The string that specifies the alphabet must be specified in lowercase letters. The following options are the possible alphabets that you can specify:<ul><li>`ipa` &ndash; [International Phonetic Alphabet (IPA)](speech-ssml-phonetic-sets.md#speech-service-phonetic-alphabet)</li><li>`sapi` &ndash; [Speech service phonetic alphabet ](speech-ssml-phonetic-sets.md#speech-service-phonetic-alphabet)</li><li>`ups` &ndash; [Universal Phone Set](https://documentation.help/Microsoft-Speech-Platform-SDK-11/17509a49-cae7-41f5-b61d-07beaae872ea.htm)</li></ul><br>The alphabet applies only to the `phoneme` in the element.| Optional |
| `ph` | A string containing phones that specify the pronunciation of the word in the `phoneme` element. If the specified string contains unrecognized phones, text-to-speech rejects the entire SSML document and produces none of the speech output specified in the document. | Required if using phonemes |
@@ -486,7 +486,7 @@ The custom lexicon currently supports UTF-8 encoding.

**Attribute**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------------------------------------|---------------------|
| `uri` | The address of the external PLS document | Required |

@@ -625,7 +625,7 @@ Because prosodic attribute values can vary over a wide range, the speech recogni

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `pitch` | Indicates the baseline pitch for the text. You can express the pitch as:<ul><li>An absolute value, expressed as a number followed by "Hz" (Hertz). For example, `<prosody pitch="600Hz">some text</prosody>`.</li><li>A relative value, expressed as a number preceded by "+" or "-" and followed by "Hz" or "st" that specifies an amount to change the pitch. For example: `<prosody pitch="+80Hz">some text</prosody>` or `<prosody pitch="-2st">some text</prosody>`. The "st" indicates the change unit is semitone, which is half of a tone (a half step) on the standard diatonic scale.</li><li>A constant value:<ul><li>x-low</li><li>low</li><li>medium</li><li>high</li><li>x-high</li><li>default</li></ul></li></ul> | Optional |
| `contour` |Contour now supports neural voice. Contour represents changes in pitch. These changes are represented as an array of targets at specified time positions in the speech output. Each target is defined by sets of parameter pairs. For example: <br/><br/>`<prosody contour="(0%,+20Hz) (10%,-2st) (40%,+10Hz)">`<br/><br/>The first value in each set of parameters specifies the location of the pitch change as a percentage of the duration of the text. The second value specifies the amount to raise or lower the pitch by using a relative value or an enumeration value for pitch (see `pitch`). | Optional |
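
The `pitch` and `contour` formats described in the two rows above can be sketched as a small Python parser. The function names `classify_pitch` and `parse_contour` are hypothetical, and the regular expressions cover only the forms these rows mention; the service may accept other forms that aren't shown in this hunk.

```python
import re

# Constant values copied from the `pitch` row.
PITCH_CONSTANTS = {"x-low", "low", "medium", "high", "x-high", "default"}

ABSOLUTE = re.compile(r"^\d+(?:\.\d+)?Hz$")              # e.g. "600Hz"
RELATIVE = re.compile(r"^[+-]\d+(?:\.\d+)?(?:Hz|st)$")   # e.g. "+80Hz", "-2st"

# One contour target: "(position%,change)".
TARGET = re.compile(r"\(\s*(\d+(?:\.\d+)?)%\s*,\s*([^)\s]+)\s*\)")


def classify_pitch(value):
    """Return "constant", "absolute", or "relative" for a pitch value."""
    if value in PITCH_CONSTANTS:
        return "constant"
    if ABSOLUTE.match(value):
        return "absolute"
    if RELATIVE.match(value):
        return "relative"
    raise ValueError(f"unrecognized pitch value: {value!r}")


def parse_contour(contour):
    """Split a contour string into (position_percent, pitch_change) pairs."""
    targets = []
    for position, change in TARGET.findall(contour):
        # The `contour` row allows only relative or enumeration values here.
        if classify_pitch(change) == "absolute":
            raise ValueError(f"contour target must not be absolute: {change!r}")
        targets.append((float(position), change))
    return targets


if __name__ == "__main__":
    print(parse_contour("(0%,+20Hz) (10%,-2st) (40%,+10Hz)"))
```

Each tuple pairs the position within the text's duration with the pitch change to apply at that point, mirroring the "parameter pairs" wording in the `contour` description.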
@@ -707,7 +707,7 @@ The `say-as` element is optional. It indicates the content type, such as number

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `interpret-as` | Indicates the content type of an element's text. For a list of types, see the following table. | Required |
| `format` | Provides additional information about the precise formatting of the element's text for content types that might have ambiguous formats. SSML defines formats for content types that use them. See the following table. | Optional |
@@ -766,7 +766,7 @@ Any audio included in the SSML document must meet these requirements:

**Attribute**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-----------------------------------------------|------------------------------------------------------------|
| `src` | Specifies the location/URL of the audio file. | Required if using the audio element in your SSML document. |

@@ -802,7 +802,7 @@ Only one background audio file is allowed per SSML document. You can intersperse

**Attributes**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-------------|---------------------|
| `src` | Specifies the location/URL of the background audio file. | Required if using background audio in your SSML document |
| `volume` | Specifies the volume of the background audio file. **Accepted values**: `0` to `100` inclusive. The default value is `1`. | Optional |
@@ -832,7 +832,7 @@ You can use the `bookmark` element to insert custom markers in SSML to get the o

**Attribute**

-| Attribute | Description | Required/Optional |
+| Attribute | Description | Required or optional |
|-----------|-----------------------------------------------|------------------------------------------------------------|
| `mark` | Specifies the reference text of the `bookmark` element. | Required |
