articles/cognitive-services/Speech-Service/speech-synthesis-markup-voice.md
Lines changed: 15 additions & 27 deletions
@@ -24,11 +24,14 @@ At least one `voice` element must be specified within each SSML [speak](speech-s
24
24
25
25
You can include multiple `voice` elements in a single SSML document. Each `voice` element can specify a different voice. You can also use the same voice multiple times with different settings, such as when you [change the silence duration](speech-synthesis-markup-structure.md#add-silence) between sentences.
26
26
27
+
The `effect` attribute of the `voice` element is an audio effect processor. Use it to enhance the auditory quality of synthesized speech played back on various devices. In practice, the listener's auditory experience can be degraded by playback distortion that varies with the device and the scenario. For example, synthesized speech from a car speaker might sound dull and muffled because of environmental factors such as speaker response, room reverberation, and background noise. The driver usually has to turn up the volume to hear more clearly. In such a case, the `effect` processor can make the sound clearer by compensating for the playback distortion, without any manual operation.
28
+
27
29
The `voice` element's attributes are described in the following table.
28
30
29
31
| Attribute | Description | Required or optional |
30
32
| ---------- | ---------- | ---------- |
31
33
|`name`| The voice used for text-to-speech output. For a complete list of supported prebuilt voices, see [Language support](language-support.md?tabs=stt-tts).| Required|
34
+
|`effect`|A voice-specific effect processor. Choose the value that matches your scenario. The following values are supported:<br/><ul><li>`eq_car` – Optimize the auditory experience when providing high-fidelity speech in car scenarios, such as small cars, buses, and other enclosed small or medium vehicles.</li><li>`eq_telecomhp8k` – Optimize the auditory experience in telecom or telephone scenarios. This value is designed only for narrowband speech (sampling rate of 8 kHz). If the sample rate of the output speech isn't 8 kHz, the auditory quality of the output speech isn't guaranteed even with this attribute. We recommend that you convert the sample rate of the output speech to 8 kHz to get a better result with this attribute in telecom scenarios.</li></ul><br/>If the value is missing or invalid, the `effect` attribute is ignored and the service uses the default neutral speech.| Optional |
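
As a rough illustration of the `eq_telecomhp8k` value described in the table, the following sketch sets the attribute on a `voice` element. The voice name `en-US-JennyNeural` and the spoken text are assumptions chosen for illustration and don't come from this change. The output sample rate isn't set in SSML, so this sketch assumes it's configured separately as 8 kHz.

```xml
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
    <!-- Assumption: en-US-JennyNeural is only a placeholder voice name. -->
    <!-- Assumption: the output format is configured elsewhere as 8 kHz narrowband. -->
    <voice name="en-US-JennyNeural" effect="eq_telecomhp8k">
        Thank you for calling. Your call is important to us.
    </voice>
</speak>
```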
32
35
33
36
### Voice examples
34
37
@@ -77,6 +80,18 @@ This example uses a custom voice named "my-custom-voice".
77
80
</speak>
78
81
```
79
82
83
+
#### Audio effect example
84
+
85
+
You can use the `effect` attribute to optimize the auditory experience for different voices. The following SSML example uses the `effect` attribute configured for car scenarios.
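
The example lines themselves aren't visible in this view of the change, so what follows is a minimal sketch of what such a document might look like, not necessarily the text that was committed. It sets `effect="eq_car"` on the `voice` element. The voice name `en-US-JennyNeural` and the spoken text are assumptions for illustration.

```xml
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
    <!-- Assumption: en-US-JennyNeural is only a placeholder voice name. -->
    <voice name="en-US-JennyNeural" effect="eq_car">
        This is the text that is spoken.
    </voice>
</speak>
```

Because `effect` is optional, removing the attribute from this sketch would leave the document valid, and the service would use the default neutral speech.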
By default, neural voices have a neutral speaking style. You can adjust the speaking style, style degree, and role at the sentence level.
@@ -441,33 +456,6 @@ The supported values for attributes of the `mstts:backgroundaudio` element were
441
456
</speak>
442
457
```
443
458
444
-
## Audio effect
445
-
446
-
The `effect` element is an effect processor that is used to enhance the auditory quality of synthesized speech played back on various devices. In practice, the listener's auditory experience can be degraded by playback distortion that varies with the device and the scenario. For example, synthesized speech from a car speaker might sound dull and muffled because of environmental factors such as speaker response, room reverberation, and background noise. The driver usually has to turn up the volume to hear more clearly. In such a case, the `effect` processor can make the sound clearer by compensating for the playback distortion, without any manual operation.
447
-
448
-
You can configure the `effect` element's attributes within the `voice` element to optimize the auditory experience of synthesized speech. For example, if you are in a car scenario, you can use the value `eq_car` to make the synthesized speech from the car speaker clearer.
449
-
450
-
451
-
The `effect` element's attributes are described in the following table.
452
-
453
-
| Attribute | Description | Required or optional |
454
-
| ---------- | ---------- | ---------- |
455
-
|`effect`|A voice-specific effect processor. Choose the value that matches your scenario. The following values are supported:<br/><ul><li>`eq_car` – Optimize the auditory experience when providing high-fidelity speech in car scenarios, such as small cars, buses, and other enclosed small or medium vehicles.</li><li>`eq_telecomhp8k` – Optimize the auditory experience in telecom or telephone scenarios. This value is designed only for narrowband speech (sampling rate of 8 kHz). If the sample rate of the output speech isn't 8 kHz, the auditory quality of the output speech isn't guaranteed even with this attribute. We recommend that you convert the sample rate of the output speech to 8 kHz to get a better result with this attribute in telecom scenarios.</li></ul><br/>If the value is missing or invalid, the `effect` element is ignored and the service uses the default neutral speech.| Required |
456
-
457
-
### Audio effect examples
458
-
459
-
The supported values for attributes of the `effect` element were [described previously](#audio-effect).
460
-
461
-
You use the `effect` element within the `voice` element to optimize the auditory experience for different voices. The following SSML example uses the `effect` element with the configuration in car scenarios.