Commit d01124a

Merge pull request #107138 from IEvangelist/speechAriaForJessa
Updating Jessa, to Aria
2 parents 2949695 + fe819d8 commit d01124a

3 files changed: +29 -24 lines changed

articles/cognitive-services/Speech-Service/language-support.md

Lines changed: 11 additions & 6 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: conceptual
-ms.date: 03/09/2020
+ms.date: 03/11/2020
 ms.author: dapine
 ms.custom: seodec18
 ---
@@ -90,15 +90,18 @@ For more information about regional availability, see [regions](regions.md#stand
 | Locale | Language | Gender | Full service name mapping | Short voice name |
 |---------|---------------------|--------|-------------------------------------------------------------------------|-------------------------|
 | `de-DE` | German (Germany) | Female | "Microsoft Server Speech Text to Speech Voice (de-DE, KatjaNeural)" | "de-DE-KatjaNeural" |
-| `en-US` | English (US) | Female | "Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural)" | "en-US-JessaNeural" |
+| `en-US` | English (US) | Female | "Microsoft Server Speech Text to Speech Voice (en-US, AriaNeural)" | "en-US-AriaNeural" |
 | `en-US` | English (US) | Male | "Microsoft Server Speech Text to Speech Voice (en-US, GuyNeural)" | "en-US-GuyNeural" |
 | `it-IT` | Italian (Italy) | Female | "Microsoft Server Speech Text to Speech Voice (it-IT, ElsaNeural)" | "it-IT-ElsaNeural" |
 | `pt-BR` | Portuguese (Brazil) | Female | "Microsoft Server Speech Text to Speech Voice (pt-BR, FranciscaNeural)" | "pt-BR-FranciscaNeural" |
 | `zh-CN` | Chinese (Mainland) | Female | "Microsoft Server Speech Text to Speech Voice (zh-CN, XiaoxiaoNeural)" | "zh-CN-XiaoxiaoNeural" |
 
+> [!IMPORTANT]
+> The `en-US-JessaNeural` voice has changed to `en-US-AriaNeural`. If you were using "Jessa" before, convert over to "Aria".
+
 To learn how you can configure and adjust neural voices, see [Speech synthesis markup language](speech-synthesis-markup.md#adjust-speaking-styles).
 
-> [!NOTE]
+> [!TIP]
 > You can use either the full service name mapping or the short voice name in your speech synthesis requests.
 
 ### Standard voices
@@ -131,9 +134,8 @@ More than 75 standard voices are available in over 45 languages and locales, whi
 | | | Female | "Microsoft Server Speech Text to Speech Voice (en-IN, PriyaRUS)" | "en-IN-PriyaRUS" |
 | | | Male | "Microsoft Server Speech Text to Speech Voice (en-IN, Ravi, Apollo)" | "en-IN-Ravi-Apollo" |
 | `en-US` | English (US) | Female | "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)" | "en-US-ZiraRUS" |
-| | | Female | "Microsoft Server Speech Text to Speech Voice (en-US, JessaRUS)" | "en-US-JessaRUS" |
+| | | Female | "Microsoft Server Speech Text to Speech Voice (en-US, AriaRUS)" | "en-US-AriaRUS" |
 | | | Male | "Microsoft Server Speech Text to Speech Voice (en-US, BenjaminRUS)" | "en-US-BenjaminRUS" |
-| | | Female | "Microsoft Server Speech Text to Speech Voice (en-US, Jessa24kRUS)" | "en-US-Jessa24kRUS" |
 | | | Male | "Microsoft Server Speech Text to Speech Voice (en-US, Guy24kRUS)" | "en-US-Guy24kRUS" |
 | `es-ES` | Spanish (Spain) | Female | "Microsoft Server Speech Text to Speech Voice (es-ES, Laura, Apollo)" | "es-ES-Laura-Apollo" |
 | | | Female | "Microsoft Server Speech Text to Speech Voice (es-ES, HelenaRUS)" | "es-ES-HelenaRUS" |
@@ -191,7 +193,10 @@ More than 75 standard voices are available in over 45 languages and locales, whi
 
 **1** *ar-EG supports Modern Standard Arabic (MSA).*
 
-> [!NOTE]
+> [!IMPORTANT]
+> The `en-US-Jessa` voice has changed to `en-US-Aria`. If you were using "Jessa" before, convert over to "Aria".
+
+> [!TIP]
 > You can use either the full service name mapping or the short voice name in your speech synthesis requests.
 
 ### Customization
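
The voice tables touched in this file pair each short voice name with a full service name mapping, and the tip says either form is accepted in a synthesis request. A minimal sketch of deriving the full form from the short one, assuming the pattern visible in the rows of this diff (a two-part locale, with any remaining hyphen-separated parts joined by commas) holds for other voices. The helper `full_service_name` is hypothetical and not part of the Speech SDK:

```python
# Prefix shared by every "Full service name mapping" entry in the tables above.
PREFIX = "Microsoft Server Speech Text to Speech Voice"


def full_service_name(short_name: str) -> str:
    """Build the full service name from a short voice name.

    Assumption: the locale is always the first two hyphen-separated parts
    (for example "en-US"), as in every row shown in this diff.
    """
    parts = short_name.split("-")
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"unexpected short voice name: {short_name!r}")
    locale = "-".join(parts[:2])
    # Names such as "en-IN-Ravi-Apollo" become "(en-IN, Ravi, Apollo)".
    voice = ", ".join(parts[2:])
    return f"{PREFIX} ({locale}, {voice})"
```

For example, `full_service_name("en-US-AriaNeural")` yields the string listed in the updated neural voices row.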

articles/cognitive-services/Speech-Service/speech-container-howto.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: conceptual
-ms.date: 03/09/2020
+ms.date: 03/10/2020
 ms.author: dapine
 ---
 

articles/cognitive-services/Speech-Service/speech-synthesis-markup.md

Lines changed: 17 additions & 17 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: conceptual
-ms.date: 03/05/2020
+ms.date: 03/11/2020
 ms.author: dapine
 ---
 

@@ -77,11 +77,11 @@ The `voice` element is required. It is used to specify the voice that is used fo
 **Example**
 
 > [!NOTE]
-> This example uses the `en-US-Jessa24kRUS` voice. For a complete list of supported voices, see [Language support](language-support.md#text-to-speech).
+> This example uses the `en-US-AriaRUS` voice. For a complete list of supported voices, see [Language support](language-support.md#text-to-speech).
 
 ```XML
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 This is the text that is spoken.
 </voice>
 </speak>
@@ -172,11 +172,11 @@ speechConfig!.setPropertyTo(
 
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 Good morning!
 </voice>
 <voice name="en-US-Guy24kRUS">
-Good morning to you too Jessa!
+Good morning to you too Aria!
 </voice>
 </speak>
 ```
@@ -189,7 +189,7 @@ speechConfig!.setPropertyTo(
 By default, the text-to-speech service synthesizes text using a neutral speaking style for both standard and neural voices. With neural voices, you can adjust the speaking style to express cheerfulness, empathy, or sentiment with the `<mstts:express-as>` element. This is an optional element unique to the Speech service.
 
 Currently, speaking style adjustments are supported for these neural voices:
-* `en-US-JessaNeural`
+* `en-US-AriaNeural`
 * `pt-BR-FranciscaNeural`
 * `zh-CN-XiaoxiaoNeural`
 

@@ -211,7 +211,7 @@ Use this table to determine which speaking styles are supported for each neural
 
 | Voice | Type | Description |
 |-------|------|-------------|
-| `en-US-JessaNeural` | `type="cheerful"` | Expresses an emotion that is positive and happy |
+| `en-US-AriaNeural` | `type="cheerful"` | Expresses an emotion that is positive and happy |
 | | `type="empathy"` | Expresses a sense of caring and understanding |
 | | `type="chat"` | Speak in a casual, relaxed tone |
 | | `type="newscast"` | Expresses a formal tone, similar to news broadcasts |
@@ -227,7 +227,7 @@ This SSML snippet illustrates how the `<mstts:express-as>` element is used to ch
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis"
 xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
-<voice name="en-US-JessaNeural">
+<voice name="en-US-AriaNeural">
 <mstts:express-as type="cheerful">
 That'd be just amazing!
 </mstts:express-as>
@@ -270,7 +270,7 @@ Use the `break` element to insert pauses (or breaks) between words, or prevent p
 
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 Welcome to Microsoft Cognitive Services <break time="100ms" /> Text-to-Speech API.
 </voice>
 </speak>
@@ -295,7 +295,7 @@ The `s` element may contain text and the following elements: `audio`, `break`, `
 
 ```XML
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <p>
 <s>Introducing the sentence element.</s>
 <s>Used to mark individual sentences.</s>
@@ -331,15 +331,15 @@ Phonetic alphabets are composed of phones, which are made up of letters, numbers
 
 ```XML
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <s>His name is Mike <phoneme alphabet="ups" ph="JH AU"> Zhou </phoneme></s>
 </voice>
 </speak>
 ```
 
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <phoneme alphabet="ipa" ph="t&#x259;mei&#x325;&#x27E;ou&#x325;"> tomato </phoneme>
 </voice>
 </speak>
@@ -487,7 +487,7 @@ Volume changes can be applied to standard voices at the word or sentence-level.
 
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <prosody volume="+20.00%">
 Welcome to Microsoft Cognitive Services Text-to-Speech API.
 </prosody>
@@ -518,7 +518,7 @@ Pitch changes can be applied to standard voices at the word or sentence-level. W
 
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <prosody contour="(80%,+20%) (90%,+30%)" >
 Good morning.
 </prosody>
@@ -569,7 +569,7 @@ The speech synthesis engine speaks the following example as "Your first request
 
 ```XML
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <p>
 Your <say-as interpret-as="ordinal"> 1st </say-as> request was for <say-as interpret-as="cardinal"> 1 </say-as> room
 on <say-as interpret-as="date" format="mdy"> 10/19/2010 </say-as>, with early arrival at <say-as interpret-as="time" format="hms12"> 12:35pm </say-as>.
@@ -607,7 +607,7 @@ Any audio included in the SSML document must meet these requirements:
 
 ```xml
 <speak version="1.0" xmlns="https://www.w3.org/2001/10/synthesis" xml:lang="en-US">
-<voice name="en-US-Jessa24kRUS">
+<voice name="en-US-AriaRUS">
 <p>
 <audio src="https://contoso.com/opinionprompt.wav"/>
 Thanks for offering your opinion. Please begin speaking after the beep.
@@ -647,7 +647,7 @@ Only one background audio file is allowed per SSML document. However, you can in
 ```xml
 <speak version="1.0" xml:lang="en-US" xmlns:mstts="http://www.w3.org/2001/mstts">
 <mstts:backgroundaudio src="https://contoso.com/sample.wav" volume="0.7" fadein="3000" fadeout="4000"/>
-<voice name="Microsoft Server Speech Text to Speech Voice (en-US, Jessa24kRUS)">
+<voice name="Microsoft Server Speech Text to Speech Voice (en-US, AriaRUS)">
 The text provided in this document will be spoken over the background audio.
 </voice>
 </speak>
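
The rename applied across these SSML examples could be carried over to existing SSML strings with a small script. The sketch below is illustrative only: the helper `migrate_voice_names` and the `VOICE_RENAMES` table are hypothetical, and the mapping of `Jessa24kRUS` to `AriaRUS` is an assumption read off the updated examples in this commit, not a documented guarantee.

```python
import re

# Old -> new voice name suffixes, read off this commit's diff.
# Assumption: Jessa24kRUS maps to AriaRUS, as the updated SSML examples suggest.
VOICE_RENAMES = {
    "JessaNeural": "AriaNeural",
    "JessaRUS": "AriaRUS",
    "Jessa24kRUS": "AriaRUS",
}

# Matches the short form ("en-US-JessaRUS") and the full service name
# form ("(en-US, JessaRUS)"); spoken text such as "Jessa!" is left alone.
_OLD_VOICE = re.compile(r"(en-US(?:-|, ))(JessaNeural|Jessa24kRUS|JessaRUS)\b")


def migrate_voice_names(ssml: str) -> str:
    """Return ssml with retired en-US Jessa voice names replaced by Aria names."""
    return _OLD_VOICE.sub(lambda m: m.group(1) + VOICE_RENAMES[m.group(2)], ssml)
```

Only voice identifiers are rewritten. Spoken text that mentions "Jessa", like the greeting changed by hand in this commit, would still need manual review.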
