You can specify the phonetic pronunciation of words using the Universal Phone Set (UPS) in a [structured text data](how-to-custom-speech-test-and-train.md#structured-text-data-for-training) file. The UPS is a machine-readable phone set that is based on the International Phonetic Alphabet (IPA). The IPA is a standard used by linguists worldwide.
-UPS pronunciations consist of a string of UPS phones, each separated by whitespace. UPS phone labels are all defined using ASCII character strings.
+UPS pronunciations consist of a string of UPS phonemes, each separated by whitespace. UPS phoneme labels are all defined using ASCII character strings.
For steps on implementing UPS, see [Structured text phonetic pronunciation](how-to-custom-speech-test-and-train.md#structured-text-data-for-training). Structured text phonetic pronunciation data is separate from [pronunciation data](how-to-custom-speech-test-and-train.md#pronunciation-data-for-training), and they cannot be used together. Pronunciation data is "sounds-like" or spoken-form data: it is input as a separate file and trains the model what the spoken form sounds like.
[Structured text phonetic pronunciation data](how-to-custom-speech-test-and-train.md#structured-text-data-for-training) is specified per syllable in a markdown file. Separately, [pronunciation data](how-to-custom-speech-test-and-train.md#pronunciation-data-for-training) is input on its own, and trains the model what the spoken form sounds like. You can either use a pronunciation data file on its own, or you can add pronunciation within a structured text data file. The Speech service doesn't support training a model with both of those datasets as input.
See the sections in this article for the Universal Phone Set for each locale.
articles/cognitive-services/Speech-Service/includes/phonetic-sets/text-to-speech/en-us.md: 2 additions & 2 deletions
@@ -2,8 +2,8 @@
|Example 1 (onset for consonant, word-initial for vowel)|Example 2 (intervocalic for consonant, word-medial nucleus for vowel)|Example 3 (coda for consonant, word-final for vowel)|Comments|
|--|--|--|--|
-| burger /b er **1** r - g ax r/ | falafel /f ax - l aa **1** - f ax l/ | guitar /g ih - t aa **1** r/ | The Speech service phone set puts stress after the vowel of the stressed syllable. |
-| inopportune /ih **2** - n aa - p ax r - t uw 1 n/ | dissimilarity /d ih - s ih **2**- m ax - l eh 1 - r ax - t iy/ | workforce /w er 1 r k - f ao **2** r s/ | The Speech service phone set puts stress after the vowel of the sub-stressed syllable. |
+|:::no-loc text="burger"::: /b er **1** r - g ax r/ |:::no-loc text="falafel"::: /f ax - l aa **1** - f ax l/ |:::no-loc text="guitar"::: /g ih - t aa **1** r/ | The Speech service phone set puts stress after the vowel of the stressed syllable. |
+|:::no-loc text="inopportune"::: /ih **2** - n aa - p ax r - t uw 1 n/ |:::no-loc text="dissimilarity"::: /d ih - s ih **2**- m ax - l eh 1 - r ax - t iy/ |:::no-loc text="workforce"::: /w er 1 r k - f ao **2** r s/ | The Speech service phone set puts stress after the vowel of the sub-stressed syllable. |
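
To make the notation in the rows above concrete, here is a minimal Python sketch of how such a pronunciation string could be split into syllables. It is not part of the Speech service or its SDK: `Syllable` and `parse_pronunciation` are hypothetical names, and the sketch assumes the input is the bare phone string (no surrounding slashes or bold markup), that `-` separates syllables, and that a digit following a vowel marks stress (`1` for the stressed syllable, `2` for the sub-stressed one). Using `0` for "no stress marker" is this sketch's own convention.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Syllable:
    """Hypothetical container for one syllable of a phone string."""
    phones: List[str] = field(default_factory=list)
    stress: int = 0  # assumption: 0 means no stress marker was present


def parse_pronunciation(pronunciation: str) -> List[Syllable]:
    """Split a whitespace-separated phone string into syllables.

    Assumes '-' is a syllable boundary and a digit token is a stress
    marker that applies to the syllable it appears in.
    """
    if not pronunciation.isascii():
        raise ValueError("phone labels are expected to be ASCII strings")

    syllables = [Syllable()]
    for token in pronunciation.split():
        if token == "-":
            # Start a new syllable at each boundary marker.
            syllables.append(Syllable())
        elif token.isdigit():
            # The stress marker comes after the vowel, so a phone must precede it.
            if not syllables[-1].phones:
                raise ValueError("stress marker must follow a phone")
            syllables[-1].stress = int(token)
        else:
            syllables[-1].phones.append(token)
    return syllables


parsed = parse_pronunciation("b er 1 r - g ax r")
print([(s.phones, s.stress) for s in parsed])
# [(['b', 'er', 'r'], 1), (['g', 'ax', 'r'], 0)]
```

Tokenizing on whitespace first, rather than splitting on the literal string `" - "`, keeps the sketch tolerant of uneven spacing between phones.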