Commit cbece6d

Merge pull request #95278 from markamos/v-ammark-seo-2
[Cog Svcs] Metadata Description too long
2 parents 326c6d7 + b9596d0 commit cbece6d

10 files changed: +307 -308 lines changed

articles/cognitive-services/Speech-Service/call-center-transcription.md

Lines changed: 54 additions & 47 deletions
Large diffs are not rendered by default.

articles/cognitive-services/Speech-Service/conversation-transcription.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
22
title: What is Conversation Transcription (Preview) - Speech Service
33
titleSuffix: Azure Cognitive Services
4-
description: Conversation Transcription is a speech-to-text solution that combines speech recognition, speaker identification, and sentence attribution to each speaker (also known as diarization) to provide real-time and/or asynchronous transcription of any conversation. Conversation Transcription makes conversations inclusive for everyone, such as participants who are deaf and hard of hearing.
4+
description: Conversation Transcription is a speech-to-text solution that combines speech recognition, speaker identification, and sentence attribution to each speaker (also known as diarization) to provide real-time and/or asynchronous transcription of any conversation.
55
services: cognitive-services
66
author: markamos
77
manager: nitinme
Lines changed: 76 additions & 75 deletions
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,11 @@
11
---
2-
title: "Human-labeled transcriptions guidelines - Speech Service"
2+
title: Human-labeled transcriptions guidelines - Speech Service
33
titleSuffix: Azure Cognitive Services
4-
description: "If you're looking to improve recognition accuracy, especially issues that are caused when words are deleted or incorrectly substituted, you'll want to use human-labeled transcriptions along with your audio data. What are human-labeled transcriptions? That's easy, they're word-by-word, verbatim transcriptions of an audio file."
4+
description: To improve speech recognition accuracy, such as when words are deleted or incorrectly substituted, you can use human-labeled transcriptions along with your audio data. Human-labeled transcriptions are word-by-word, verbatim transcriptions of an audio file.
55
services: cognitive-services
66
author: erhopf
77
manager: nitinme
8+
89
ms.service: cognitive-services
910
ms.subservice: speech-service
1011
ms.topic: conceptual
@@ -25,7 +26,7 @@ Human-labeled transcriptions for English audio must be provided as plain text, o
2526
Here are a few examples:
2627

2728
| Characters to avoid | Substitution | Notes |
28-
|---------------------|--------------|-------|
29+
| ------------------- | ------------ | ----- |
2930
| “Hello world” | "Hello world" | The opening and closing quotations marks have been substituted with appropriate ASCII characters. |
3031
| John’s day | John's day | The apostrophe has been substituted with the appropriate ASCII character. |
3132
| it was good—no, it was great! | it was good--no, it was great! | The em dash was substituted with two hyphens. |
@@ -34,88 +35,88 @@ Here are a few examples:
3435

3536
Text normalization is the transformation of words into a consistent format used when training a model. Some normalization rules are applied to text automatically, however, we recommend using these guidelines as you prepare your human-labeled transcription data:
3637

37-
* Write out abbreviations in words.
38-
* Write out non-standard numeric strings in words (such as accounting terms).
39-
* Non-alphabetic characters or mixed alphanumeric characters should be transcribed as pronounced.
40-
* Abbreviations that are pronounced as words shouldn't be edited (such as "radar", "laser", "RAM", or "NATO").
41-
* Write out abbreviations that are pronounced as separate letters with each letter separated by a space.
38+
- Write out abbreviations in words.
39+
- Write out non-standard numeric strings in words (such as accounting terms).
40+
- Non-alphabetic characters or mixed alphanumeric characters should be transcribed as pronounced.
41+
- Abbreviations that are pronounced as words shouldn't be edited (such as "radar", "laser", "RAM", or "NATO").
42+
- Write out abbreviations that are pronounced as separate letters with each letter separated by a space.
4243

4344
Here are a few examples of normalization that you should perform on the transcription:
4445

45-
| Original text | Text after normalization |
46-
|---------------|--------------------------|
47-
| Dr. Bruce Banner | Doctor Bruce Banner |
48-
| James Bond, 007 | James Bond, double oh seven |
49-
| Ke$ha | Kesha |
50-
| How long is the 2x4 | How long is the two by four |
46+
| Original text | Text after normalization |
47+
| --------------------------- | ------------------------------------- |
48+
| Dr. Bruce Banner | Doctor Bruce Banner |
49+
| James Bond, 007 | James Bond, double oh seven |
50+
| Ke$ha | Kesha |
51+
| How long is the 2x4 | How long is the two by four |
5152
| The meeting goes from 1-3pm | The meeting goes from one to three pm |
52-
| My blood type is O+ | My blood type is O positive |
53-
| Water is H20 | Water is H 2 O |
54-
| Play OU812 by Van Halen | Play O U 8 1 2 by Van Halen |
55-
| UTF-8 with BOM | U T F 8 with BOM |
53+
| My blood type is O+ | My blood type is O positive |
54+
| Water is H20 | Water is H 2 O |
55+
| Play OU812 by Van Halen | Play O U 8 1 2 by Van Halen |
56+
| UTF-8 with BOM | U T F 8 with BOM |
5657

5758
The following normalization rules are automatically applied to transcriptions:
5859

59-
* Use lowercase letters.
60-
* Remove all punctuation except apostrophes within words.
61-
* Expand numbers into words/spoken form, such as dollar amounts.
60+
- Use lowercase letters.
61+
- Remove all punctuation except apostrophes within words.
62+
- Expand numbers into words/spoken form, such as dollar amounts.
6263

6364
Here are a few examples of normalization automatically performed on the transcription:
6465

65-
| Original text | Text after normalization |
66-
|---------------|--------------------------|
67-
| "Holy cow!" said Batman. | holy cow said batman |
66+
| Original text | Text after normalization |
67+
| -------------------------------------- | --------------------------------- |
68+
| "Holy cow!" said Batman. | holy cow said batman |
6869
| "What?" said Batman's sidekick, Robin. | what said batman's sidekick robin |
69-
| Go get -em! | go get em |
70-
| I'm double-jointed | I'm double jointed |
71-
| 104 Elm Street | one oh four Elm street |
72-
| Tune to 102.7 | tune to one oh two point seven |
73-
| Pi is about 3.14 | pi is about three point one four |
74-
It costs $3.14| it costs three fourteen |
70+
| Go get -em! | go get em |
71+
| I'm double-jointed | I'm double jointed |
72+
| 104 Elm Street | one oh four Elm street |
73+
| Tune to 102.7 | tune to one oh two point seven |
74+
| Pi is about 3.14 | pi is about three point one four |
75+
| It costs \$3.14 | it costs three fourteen |
7576

7677
## Mandarin Chinese (zh-CN)
7778

7879
Human-labeled transcriptions for Mandarin Chinese audio must be UTF-8 encoded with a byte-order marker. Avoid the use of half-width punctuation characters. These characters can be included inadvertently when you prepare the data in a word-processing program or scrape data from web pages. If these characters are present, make sure to update them with the appropriate full-width substitution.
7980

8081
Here are a few examples:
8182

82-
| Characters to avoid | Substitution | Notes |
83-
|---------------------|--------------|-------|
83+
| Characters to avoid | Substitution | Notes |
84+
| ------------------- | -------------- | ----- |
8485
| "你好" | "你好" | The opening and closing quotations marks have been substituted with appropriate characters. |
85-
| 需要什么帮助? | 需要什么帮助? | The question mark has been substituted with appropriate character. |
86+
| 需要什么帮助? | 需要什么帮助?| The question mark has been substituted with appropriate character. |
8687

8788
### Text normalization for Mandarin Chinese
8889

8990
Text normalization is the transformation of words into a consistent format used when training a model. Some normalization rules are applied to text automatically, however, we recommend using these guidelines as you prepare your human-labeled transcription data:
9091

91-
* Write out abbreviations in words.
92-
* Write out numeric strings in spoken form.
92+
- Write out abbreviations in words.
93+
- Write out numeric strings in spoken form.
9394

9495
Here are a few examples of normalization that you should perform on the transcription:
9596

9697
| Original text | Text after normalization |
97-
|---------------|--------------------------|
98-
| 我今年21 | 我今年二十一 |
99-
| 3号楼504 | 三号 楼 五 零 四 |
98+
| ------------- | ------------------------ |
99+
| 我今年 21 | 我今年二十一 |
100+
| 3 号楼 504 | 三号 楼 五 零 四 |
100101

101102
The following normalization rules are automatically applied to transcriptions:
102103

103-
* Remove all punctuation
104-
* Expand numbers to spoken form
105-
* Convert full-width letters to half-width letters
106-
* Using uppercase letters for all English words
104+
- Remove all punctuation
105+
- Expand numbers to spoken form
106+
- Convert full-width letters to half-width letters
107+
- Using uppercase letters for all English words
107108

108109
Here are a few examples of normalization automatically performed on the transcription:
109110

110111
| Original text | Text after normalization |
111-
|---------------|--------------------------|
112+
| ------------- | ------------------------ |
112113
| 3.1415 | 三 点 一 四 一 五 |
113-
| ¥3.5 | 三 元 五 角 |
114-
| w f y z |W F Y Z |
115-
| 1992年8月8日 | 一 九 九 二 年 八 月 八 日 |
114+
| 3.5 | 三 元 五 角 |
115+
| w f y z | W F Y Z |
116+
| 1992 年 8 月 8 日 | 一 九 九 二 年 八 月 八 日 |
116117
| 你吃饭了吗? | 你 吃饭 了 吗 |
117-
| 下午5:00的航班 | 下午 五点 的 航班 |
118-
| 我今年21岁 | 我 今年 二十 一 岁 |
118+
| 下午 5:00 的航班 | 下午 五点 的 航班 |
119+
| 我今年 21 岁 | 我 今年 二十 一 岁 |
119120

120121
## German (de-DE) and other languages
121122

@@ -125,42 +126,42 @@ Human-labeled transcriptions for German audio (and other non-English or Mandarin
125126

126127
Text normalization is the transformation of words into a consistent format used when training a model. Some normalization rules are applied to text automatically, however, we recommend using these guidelines as you prepare your human-labeled transcription data:
127128

128-
* Write decimal points as "," and not ".".
129-
* Write time separators as ":" and not "." (for example: 12:00 Uhr).
130-
* Abbreviations such as "ca." aren't replaced. We recommend that you use the full spoken form.
131-
* The four main mathematical operators (+, -, \*, and /) are removed. We recommend replacing them with the written form: "plus," "minus," "mal," and "geteilt."
132-
* Comparison operators are removed (=, <, and >). We recommend replacing them with "gleich," "kleiner als," and "grösser als."
133-
* Write fractions, such as 3/4, in written form (for example: "drei viertel" instead of 3/4).
134-
* Replace the "€" symbol with its written form "Euro."
129+
- Write decimal points as "," and not ".".
130+
- Write time separators as ":" and not "." (for example: 12:00 Uhr).
131+
- Abbreviations such as "ca." aren't replaced. We recommend that you use the full spoken form.
132+
- The four main mathematical operators (+, -, \*, and /) are removed. We recommend replacing them with the written form: "plus," "minus," "mal," and "geteilt."
133+
- Comparison operators are removed (=, <, and >). We recommend replacing them with "gleich," "kleiner als," and "grösser als."
134+
- Write fractions, such as 3/4, in written form (for example: "drei viertel" instead of 3/4).
135+
- Replace the "€" symbol with its written form "Euro."
135136

136137
Here are a few examples of normalization that you should perform on the transcription:
137138

138-
| Original text | Text after user normalization | Text after system normalization |
139-
|---------------|-------------------------------|---------------------------------|
140-
| Es ist 12.23 Uhr | Es ist 12:23 Uhr | es ist zwölf uhr drei und zwanzig uhr |
141-
| {12.45} | {12,45} | zwölf komma vier fünf |
142-
| 2 + 3 - 4 | 2 plus 3 minus 4 | zwei plus drei minus vier |
139+
| Original text | Text after user normalization | Text after system normalization |
140+
| ---------------- | ----------------------------- | ------------------------------------- |
141+
| Es ist 12.23 Uhr | Es ist 12:23 Uhr | es ist zwölf uhr drei und zwanzig uhr |
142+
| {12.45} | {12,45} | zwölf komma vier fünf |
143+
| 2 + 3 - 4 | 2 plus 3 minus 4 | zwei plus drei minus vier |
143144

144145
The following normalization rules are automatically applied to transcriptions:
145146

146-
* Use lowercase letters for all text.
147-
* Remove all punctuation, including various types of quotation marks ("test", 'test', "test„, and «test» are OK).
148-
* Discard rows with any special characters from this set: ¢ ¤ ¥ ¦ § © ª ¬ ® ° ± ² µ × ÿ ج¬.
149-
* Expand numbers to spoken form, including dollar or Euro amounts.
150-
* Accept umlauts only for a, o, and u. Others will be replaced by "th" or be discarded.
147+
- Use lowercase letters for all text.
148+
- Remove all punctuation, including various types of quotation marks ("test", 'test', "test„, and «test» are OK).
149+
- Discard rows with any special characters from this set: ¢ ¤ ¥ ¦ § © ª ¬ ® ° ± ² µ × ÿ ج¬.
150+
- Expand numbers to spoken form, including dollar or Euro amounts.
151+
- Accept umlauts only for a, o, and u. Others will be replaced by "th" or be discarded.
151152

152153
Here are a few examples of normalization automatically performed on the transcription:
153154

154-
| Original text | Text after normalization |
155-
|---------------|--------------------------|
156-
| Frankfurter Ring | frankfurter ring |
157-
| ¡Eine Frage! | eine frage |
158-
| wir, haben | wir haben |
155+
| Original text | Text after normalization |
156+
| ---------------- | ------------------------ |
157+
| Frankfurter Ring | frankfurter ring |
158+
| ¡Eine Frage! | eine frage |
159+
| wir, haben | wir haben |
159160

160161
## Next Steps
161162

162-
* [Prepare and test your data](how-to-custom-speech-test-data.md)
163-
* [Inspect your data](how-to-custom-speech-inspect-data.md)
164-
* [Evaluate your data](how-to-custom-speech-evaluate-data.md)
165-
* [Train your model](how-to-custom-speech-train-model.md)
166-
* [Deploy your model](how-to-custom-speech-deploy-model.md)
163+
- [Prepare and test your data](how-to-custom-speech-test-data.md)
164+
- [Inspect your data](how-to-custom-speech-inspect-data.md)
165+
- [Evaluate your data](how-to-custom-speech-evaluate-data.md)
166+
- [Train your model](how-to-custom-speech-train-model.md)
167+
- [Deploy your model](how-to-custom-speech-deploy-model.md)
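
Reviewer note: the "Characters to avoid" table for English in the diff above lists three ASCII substitutions (curly quotation marks, the curly apostrophe, and the em dash). A minimal sketch of how they could be applied to a transcript before upload follows. It is not part of this commit or of any Speech Service tooling; the name `to_ascii_punctuation` and the mapping are assumptions made for illustration, and the mapping covers only the characters shown in that table plus the left single quotation mark.

```python
# Hypothetical helper: not part of the commit or of any Azure SDK.
# Maps the non-ASCII punctuation from the "Characters to avoid" table
# to the ASCII substitutions the guidelines recommend.
SUBSTITUTIONS = {
    "\u201c": '"',   # left double quotation mark  -> ASCII double quote
    "\u201d": '"',   # right double quotation mark -> ASCII double quote
    "\u2018": "'",   # left single quotation mark  -> ASCII apostrophe (assumed)
    "\u2019": "'",   # right single quotation mark -> ASCII apostrophe
    "\u2014": "--",  # em dash -> two hyphens
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(SUBSTITUTIONS)


def to_ascii_punctuation(text: str) -> str:
    """Return text with the listed punctuation replaced by ASCII forms."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(to_ascii_punctuation("it was good\u2014no, it was great!"))
```

Using a translation table keeps the mapping in one place, so further characters can be added without changing the function.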

articles/cognitive-services/Speech-Service/how-to-custom-speech-inspect-data.md

Lines changed: 8 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
2-
title: "Inspect data quality for Custom Speech - Speech Service"
2+
title: Inspect data quality for Custom Speech - Speech Service
33
titleSuffix: Azure Cognitive Services
4-
description: "Custom Speech provides tools that allow you to visually inspect the recognition quality of a model by comparing audio data with the corresponding recognition result. From the Custom Speech portal, you can play back uploaded audio and determine if the provided recognition result is correct. This tool allows you to quickly inspect quality of our baseline speech-to-text model or a trained custom model without having to transcribe any audio data."
4+
description: Custom Speech provides tools that allow you to visually inspect the recognition quality of a model by comparing audio data with the corresponding recognition result. You can play back uploaded audio and determine if the provided recognition result is correct.
55
services: cognitive-services
66
author: erhopf
77
manager: nitinme
@@ -38,18 +38,18 @@ After a test has been successfully created, you can compare the models side by s
3838

3939
## Side-by-side model comparisons
4040

41-
When the test status is *Succeeded*, click in the test item name to see details of the test. This detail page lists all the utterances in your dataset, indicating the recognition results of the two models alongside the transcription from the submitted dataset.
41+
When the test status is _Succeeded_, click in the test item name to see details of the test. This detail page lists all the utterances in your dataset, indicating the recognition results of the two models alongside the transcription from the submitted dataset.
4242

4343
To help inspect the side-by-side comparison, you can toggle various error types including insertion, deletion, and substitution. By listening to the audio and comparing recognition results in each column (showing human-labeled transcription and the results of two speech-to-text models), you can decide which model meets your needs and where improvements are needed.
4444

45-
Inspecting quality testing is useful to validate if the quality of a speech recognition endpoint is enough for an application. For an objective measure of accuracy, requiring transcribed audio, follow the instructions found in [Evaluate Accuracy](how-to-custom-speech-evaluate-data.md).
45+
Inspecting quality testing is useful to validate if the quality of a speech recognition endpoint is enough for an application. For an objective measure of accuracy, requiring transcribed audio, follow the instructions found in [Evaluate Accuracy](how-to-custom-speech-evaluate-data.md).
4646

4747
## Next steps
4848

49-
* [Evaluate your data](how-to-custom-speech-evaluate-data.md)
50-
* [Train your model](how-to-custom-speech-train-model.md)
51-
* [Deploy your model](how-to-custom-speech-deploy-model.md)
49+
- [Evaluate your data](how-to-custom-speech-evaluate-data.md)
50+
- [Train your model](how-to-custom-speech-train-model.md)
51+
- [Deploy your model](how-to-custom-speech-deploy-model.md)
5252

5353
## Additional resources
5454

55-
* [Prepare test data for Custom Speech](how-to-custom-speech-test-data.md)
55+
- [Prepare test data for Custom Speech](how-to-custom-speech-test-data.md)
