Commit 1dedaca

Merge branch 'main' of https://github.com/MicrosoftDocs/azure-docs-pr into perfUpdate
2 parents 3165f9c + 5a87b03 commit 1dedaca

78 files changed, +217 -277 lines changed

articles/ai-services/speech-service/custom-speech-overview.md

Lines changed: 6 additions & 4 deletions
@@ -6,7 +6,7 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 1/18/2024
+ms.date: 1/19/2024
 ms.author: eur
 ms.custom: contperf-fy21q2, references_regions
 ---
@@ -17,7 +17,9 @@ With Custom Speech, you can evaluate and improve the accuracy of speech recognit

 Out of the box, speech recognition utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pre-trained with dialects and phonetics representing various common domains. When you make a speech recognition request, the most recent base model for each [supported language](language-support.md?tabs=stt) is used by default. The base model works well in most speech recognition scenarios.

-A custom model can be used to augment the base model to improve recognition of domain-specific vocabulary specific to the application by providing text data to train the model. It can also be used to improve recognition based for the specific audio conditions of the application by providing audio data with reference transcriptions.
+A custom model can be used to augment the base model to improve recognition of domain-specific vocabulary specific to the application by providing text data to train the model. It can also be used to improve recognition based for the specific audio conditions of the application by providing audio data with reference transcriptions.
+
+You can also train a model with structured text when the data follows a pattern, to specify custom pronunciations, and to customize display text formatting with custom inverse text normalization, custom rewrite, and custom profanity filtering.

 ## How does it work?

@@ -27,7 +29,7 @@ With Custom Speech, you can upload your own data, test and train a custom model,

 Here's more information about the sequence of steps shown in the previous diagram:

-1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" title="Create a Speech resource" target="_blank">Speech resource</a> that you create in the Azure portal. If you train a custom model with audio data, choose a Speech resource region with dedicated hardware for training audio data. See footnotes in the [regions](regions.md#speech-service) table for more information.
+1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" title="Create a Speech resource" target="_blank">Speech resource</a> that you create in the Azure portal. If you train a custom model with audio data, choose a Speech resource region with dedicated hardware for training audio data. For more information, see footnotes in the [regions](regions.md#speech-service) table.
 1. [Upload test data](./how-to-custom-speech-upload-data.md). Upload test data to evaluate the speech to text offering for your applications, tools, and products.
 1. [Test recognition quality](how-to-custom-speech-inspect-data.md). Use the [Speech Studio](https://aka.ms/speechstudio/customspeech) to play back uploaded audio and inspect the speech recognition quality of your test data.
 1. [Test model quantitatively](how-to-custom-speech-evaluate-data.md). Evaluate and improve the accuracy of the speech to text model. The Speech service provides a quantitative word error rate (WER), which you can use to determine if more training is required.
@@ -40,7 +42,7 @@ Here's more information about the sequence of steps shown in the previous diagra

 ## Responsible AI

-An AI system includes not only the technology, but also the people who use it, the people who will be affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.
+An AI system includes not only the technology, but also the people who use it, the people who are affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.

 * [Transparency note and use cases](/legal/cognitive-services/speech-service/speech-to-text/transparency-note?context=/azure/ai-services/speech-service/context/context)
 * [Characteristics and limitations](/legal/cognitive-services/speech-service/speech-to-text/characteristics-and-limitations?context=/azure/ai-services/speech-service/context/context)

articles/ai-services/speech-service/how-to-custom-speech-continuous-integration-continuous-deployment.md

Lines changed: 7 additions & 7 deletions
@@ -6,7 +6,7 @@ author: nitinme
 manager: cmayomsft
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 05/08/2022
+ms.date: 1/19/2024
 ms.author: nitinme
 ---

@@ -24,22 +24,22 @@ Custom CI/CD solutions are possible, but for a robust, pre-built solution, use t

 The purpose of these workflows is to ensure that each Custom Speech model has better recognition accuracy than the previous build. If the updates to the testing and/or training data improve the accuracy, these workflows create a new Custom Speech endpoint.

-Git servers such as GitHub and Azure DevOps can run automated workflows when specific Git events happen, such as merges or pull requests. For example, a CI workflow can be triggered when updates to testing data are pushed to the *main* branch. Different Git Servers will have different tooling, but will allow scripting command-line interface (CLI) commands so that they can execute on a build server.
+Git servers such as GitHub and Azure DevOps can run automated workflows when specific Git events happen, such as merges or pull requests. For example, a CI workflow can be triggered when updates to testing data are pushed to the *main* branch. Different Git Servers have different tooling, but allow scripting command-line interface (CLI) commands so that they can execute on a build server.

-Along the way, the workflows should name and store data, tests, test files, models, and endpoints such that they can be traced back to the commit or version they came from. It is also helpful to name these assets so that it is easy to see which were created after updating testing data versus training data.
+Along the way, the workflows should name and store data, tests, test files, models, and endpoints such that they can be traced back to the commit or version they came from. It's also helpful to name these assets so that it's easy to see which were created after updating testing data versus training data.

 ### CI workflow for testing data updates

-The principal purpose of the CI/CD workflows is to build a new model using the training data, and to test that model using the testing data to establish whether the [Word Error Rate](how-to-custom-speech-evaluate-data.md#evaluate-word-error-rate-wer) (WER) has improved compared to the previous best-performing model (the "benchmark model"). If the new model performs better, it becomes the new benchmark model against which future models are compared.
+The principal purpose of the CI/CD workflows is to build a new model using the training data, and to test that model using the testing data to establish whether the [Word Error Rate](how-to-custom-speech-evaluate-data.md#evaluate-word-error-rate-wer) (WER) improved compared to the previous best-performing model (the "benchmark model"). If the new model performs better, it becomes the new benchmark model against which future models are compared.

-The CI workflow for testing data updates should retest the current benchmark model with the updated test data to calculate the revised WER. This ensures that when the WER of a new model is compared to the WER of the benchmark, both models have been tested against the same test data and you're comparing like with like.
+The CI workflow for testing data updates should retest the current benchmark model with the updated test data to calculate the revised WER. This ensures that when the WER of a new model is compared to the WER of the benchmark, both models were tested against the same test data and you're comparing like with like.

 This workflow should trigger on updates to testing data and:

 - Test the benchmark model against the updated testing data.
 - Store the test output, which contains the WER of the benchmark model, using the updated data.
 - The WER from these tests will become the new benchmark WER that future models must beat.
-- The CD workflow does not execute for updates to testing data.
+- The CD workflow doesn't execute for updates to testing data.

 ### CI workflow for training data updates

@@ -51,7 +51,7 @@ This workflow should trigger on updates to training data and:
 - Test the new model against the testing data.
 - Store the test output, which contains the WER.
 - Compare the WER from the new model to the WER from the benchmark model.
-- If the WER does not improve, stop the workflow.
+- If the WER doesn't improve, stop the workflow.
 - If the WER improves, execute the CD workflow to create a Custom Speech endpoint.

 ### CD workflow

articles/ai-services/speech-service/how-to-custom-speech-create-project.md

Lines changed: 6 additions & 6 deletions
@@ -6,7 +6,7 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 11/29/2022
+ms.date: 1/19/2024
 ms.author: eur
 zone_pivot_groups: speech-studio-cli-rest
 ---
@@ -30,7 +30,7 @@ To create a Custom Speech project, follow these steps:
 1. Select **Custom speech** > **Create a new project**.
 1. Follow the instructions provided by the wizard to create your project.

-Select the new project by name or select **Go to project**. You will see these menu items in the left panel: **Speech datasets**, **Train custom models**, **Test models**, and **Deploy models**.
+Select the new project by name or select **Go to project**. You'll see these menu items in the left panel: **Speech datasets**, **Train custom models**, **Test models**, and **Deploy models**.

 ::: zone-end

@@ -39,7 +39,7 @@ Select the new project by name or select **Go to project**. You will see these m
 To create a project, use the `spx csr project create` command. Construct the request parameters according to the following instructions:

 - Set the required `language` parameter. The locale of the project and the contained datasets should be the same. The locale can't be changed later. The Speech CLI `language` parameter corresponds to the `locale` property in the JSON request and response.
-- Set the required `name` parameter. This is the name that will be displayed in the Speech Studio. The Speech CLI `name` parameter corresponds to the `displayName` property in the JSON request and response.
+- Set the required `name` parameter. This is the name that is displayed in the Speech Studio. The Speech CLI `name` parameter corresponds to the `displayName` property in the JSON request and response.

 Here's an example Speech CLI command that creates a project:

@@ -88,7 +88,7 @@ spx help csr project
 To create a project, use the [Projects_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Projects_Create) operation of the [Speech to text REST API](rest-speech-to-text.md). Construct the request body according to the following instructions:

 - Set the required `locale` property. This should be the locale of the contained datasets. The locale can't be changed later.
-- Set the required `displayName` property. This is the project name that will be displayed in the Speech Studio.
+- Set the required `displayName` property. This is the project name that is displayed in the Speech Studio.

 Make an HTTP POST request using the URI as shown in the following [Projects_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Projects_Create) example. Replace `YourSubscriptionKey` with your Speech resource key, replace `YourServiceRegion` with your Speech resource region, and set the request body properties as previously described.

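Reviewer note: the article's own request example is cut off by the hunk boundary, so here is a rough Python sketch of the POST request this hunk describes. The URI pattern and the `Ocp-Apim-Subscription-Key` header are assumptions based on the v3.1 Speech to text REST API and should be checked against the Projects_Create reference. The helper name `build_create_project_request` is hypothetical. The sketch only builds the request and doesn't send it.

```python
import json
import urllib.request


def build_create_project_request(
    region: str, key: str, locale: str, display_name: str
) -> urllib.request.Request:
    """Build (but don't send) a Projects_Create request; the URI shape is an assumption."""
    url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.1/projects"
    # The two required properties named in the article: locale and displayName.
    body = json.dumps({"locale": locale, "displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # assumed header name for the resource key
            "Content-Type": "application/json",
        },
    )


# Placeholders from the article; replace them with real values before sending.
request = build_create_project_request(
    "YourServiceRegion", "YourSubscriptionKey", "en-US", "My Project"
)
```

To submit the request, you could pass it to `urllib.request.urlopen(request)` once the placeholders hold real values.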
@@ -137,13 +137,13 @@ There are a few approaches to using Custom Speech models:
 - A custom model augments the base model to include domain-specific vocabulary shared across all areas of the custom domain.
 - Multiple custom models can be used when the custom domain has multiple areas, each with a specific vocabulary.

-One recommended way to see if the base model will suffice is to analyze the transcription produced from the base model and compare it with a human-generated transcript for the same audio. You can compare the transcripts and obtain a [word error rate (WER)](how-to-custom-speech-evaluate-data.md#evaluate-word-error-rate-wer) score. If the WER score is high, training a custom model to recognize the incorrectly identified words is recommended.
+One recommended way to see if the base model suffices is to analyze the transcription produced from the base model and compare it with a human-generated transcript for the same audio. You can compare the transcripts and obtain a [word error rate (WER)](how-to-custom-speech-evaluate-data.md#evaluate-word-error-rate-wer) score. If the WER score is high, training a custom model to recognize the incorrectly identified words is recommended.

 Multiple models are recommended if the vocabulary varies across the domain areas. For instance, Olympic commentators report on various events, each associated with its own vernacular. Because each Olympic event vocabulary differs significantly from others, building a custom model specific to an event increases accuracy by limiting the utterance data relative to that particular event. As a result, the model doesn't need to sift through unrelated data to make a match. Regardless, training still requires a decent variety of training data. Include audio from various commentators who have different accents, gender, age, etcetera.

 ## Model stability and lifecycle

-A base model or custom model deployed to an endpoint using Custom Speech is fixed until you decide to update it. The speech recognition accuracy and quality will remain consistent, even when a new base model is released. This allows you to lock in the behavior of a specific model until you decide to use a newer model.
+A base model or custom model deployed to an endpoint using Custom Speech is fixed until you decide to update it. The speech recognition accuracy and quality remain consistent, even when a new base model is released. This allows you to lock in the behavior of a specific model until you decide to use a newer model.

 Whether you train your own model or use a snapshot of a base model, you can use the model for a limited time. For more information, see [Model and endpoint lifecycle](./how-to-custom-speech-model-and-endpoint-lifecycle.md).

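Reviewer note: the transcript comparison described in this hunk can be illustrated with the textbook WER definition: substitutions, deletions, and insertions divided by the number of words in the human reference transcript. The Python sketch below uses a word-level edit distance. It assumes whitespace tokenization and applies no text normalization, so its scores might differ from what the Speech service reports. The function name `word_error_rate` is illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the processed reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

With the reference "the cat sat on the mat" and the base model output "the cat sit on mat", there is one substitution and one deletion across six reference words, so the function returns 2/6.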