Skip to content

Commit 971c954

Browse files
authored
Merge pull request #263750 from MicrosoftDocs/main
Merge main to live, 4 AM
2 parents e3665ef + 4fdac89 commit 971c954

File tree

836 files changed

+2873
-4430
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

836 files changed

+2873
-4430
lines changed

articles/ai-services/speech-service/batch-transcription-create.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -177,7 +177,7 @@ Here are some property options that you can use to configure a transcription whe
177177
|`languageIdentification`|Language identification is used to identify languages spoken in audio when compared against a list of [supported languages](language-support.md?tabs=language-identification).<br/><br/>If you set the `languageIdentification` property, then you must also set its enclosed `candidateLocales` property.|
178178
|`languageIdentification.candidateLocales`|The candidate locales for language identification such as `"properties": { "languageIdentification": { "candidateLocales": ["en-US", "de-DE", "es-ES"]}}`. A minimum of 2 and a maximum of 10 candidate locales, including the main locale for the transcription, is supported.|
179179
|`locale`|The locale of the batch transcription. This should match the expected locale of the audio data to transcribe. The locale can't be changed later.<br/><br/>This property is required.|
180-
|`model`|You can set the `model` property to use a specific base model or [Custom Speech](how-to-custom-speech-train-model.md) model. If you don't specify the `model`, the default base model for the locale is used. For more information, see [Using custom models](#using-custom-models) and [Using Whisper models](#using-whisper-models).|
180+
|`model`|You can set the `model` property to use a specific base model or [custom speech](how-to-custom-speech-train-model.md) model. If you don't specify the `model`, the default base model for the locale is used. For more information, see [Using custom models](#using-custom-models) and [Using Whisper models](#using-whisper-models).|
181181
|`profanityFilterMode`|Specifies how to handle profanity in recognition results. Accepted values are `None` to disable profanity filtering, `Masked` to replace profanity with asterisks, `Removed` to remove all profanity from the result, or `Tags` to add profanity tags. The default value is `Masked`. |
182182
|`punctuationMode`|Specifies how to handle punctuation in recognition results. Accepted values are `None` to disable punctuation, `Dictated` to imply explicit (spoken) punctuation, `Automatic` to let the decoder deal with punctuation, or `DictatedAndAutomatic` to use dictated and automatic punctuation. The default value is `DictatedAndAutomatic`.<br/><br/>This property isn't applicable for Whisper models.|
183183
|`timeToLive`|A duration after the transcription job is created, when the transcription results will be automatically deleted. The value is an ISO 8601 encoded duration. For example, specify `PT12H` for 12 hours. As an alternative, you can call [Transcriptions_Delete](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_Delete) regularly after you retrieve the transcription results.|
@@ -200,7 +200,7 @@ spx help batch transcription create advanced
200200

201201
Batch transcription uses the default base model for the locale that you specify. You don't need to set any properties to use the default base model.
202202

203-
Optionally, you can modify the previous [create transcription example](#create-a-batch-transcription) by setting the `model` property to use a specific base model or [Custom Speech](how-to-custom-speech-train-model.md) model.
203+
Optionally, you can modify the previous [create transcription example](#create-a-batch-transcription) by setting the `model` property to use a specific base model or [custom speech](how-to-custom-speech-train-model.md) model.
204204

205205
::: zone pivot="rest-api"
206206

@@ -231,12 +231,12 @@ spx batch transcription create --name "My Transcription" --language "en-US" --co
231231

232232
::: zone-end
233233

234-
To use a Custom Speech model for batch transcription, you need the model's URI. You can retrieve the model location when you create or get a model. The top-level `self` property in the response body is the model's URI. For an example, see the JSON response example in the [Create a model](how-to-custom-speech-train-model.md?pivots=rest-api#create-a-model) guide.
234+
To use a custom speech model for batch transcription, you need the model's URI. You can retrieve the model location when you create or get a model. The top-level `self` property in the response body is the model's URI. For an example, see the JSON response example in the [Create a model](how-to-custom-speech-train-model.md?pivots=rest-api#create-a-model) guide.
235235

236236
> [!TIP]
237237
> A [hosted deployment endpoint](how-to-custom-speech-deploy-model.md) isn't required to use custom speech with the batch transcription service. You can conserve resources if the [custom speech model](how-to-custom-speech-train-model.md) is only used for batch transcription.
238238
239-
Batch transcription requests for expired models fail with a 4xx error. You want to set the `model` property to a base model or custom model that hasn't yet expired. Otherwise don't include the `model` property to always use the latest base model. For more information, see [Choose a model](how-to-custom-speech-create-project.md#choose-your-model) and [Custom Speech model lifecycle](how-to-custom-speech-model-and-endpoint-lifecycle.md).
239+
Batch transcription requests for expired models fail with a 4xx error. You want to set the `model` property to a base model or custom model that hasn't yet expired. Otherwise don't include the `model` property to always use the latest base model. For more information, see [Choose a model](how-to-custom-speech-create-project.md#choose-your-model) and [custom speech model lifecycle](how-to-custom-speech-model-and-endpoint-lifecycle.md).
240240

241241
## Using Whisper models
242242

articles/ai-services/speech-service/bring-your-own-storage-speech-resource-speech-to-text.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ This article explains in depth how to use a BYOS-enabled Speech resource in all
2424

2525
## Data storage
2626

27-
When using BYOS, the Speech service doesn't keep any customer artifacts after the data processing (transcription, model training, model testing) is complete. However, some metadata that isn't derived from the user content is stored within Speech service premises. For example, in the Custom speech scenario, the Service keeps certain information about the custom endpoints, like which models they use.
27+
When using BYOS, the Speech service doesn't keep any customer artifacts after the data processing (transcription, model training, model testing) is complete. However, some metadata that isn't derived from the user content is stored within Speech service premises. For example, in the custom speech scenario, the Service keeps certain information about the custom endpoints, like which models they use.
2828

2929
BYOS-associated Storage account stores the following data:
3030

@@ -123,23 +123,23 @@ URL of this format ensures that only Microsoft Entra identities (users, service
123123
124124
## Custom speech
125125

126-
With Custom speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for real-time speech to text, speech translation, and batch transcription. For more information, see the [Custom speech overview](custom-speech-overview.md).
126+
With custom speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for real-time speech to text, speech translation, and batch transcription. For more information, see the [custom speech overview](custom-speech-overview.md).
127127

128-
There's nothing specific about how you use Custom speech with BYOS-enabled Speech resource. The only difference is where all custom model related data, which Speech service collects and produces for you, is stored. The data is stored in the following Blob containers of BYOS-associated Storage account:
128+
There's nothing specific about how you use custom speech with BYOS-enabled Speech resource. The only difference is where all custom model related data, which Speech service collects and produces for you, is stored. The data is stored in the following Blob containers of BYOS-associated Storage account:
129129

130-
- `customspeech-models` - Location of Custom speech models
131-
- `customspeech-artifacts` - Location of all other Custom speech related data
130+
- `customspeech-models` - Location of custom speech models
131+
- `customspeech-artifacts` - Location of all other custom speech related data
132132

133133
The Blob container structure is provided for your information only and subject to change without a notice.
134134

135135
> [!CAUTION]
136-
> Speech service relies on pre-defined Blob container paths and file names for Custom speech module to correctly function. Don't move, rename or in any way alter the contents of `customspeech-models` container and Custom speech related folders of `customspeech-artifacts` container.
136+
> Speech service relies on pre-defined Blob container paths and file names for custom speech module to correctly function. Don't move, rename or in any way alter the contents of `customspeech-models` container and custom speech related folders of `customspeech-artifacts` container.
137137
>
138138
> Failure to do so very likely will result in hard to debug errors and may lead to the necessity of custom model retraining.
139139
>
140-
> Use standard tools, like REST API and Speech Studio to interact with the Custom speech related data. See details in [Custom speech section](custom-speech-overview.md).
140+
> Use standard tools, like REST API and Speech Studio to interact with the custom speech related data. See details in [custom speech section](custom-speech-overview.md).
141141
142-
### Use of REST API with Custom speech
142+
### Use of REST API with custom speech
143143

144144
[Speech to text REST API](rest-speech-to-text.md) fully supports BYOS-enabled Speech resources. However, because the data is now stored within the BYOS-enabled Storage account, requests like [Get Dataset Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Datasets_ListFiles) interact with the BYOS-associated Storage account Blob storage, instead of Speech service internal resources. It allows using the same REST API based code for both "regular" and BYOS-enabled Speech resources.
145145

articles/ai-services/speech-service/bring-your-own-storage-speech-resource.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ BYOS can be used with several Azure AI services. For Speech, it can be used in t
2424

2525
- [Batch transcription](batch-transcription.md)
2626
- Real-time transcription with [audio and transcription result logging](logging-audio-transcription.md) enabled
27-
- [Custom Speech](custom-speech-overview.md) (Custom models for Speech recognition)
27+
- [Custom speech](custom-speech-overview.md) (Custom models for Speech recognition)
2828

2929
**Text to speech**
3030

articles/ai-services/speech-service/call-center-overview.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ The Speech service works well with prebuilt models. However, you might want to f
4545

4646
| Speech customization | Description |
4747
| -------------- | ----------- |
48-
| [Custom Speech](./custom-speech-overview.md) | A speech to text feature used to evaluate and improve the speech recognition accuracy of use-case specific entities (such as alpha-numeric customer, case, and contract IDs, license plates, and names). You can also train a custom model with your own product names and industry terminology. |
48+
| [Custom speech](./custom-speech-overview.md) | A speech to text feature used to evaluate and improve the speech recognition accuracy of use-case specific entities (such as alpha-numeric customer, case, and contract IDs, license plates, and names). You can also train a custom model with your own product names and industry terminology. |
4949
| [Custom neural voice](./custom-neural-voice.md) | A text to speech feature that lets you create a one-of-a-kind, customized, synthetic voice for your applications. |
5050

5151
### Language service

articles/ai-services/speech-service/custom-speech-overview.md

Lines changed: 15 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,46 +1,48 @@
11
---
2-
title: Custom Speech overview - Speech service
2+
title: Custom speech overview - Speech service
33
titleSuffix: Azure AI services
4-
description: Custom Speech is a set of online tools that allows you to evaluate and improve the speech to text accuracy for your applications, tools, and products.
4+
description: Custom speech is a set of online tools that allows you to evaluate and improve the speech to text accuracy for your applications, tools, and products.
55
author: eric-urban
66
manager: nitinme
77
ms.service: azure-ai-speech
88
ms.topic: overview
9-
ms.date: 1/18/2024
9+
ms.date: 1/19/2024
1010
ms.author: eur
1111
ms.custom: contperf-fy21q2, references_regions
1212
---
1313

14-
# What is Custom Speech?
14+
# What is custom speech?
1515

16-
With Custom Speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for [real-time speech to text](speech-to-text.md), [speech translation](speech-translation.md), and [batch transcription](batch-transcription.md).
16+
With custom speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for [real-time speech to text](speech-to-text.md), [speech translation](speech-translation.md), and [batch transcription](batch-transcription.md).
1717

1818
Out of the box, speech recognition utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pre-trained with dialects and phonetics representing various common domains. When you make a speech recognition request, the most recent base model for each [supported language](language-support.md?tabs=stt) is used by default. The base model works well in most speech recognition scenarios.
1919

20-
A custom model can be used to augment the base model to improve recognition of domain-specific vocabulary specific to the application by providing text data to train the model. It can also be used to improve recognition based for the specific audio conditions of the application by providing audio data with reference transcriptions.
20+
A custom model can be used to augment the base model to improve recognition of domain-specific vocabulary specific to the application by providing text data to train the model. It can also be used to improve recognition based for the specific audio conditions of the application by providing audio data with reference transcriptions.
21+
22+
You can also train a model with structured text when the data follows a pattern, to specify custom pronunciations, and to customize display text formatting with custom inverse text normalization, custom rewrite, and custom profanity filtering.
2123

2224
## How does it work?
2325

24-
With Custom Speech, you can upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint.
26+
With custom speech, you can upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint.
2527

26-
![Diagram that highlights the components that make up the Custom Speech area of the Speech Studio.](./media/custom-speech/custom-speech-overview.png)
28+
![Diagram that highlights the components that make up the custom speech area of the Speech Studio.](./media/custom-speech/custom-speech-overview.png)
2729

2830
Here's more information about the sequence of steps shown in the previous diagram:
2931

30-
1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" title="Create a Speech resource" target="_blank">Speech resource</a> that you create in the Azure portal. If you train a custom model with audio data, choose a Speech resource region with dedicated hardware for training audio data. See footnotes in the [regions](regions.md#speech-service) table for more information.
32+
1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" title="Create a Speech resource" target="_blank">Speech resource</a> that you create in the Azure portal. If you train a custom model with audio data, choose a Speech resource region with dedicated hardware for training audio data. For more information, see footnotes in the [regions](regions.md#speech-service) table.
3133
1. [Upload test data](./how-to-custom-speech-upload-data.md). Upload test data to evaluate the speech to text offering for your applications, tools, and products.
3234
1. [Test recognition quality](how-to-custom-speech-inspect-data.md). Use the [Speech Studio](https://aka.ms/speechstudio/customspeech) to play back uploaded audio and inspect the speech recognition quality of your test data.
3335
1. [Test model quantitatively](how-to-custom-speech-evaluate-data.md). Evaluate and improve the accuracy of the speech to text model. The Speech service provides a quantitative word error rate (WER), which you can use to determine if more training is required.
3436
1. [Train a model](how-to-custom-speech-train-model.md). Provide written transcripts and related text, along with the corresponding audio data. Testing a model before and after training is optional but recommended.
3537
> [!NOTE]
36-
> You pay for Custom Speech model usage and [endpoint hosting](how-to-custom-speech-deploy-model.md). You'll also be charged for custom speech model training if the base model was created on October 1, 2023 and later. You are not charged for training if the base model was created prior to October 2023. For more information, see [Azure AI Speech pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/) and the [Charge for adaptation section in the speech to text 3.2 migration guide](./migrate-v3-1-to-v3-2.md#charge-for-adaptation).
37-
1. [Deploy a model](how-to-custom-speech-deploy-model.md). Once you're satisfied with the test results, deploy the model to a custom endpoint. Except for [batch transcription](batch-transcription.md), you must deploy a custom endpoint to use a Custom Speech model.
38+
> You pay for custom speech model usage and [endpoint hosting](how-to-custom-speech-deploy-model.md). You'll also be charged for custom speech model training if the base model was created on October 1, 2023 and later. You are not charged for training if the base model was created prior to October 2023. For more information, see [Azure AI Speech pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/) and the [Charge for adaptation section in the speech to text 3.2 migration guide](./migrate-v3-1-to-v3-2.md#charge-for-adaptation).
39+
1. [Deploy a model](how-to-custom-speech-deploy-model.md). Once you're satisfied with the test results, deploy the model to a custom endpoint. Except for [batch transcription](batch-transcription.md), you must deploy a custom endpoint to use a custom speech model.
3840
> [!TIP]
39-
> A hosted deployment endpoint isn't required to use Custom Speech with the [Batch transcription API](batch-transcription.md). You can conserve resources if the custom speech model is only used for batch transcription. For more information, see [Speech service pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/).
41+
> A hosted deployment endpoint isn't required to use custom speech with the [Batch transcription API](batch-transcription.md). You can conserve resources if the custom speech model is only used for batch transcription. For more information, see [Speech service pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/).
4042
4143
## Responsible AI
4244

43-
An AI system includes not only the technology, but also the people who use it, the people who will be affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.
45+
An AI system includes not only the technology, but also the people who use it, the people who are affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.
4446

4547
* [Transparency note and use cases](/legal/cognitive-services/speech-service/speech-to-text/transparency-note?context=/azure/ai-services/speech-service/context/context)
4648
* [Characteristics and limitations](/legal/cognitive-services/speech-service/speech-to-text/characteristics-and-limitations?context=/azure/ai-services/speech-service/context/context)

0 commit comments

Comments
 (0)