To use a Custom Speech model for batch transcription, you need the model's URI. You can retrieve the model location when you create or get a model. The top-level `self` property in the response body is the model's URI. For an example, see the JSON response example in the [Create a model](how-to-custom-speech-train-model.md?pivots=rest-api#create-a-model) guide.
> [!TIP]
> A [hosted deployment endpoint](how-to-custom-speech-deploy-model.md) isn't required to use custom speech with the batch transcription service. You can conserve resources if the [custom speech model](how-to-custom-speech-train-model.md) is only used for batch transcription.
Batch transcription requests for expired models fail with a 4xx error. Set the `model` property to a base model or custom model that hasn't yet expired. Otherwise, omit the `model` property to always use the latest base model. For more information, see [Choose a model](how-to-custom-speech-create-project.md#choose-your-model) and [Custom Speech model lifecycle](how-to-custom-speech-model-and-endpoint-lifecycle.md).
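For example, here's a minimal sketch of a [Transcriptions_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_Create) request body that references a custom model by its URI. The region, model GUID, and audio URL are placeholders:

```json
{
  "displayName": "My custom model transcription",
  "locale": "en-US",
  "contentUrls": [
    "https://contoso.blob.core.windows.net/audio/sample.wav"
  ],
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/<model-guid>"
  }
}
```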
With Custom Speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for [real-time speech-to-text](speech-to-text.md), [speech translation](speech-translation.md), and [batch transcription](batch-transcription.md).
Out of the box, speech recognition utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pre-trained with dialects and phonetics representing a variety of common domains. When you make a speech recognition request, the most recent base model for each [supported language](language-support.md?tabs=stt) is used by default. The base model works very well in most speech recognition scenarios.
A custom model augments the base model to improve recognition of domain-specific vocabulary in your application, using text data that you provide for training. It can also improve recognition for the specific audio conditions of your application, using audio data with reference transcriptions.
## How does it work?
With Custom Speech, you can upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint.
Here's more information about the sequence of steps shown in the previous diagram:
1. [Test recognition quality](how-to-custom-speech-inspect-data.md). Use the [Speech Studio](https://aka.ms/speechstudio/customspeech) to play back uploaded audio and inspect the speech recognition quality of your test data.
1. [Test model quantitatively](how-to-custom-speech-evaluate-data.md). Evaluate and improve the accuracy of the speech-to-text model. The Speech service provides a quantitative word error rate (WER), which you can use to determine if additional training is required.
1. [Train a model](how-to-custom-speech-train-model.md). Provide written transcripts and related text, along with the corresponding audio data. Testing a model before and after training is optional but recommended.
    > [!NOTE]
    > You pay for Custom Speech model usage and endpoint hosting, but you are not charged for training a model.
1. [Deploy a model](how-to-custom-speech-deploy-model.md). Once you're satisfied with the test results, deploy the model to a custom endpoint. With the exception of [batch transcription](batch-transcription.md), you must deploy a custom endpoint to use a Custom Speech model.
    > [!TIP]
    > A hosted deployment endpoint isn't required to use Custom Speech with the [Batch transcription API](batch-transcription.md). You can conserve resources if the custom speech model is only used for batch transcription. For more information, see [Speech service pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/).
In this article, you'll learn how to deploy an endpoint for a Custom Speech model. With the exception of [batch transcription](batch-transcription.md), you must deploy a custom endpoint to use a Custom Speech model.
> [!TIP]
> A hosted deployment endpoint isn't required to use Custom Speech with the [Batch transcription API](batch-transcription.md). You can conserve resources if the [custom speech model](how-to-custom-speech-train-model.md) is only used for batch transcription. For more information, see [Speech service pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/).
You can deploy an endpoint for a base or custom model, and then [update](#change-model-and-redeploy-endpoint) the endpoint later to use a better trained model.
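As a sketch, deploying through the REST API means creating an endpoint that references the trained model. This assumes a v3.1 Endpoints_Create operation named by analogy with the other operations linked in this article; the exact set of supported properties may differ, and the region and model GUID are placeholders:

```json
{
  "displayName": "My custom endpoint",
  "locale": "en-US",
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/<model-guid>"
  }
}
```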
In this article, you'll learn how to train a custom model to improve recognition accuracy from the Microsoft base model. The speech recognition accuracy and quality of a Custom Speech model will remain consistent, even when a new base model is released.
> [!NOTE]
> You pay for Custom Speech model usage and [endpoint hosting](how-to-custom-speech-deploy-model.md), but you are not charged for training a model.
Training a model is typically an iterative process. You will first select a base model that is the starting point for a new model. You train a model with [datasets](./how-to-custom-speech-test-and-train.md) that can include text and audio, and then you test. If the recognition quality or accuracy doesn't meet your requirements, you can create a new model with additional or modified training data, and then test again.
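For example, here's a minimal sketch of a REST request body that trains a new model from a base model and a dataset, assuming a v3.1 Models_Create operation named by analogy with the other operations linked in this article. The region and GUIDs are placeholders:

```json
{
  "displayName": "My custom model",
  "locale": "en-US",
  "baseModel": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base/<base-model-guid>"
  },
  "datasets": [
    {
      "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/datasets/<dataset-guid>"
    }
  ]
}
```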
articles/cognitive-services/Speech-Service/how-to-get-speech-session-id.md
If you use [Speech-to-text](speech-to-text.md) and need to open a support case, you are often asked to provide a *Session ID* or *Transcription ID* of the problematic transcriptions to debug the issue. This article explains how to get these IDs.
> [!NOTE]
> *Session ID* is used in [real-time speech to text](get-started-speech-to-text.md) and [speech translation](speech-translation.md).
> *Transcription ID* is used in [Batch transcription](batch-transcription.md).
## Getting Session ID
[Real-time speech to text](get-started-speech-to-text.md) and [speech translation](speech-translation.md) use either the [Speech SDK](speech-sdk.md) or the [REST API for short audio](rest-speech-to-text-short.md).
To get the Session ID when using the Speech SDK, you can subscribe to the recognizer's `SessionStarted` event, which reports the Session ID for the recognition session.
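For example, here's a minimal sketch using the C# Speech SDK. The subscription key and region are placeholders:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class Program
{
    static async Task Main()
    {
        // Placeholder key and region for illustration.
        var config = SpeechConfig.FromSubscription("<your-key>", "<your-region>");
        using var recognizer = new SpeechRecognizer(config);

        // The Session ID is reported when the recognition session starts.
        recognizer.SessionStarted += (s, e) =>
            Console.WriteLine($"SessionId: {e.SessionId}");

        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine($"Recognized: {result.Text}");
    }
}
```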
The [Batch transcription API](batch-transcription.md) is a subset of the [Speech-to-text REST API](rest-speech-to-text.md).
The required Transcription ID is the GUID value contained in the main `self` element of the Response body returned by requests, like [Transcriptions_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_Create).
The following is an example response body of a [Transcriptions_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_Create) request. The GUID value `537216f8-0620-4a10-ae2d-00bdb423b36f` in the first `self` element is the Transcription ID.
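An abbreviated sketch of such a response body follows; the actual response contains many more properties, and the model GUID is a placeholder:

```json
{
  "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/537216f8-0620-4a10-ae2d-00bdb423b36f",
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base/<model-guid>"
  },
  "displayName": "Transcription example",
  "status": "NotStarted"
}
```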
> [!NOTE]
> Use the same technique to determine different IDs required for debugging issues related to [Custom Speech](custom-speech-overview.md), such as uploading a dataset with a [Datasets_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Datasets_Create) request.
> [!NOTE]
> You can also see all existing transcriptions and their Transcription IDs for a given Speech resource by using a [Transcriptions_Get](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_Get) request.
> Language detection with custom models can only be used with real-time speech to text and speech translation. Batch transcription only supports language detection for base models.
::: zone pivot="programming-language-csharp"
This sample shows how to use language detection with a custom endpoint. If the detected language is `en-US`, then the default model is used. If the detected language is `fr-FR`, then the custom model endpoint is used. For more information, see [Deploy a Custom Speech model](how-to-custom-speech-deploy-model.md).
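Here's a minimal sketch of that configuration with the C# Speech SDK; the key, region, and endpoint ID are placeholders:

```csharp
using System;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Placeholder key and region for illustration.
var speechConfig = SpeechConfig.FromSubscription("<your-key>", "<your-region>");

var sourceLanguageConfigs = new SourceLanguageConfig[]
{
    // en-US uses the default (base) model.
    SourceLanguageConfig.FromLanguage("en-US"),
    // fr-FR uses the custom model deployed at the given endpoint ID (placeholder).
    SourceLanguageConfig.FromLanguage("fr-FR", "<your-endpoint-id>")
};
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromSourceLanguageConfigs(sourceLanguageConfigs);

using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new SpeechRecognizer(speechConfig, autoDetectSourceLanguageConfig, audioConfig);
var result = await recognizer.RecognizeOnceAsync();
Console.WriteLine($"Detected: {AutoDetectSourceLanguageResult.FromResult(result).Language}, Text: {result.Text}");
```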
To identify languages in [Batch transcription](batch-transcription.md), use the `languageIdentification` property in the body of your [transcription REST request](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_Create). The following example shows the usage of the `languageIdentification` property with four candidate languages.
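Here's a sketch of the relevant part of the request body; the audio URL is a placeholder:

```json
{
  "displayName": "Transcription with language identification",
  "locale": "en-US",
  "contentUrls": [
    "https://contoso.blob.core.windows.net/audio/sample.wav"
  ],
  "properties": {
    "languageIdentification": {
      "candidateLocales": [ "en-US", "de-DE", "es-ES", "fr-FR" ]
    }
  }
}
```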
> [!WARNING]
> Batch transcription only supports language identification for base models. If both language identification and a custom model are specified in the transcription request, the service will fall back to use the base models for the specified candidate languages. This may result in unexpected recognition results.
>
> If your speech to text scenario requires both language identification and custom models, use [real-time speech to text](#using-speech-to-text-custom-models) instead of batch transcription.
articles/cognitive-services/Speech-Service/speech-services-private-link.md
Speech-to-text has two REST APIs. Each API serves a different purpose.
The Speech-to-text REST APIs are:
- [Speech-to-text REST API](rest-speech-to-text.md), which is used for [Batch transcription](batch-transcription.md) and [Custom Speech](custom-speech-overview.md).
- [Speech-to-text REST API for short audio](rest-speech-to-text-short.md), which is used for real-time speech to text.
Usage of the Speech-to-text REST API for short audio and the Text-to-speech REST API in the private endpoint scenario is the same. It's equivalent to the [Speech SDK case](#speech-resource-with-a-custom-domain-name-and-a-private-endpoint-usage-with-the-speech-sdk) described later in this article.