
Commit 61b4a03

batch transcription version 2024-11-15
1 parent 8ea9235 commit 61b4a03

10 files changed: +162 -160 lines

articles/ai-services/speech-service/batch-transcription-audio-data.md

Lines changed: 8 additions & 9 deletions
@@ -7,7 +7,7 @@ author: eric-urban
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 3/10/2025
+ms.date: 5/25/2025
 ms.devlang: csharp
 ms.custom: devx-track-csharp, devx-track-azurecli
 # Customer intent: As a user who implements audio transcription, I want to learn how to locate audio files for batch transcription.
@@ -27,7 +27,7 @@ You can specify one or multiple audio files when creating a transcription. We re
 
 ## Supported audio formats and codecs
 
-The batch transcription API (and [fast transcription API](./fast-transcription-create.md)) supports multiple formats and codecs, such as:
+The [batch transcription API](./batch-transcription.md) and [fast transcription API](./fast-transcription-create.md) support multiple formats and codecs, such as:
 
 - WAV
 - MP3
@@ -41,11 +41,10 @@ The batch transcription API (and [fast transcription API](./fast-transcription-c
 - WebM
 - SPEEX
 
-
 > [!NOTE]
-> Batch transcription service integrates [GStreamer](./how-to-use-codec-compressed-audio-input-streams.md) and might accept more formats and codecs without returning errors. We suggest to use lossless formats such as WAV (PCM encoding) and FLAC to ensure best transcription quality.
+> Batch transcription service integrates [GStreamer](./how-to-use-codec-compressed-audio-input-streams.md) and might accept more formats and codecs without returning errors. We suggest using lossless formats such as WAV (PCM encoding) and FLAC to ensure best transcription quality.
 
-## Azure Blob Storage upload
+## Upload to Azure Blob Storage
 
 When audio files are located in an [Azure Blob Storage](/azure/storage/blobs/storage-blobs-overview) account, you can request transcription of individual audio files or an entire Azure Blob Storage container. You can also [write transcription results](batch-transcription-create.md#specify-a-destination-container-url) to a Blob container.

@@ -89,7 +88,7 @@ Follow these steps to create a storage account and upload wav files from your lo
 ```
 
 > [!TIP]
-> When you are finished with batch transcriptions and want to delete your storage account, use the [`az storage delete create`](/cli/azure/storage/account#az-storage-account-delete) command.
+> When you're finished with batch transcriptions and want to delete your storage account, use the [`az storage account delete`](/cli/azure/storage/account#az-storage-account-delete) command.
 
 1. Get your new storage account keys with the [`az storage account keys list`](/cli/azure/storage/account#az-storage-account-keys-list) command.
 
@@ -125,7 +124,7 @@ Follow these steps to create a storage account and upload wav files from your lo
 This section explains how to set up and limit access to your batch transcription source audio files in an Azure Storage account using the [trusted Azure services security mechanism](/azure/storage/common/storage-network-security#trusted-access-based-on-a-managed-identity).
 
 > [!NOTE]
-> With the trusted Azure services security mechanism, you need to use [Azure Blob storage](/azure/storage/blobs/storage-blobs-overview) to store audio files. Usage of [Azure Files](/azure/storage/files/storage-files-introduction) is not supported.
+> With the trusted Azure services security mechanism, you need to use [Azure Blob storage](/azure/storage/blobs/storage-blobs-overview) to store audio files. Usage of [Azure Files](/azure/storage/files/storage-files-introduction) isn't supported.
 
 If you perform all actions in this section, your Storage account is configured as follows:
 - Access to all external network traffic is prohibited.
@@ -288,9 +287,9 @@ You could otherwise specify individual files in the container. You must generate
 }
 ```
 
-## Next steps
+## Related content
 
-- [Batch transcription overview](batch-transcription.md)
+- [Learn more about batch transcription](batch-transcription.md)
 - [Create a batch transcription](batch-transcription-create.md)
 - [Get batch transcription results](batch-transcription-get.md)
 - [See batch transcription code samples at GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch/)
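The article this file's diff belongs to is about pointing batch transcription at audio in Blob Storage. As a hedged sketch of the request body it builds up to — individual files listed by SAS URL via `contentUrls` — something like the following could work (the SAS URL is a placeholder, and `displayName` is an arbitrary label):

```python
import json

def build_batch_request(sas_urls, locale="en-US", name="My transcription"):
    """Assemble a batch transcription request body that lists individual
    audio files by SAS URL (contentUrls), rather than pointing at an
    entire container (contentContainerUrl)."""
    return {
        "contentUrls": list(sas_urls),
        "locale": locale,
        "displayName": name,
    }

# Placeholder SAS URL; a real one carries sv/sig query parameters.
body = build_batch_request(["https://contoso.blob.core.windows.net/audio/a1.wav"])
print(json.dumps(body, indent=2))
```

The same body shape is what the "destination container URL" and SAS discussion above feeds into.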

articles/ai-services/speech-service/batch-transcription-create.md

Lines changed: 59 additions & 54 deletions
Large diffs are not rendered by default.

articles/ai-services/speech-service/batch-transcription-get.md

Lines changed: 56 additions & 55 deletions
Large diffs are not rendered by default.

articles/ai-services/speech-service/batch-transcription.md

Lines changed: 3 additions & 6 deletions
@@ -7,16 +7,13 @@ author: eric-urban
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 3/10/2025
+ms.date: 5/25/2025
 ms.devlang: csharp
 ms.custom: devx-track-csharp
 ---
 
 # What is batch transcription?
 
-> [!IMPORTANT]
-> New pricing is in effect for batch transcription via [Speech to text REST API v3.2](./migrate-v3-1-to-v3-2.md). For more information, see the [pricing guide](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services).
-
 Batch transcription is used to transcribe a large amount of audio data in storage. Both the [Speech to text REST API](rest-speech-to-text.md#batch-transcription) and [Speech CLI](spx-basics.md) support batch transcription.
 
 You should provide multiple files per request or point to an Azure Blob Storage container with the audio files to transcribe. The batch transcription service can handle a large number of submitted transcriptions. The service transcribes the files concurrently, which reduces the turnaround time.
@@ -35,9 +32,9 @@ To use the batch transcription REST API:
 1. [Get batch transcription results](batch-transcription-get.md) - Check transcription status and retrieve transcription results asynchronously.
 
 > [!IMPORTANT]
-> Batch transcription jobs are scheduled on a best-effort basis. At peak hours it may take up to 30 minutes or longer for a transcription job to start processing. See how to check the current status of a batch transcription job in [this section](batch-transcription-get.md#get-transcription-status).
+> Batch transcription jobs are scheduled on a best-effort basis. At peak hours it might take up to 30 minutes or longer for a transcription job to start processing. See how to check the current status of a batch transcription job in [this section](batch-transcription-get.md#get-transcription-status).
 
-## Next steps
+## Related content
 
 - [Locate audio files for batch transcription](batch-transcription-audio-data.md)
 - [Create a batch transcription](batch-transcription-create.md)
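Since jobs are scheduled on a best-effort basis, as the IMPORTANT note in this hunk says, callers typically poll the transcription status until it reaches a terminal state. A minimal polling sketch, with the HTTP status call abstracted behind a callable so it can be stubbed (status names follow the Speech to text REST API; interval and attempt limits are arbitrary choices):

```python
import time

TERMINAL = {"Succeeded", "Failed"}

def wait_for_transcription(fetch_status, poll_seconds=30, max_polls=120):
    """Poll a batch transcription until it reaches a terminal state.
    fetch_status is any callable returning the job's current status
    string ("NotStarted", "Running", "Succeeded", or "Failed")."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("transcription did not finish in time")

# Stubbed example: a job observed twice before succeeding.
states = iter(["NotStarted", "Running", "Succeeded"])
print(wait_for_transcription(lambda: next(states), poll_seconds=0))
```

In a real client, `fetch_status` would issue the GET described in the "Get batch transcription results" article linked above.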

articles/ai-services/speech-service/fast-transcription-create.md

Lines changed: 7 additions & 7 deletions
@@ -7,7 +7,7 @@ author: eric-urban
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 5/4/2025
+ms.date: 5/25/2025
 # Customer intent: As a user who implements audio transcription, I want create transcriptions as quickly as possible.
 ---

@@ -31,10 +31,7 @@ Unlike the batch transcription API, fast transcription API only produces transcr
 > [!TIP]
 > Try out fast transcription in the [Azure AI Foundry portal](https://aka.ms/fasttranscription/studio).
 
-> [!NOTE]
-> Speech service is an elastic service. If you receive 429 error code (too many requests), please follow the [best practices to mitigate throttling during autoscaling](speech-services-quotas-and-limits.md#general-best-practices-to-mitigate-throttling-during-autoscaling).
-
-We learn how to use the fast transcription API (via [Transcriptions - Transcribe](https://go.microsoft.com/fwlink/?linkid=2296107)) with the following scenarios:
+We learn how to use the fast transcription API (via [Transcriptions - Transcribe](/rest/api/speechtotext/transcriptions/transcribe)) with the following scenarios:
 - [Known locale specified](?tabs=locale-specified): Transcribe an audio file with a specified locale. If you know the locale of the audio file, you can specify it to improve transcription accuracy and minimize the latency.
 - [Language identification on](?tabs=language-identification-on): Transcribe an audio file with language identification on. If you're not sure about the locale of the audio file, you can turn on language identification to let the Speech service identify the locale (one locale per audio).
 - [Multi-lingual transcription (preview)](?tabs=multilingual-transcription-on): Transcribe an audio file with the latest multi-lingual speech transcription model. If your audio contains multi-lingual contents that you want to transcribe continuously and accurately, you can use the latest multi-lingual speech transcription model without specifying the locale codes.
@@ -1722,9 +1719,12 @@ The response includes `durationMilliseconds`, `offsetMilliseconds`, and more. Th
 ```
 ---
 
+> [!NOTE]
+> Speech service is an elastic service. If you receive 429 error code (too many requests), please follow the [best practices to mitigate throttling during autoscaling](speech-services-quotas-and-limits.md#general-best-practices-to-mitigate-throttling-during-autoscaling).
+
 ## Request configuration options
 
-Here are some property options to configure a transcription when you call the [Transcriptions - Transcribe](https://go.microsoft.com/fwlink/?linkid=2296107) operation.
+Here are some property options to configure a transcription when you call the [Transcriptions - Transcribe](/rest/api/speechtotext/transcriptions/transcribe) operation.
 
 | Property | Description | Required or optional |
 |----------|-------------|----------------------|
@@ -1735,6 +1735,6 @@ Here are some property options to configure a transcription when you call the [T
 
 ## Related content
 
-- [Fast transcription REST API reference](https://go.microsoft.com/fwlink/?linkid=2296107)
+- [Fast transcription REST API reference](/rest/api/speechtotext/transcriptions/transcribe)
 - [Speech to text supported languages](./language-support.md?tabs=stt)
 - [Batch transcription](./batch-transcription.md)
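The 429 note that this commit moves down the page pairs naturally with a retry strategy. One hedged sketch of exponential backoff, with the actual Transcribe call stubbed out as a callable (attempt counts and delays are illustrative, not service guidance):

```python
import time

def with_backoff(send, max_attempts=5, base_delay=1.0):
    """Retry a request-sending callable when it reports HTTP 429,
    doubling the wait between attempts (exponential backoff)."""
    for attempt in range(max_attempts):
        status, payload = send()
        if status != 429:
            return status, payload
        time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("still throttled after retries")

# Stubbed example: throttled twice, then accepted.
responses = iter([(429, None), (429, None), (200, {"text": "hello"})])
print(with_backoff(lambda: next(responses), base_delay=0))
```

In practice `send` would POST the multipart request to the Transcriptions - Transcribe operation; honoring any `Retry-After` header the service returns would be a reasonable refinement.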

articles/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ manager: nitinme
 ms.author: eur
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 5/19/2025
+ms.date: 5/25/2025
 ms.reviewer: heikora
 zone_pivot_groups: foundry-speech-studio-cli-rest
 #Customer intent: As a developer, I want to understand the lifecycle of custom speech models and endpoints so that I can plan for the expiration of my models.
@@ -43,7 +43,7 @@ When a custom model or base model expires, it's no longer available for transcri
 |Transcription route |Expired model result |Recommendation |
 |---------|---------|---------|
 |Custom endpoint|Speech recognition requests fall back to the most recent base model for the same [locale](language-support.md?tabs=stt). You get results, but recognition might not accurately transcribe your domain data. |Update the endpoint's model as described in the [Deploy a custom speech model](how-to-custom-speech-deploy-model.md) guide. |
-|Batch transcription |[Batch transcription](batch-transcription.md) requests for expired models fail with a 4xx error. |In each [Transcriptions_Create](/rest/api/speechtotext/transcriptions/create) REST API request body, set the `model` property to a base model or custom model that isn't expired. Otherwise don't include the `model` property to always use the latest base model. |
+|Batch transcription |[Batch transcription](batch-transcription.md) requests for expired models fail with a 4xx error. |In each [Transcriptions - Submit](/rest/api/speechtotext/transcriptions/submit) REST API request body, set the `model` property to a base model or custom model that isn't expired. Otherwise don't include the `model` property to always use the latest base model. |
 
 ## Get base model expiration dates
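The table's recommendation — pin the `model` property to a non-expired model, or omit the property to always use the latest base model — might be encoded like this (a hedged sketch: the model URL is a placeholder, and referencing the model by its self-URL is an assumption about the request shape):

```python
def transcription_body(content_urls, locale, model_url=None):
    """Build a Transcriptions - Submit request body. Passing a model
    self-URL pins the job to that (non-expired) model; omitting it
    lets the service pick the latest base model for the locale."""
    body = {
        "contentUrls": list(content_urls),
        "locale": locale,
        "displayName": "expiration-safe job",
    }
    if model_url is not None:
        # The model is referenced by URL; placeholder value here.
        body["model"] = {"self": model_url}
    return body

pinned = transcription_body(["https://example/audio.wav"], "en-US",
                            model_url="https://example/models/base/123")
latest = transcription_body(["https://example/audio.wav"], "en-US")
print("model" in pinned, "model" in latest)  # True False
```

Omitting `model` is the lower-maintenance option when you don't depend on a custom model.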

articles/ai-services/speech-service/how-to-get-speech-session-id.md

Lines changed: 5 additions & 5 deletions
@@ -7,7 +7,7 @@ ms.author: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 3/10/2025
+ms.date: 5/25/2025
 ms.reviewer: alexeyo
 #Customer intent: As a developer, I need to know how to get the session ID and transcription ID for speech to text so that I can debug issues with my application.
 ---
@@ -91,11 +91,11 @@ https://eastus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiv
 
 ## Getting Transcription ID for Batch transcription
 
-[Batch transcription API](batch-transcription.md) is a subset of the [Speech to text REST API](rest-speech-to-text.md).
+[Batch transcription API](batch-transcription.md) is part of the [Speech to text REST API](rest-speech-to-text.md).
 
-The required Transcription ID is the GUID value contained in the main `self` element of the Response body returned by requests, like [Transcriptions_Create](/rest/api/speechtotext/transcriptions/create).
+The required Transcription ID is the GUID value contained in the main `self` element of the response body returned by requests, like [Transcriptions - Submit](/rest/api/speechtotext/transcriptions/submit).
 
-The following is and example response body of a [Transcriptions_Create](/rest/api/speechtotext/transcriptions/create) request. GUID value `537216f8-0620-4a10-ae2d-00bdb423b36f` found in the first `self` element is the Transcription ID.
+The following is an example response body of a [Transcriptions - Submit](/rest/api/speechtotext/transcriptions/submit) request. The GUID value `537216f8-0620-4a10-ae2d-00bdb423b36f` found in the first `self` element is the Transcription ID.
 
 ```json
 {
@@ -127,4 +127,4 @@ The following is and example response body of a [Transcriptions_Create](/rest/ap
 > Use the same technique to determine different IDs required for debugging issues related to [custom speech](custom-speech-overview.md), like uploading a dataset using [Datasets_Create](/rest/api/speechtotext/datasets/create) request.
 
 > [!NOTE]
-> You can also see all existing transcriptions and their Transcription IDs for a given Speech resource by using [Transcriptions_Get](/rest/api/speechtotext/transcriptions/get) request.
+> You can also see all existing transcriptions and their Transcription IDs for a given Speech resource by using a [Transcriptions - Get](/rest/api/speechtotext/transcriptions/get) request.
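The ID-extraction step this article describes can be sketched in a few lines — the Transcription ID is simply the last path segment of the top-level `self` URL (the host and version segment in this sample are illustrative; the GUID is the one from the article's own example):

```python
import json

# Trimmed sample response; a real body carries many more fields.
response_body = json.loads("""
{
  "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/transcriptions/537216f8-0620-4a10-ae2d-00bdb423b36f"
}
""")

def transcription_id(body):
    """The transcription ID is the GUID at the end of the top-level
    `self` URL in the response body."""
    return body["self"].rstrip("/").rsplit("/", 1)[-1]

print(transcription_id(response_body))  # 537216f8-0620-4a10-ae2d-00bdb423b36f
```

The same slicing works for the dataset and other resource IDs mentioned in the TIP above, since they follow the same `self`-URL pattern.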

articles/ai-services/speech-service/language-identification.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ manager: nitinme
 ms.service: azure-ai-speech
 ms.custom: devx-track-extended-java, devx-track-js, devx-track-python
 ms.topic: how-to
-ms.date: 3/10/2025
+ms.date: 5/25/2025
 ms.author: eur
 zone_pivot_groups: programming-languages-speech-services-nomore-variant
 #customer intent: As an application developer, I want to use language recognition or translations in order to make my apps work seamlessly for more customers.
@@ -1075,7 +1075,7 @@ For more information about containers, see the [language identification speech c
 
 ## Implement speech to text batch transcription
 
-To identify languages with [Batch transcription REST API](batch-transcription.md), use `languageIdentification` property in the body of your [Transcriptions_Create](/rest/api/speechtotext/transcriptions/create) request.
+To identify languages with the [Batch transcription REST API](batch-transcription.md), use the `languageIdentification` property in the body of your [Transcriptions - Submit](/rest/api/speechtotext/transcriptions/submit) request.
 
 > [!WARNING]
 > Batch transcription only supports language identification for default base models. If both language identification and a custom model are specified in the transcription request, the service falls back to use the base models for the specified candidate languages. This might result in unexpected recognition results.
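As an illustration of the `languageIdentification` property this hunk refers to, a Transcriptions - Submit request body with candidate locales could look like this (a hedged sketch: the blob URL is a placeholder, and note that, per the warning, no custom model is set):

```python
import json

# languageIdentification lives under `properties`; candidateLocales lists
# the locales the service may pick between (one locale per audio file).
body = {
    "contentUrls": ["https://contoso.blob.core.windows.net/audio/call.wav"],
    "locale": "en-US",
    "displayName": "batch job with language identification",
    "properties": {
        "languageIdentification": {
            "candidateLocales": ["en-US", "de-DE", "es-ES"],
        },
    },
}
print(json.dumps(body["properties"], indent=2))
```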

articles/ai-services/speech-service/rest-speech-to-text-short.md

Lines changed: 6 additions & 6 deletions
@@ -6,7 +6,7 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 3/10/2025
+ms.date: 5/25/2025
 ms.author: eur
 ms.devlang: csharp
 ms.custom: devx-track-csharp
@@ -15,13 +15,13 @@ ms.custom: devx-track-csharp
 
 # Speech to text REST API for short audio
 
-Use cases for the Speech to text REST API for short audio are limited. Use it only in cases where you can't use the [Speech SDK](speech-sdk.md).
+Use cases for the Speech to text REST API for short audio are limited. Use it only in cases where you can't use the [Speech SDK](speech-sdk.md) or [fast transcription API](fast-transcription-create.md).
 
 Before you use the Speech to text REST API for short audio, consider the following limitations:
 
 * Requests that use the REST API for short audio and transmit audio directly can contain no more than 60 seconds of audio. For pronunciation assessment, the audio duration should be no more than 30 seconds. The input [audio formats](#audio-formats) are more limited compared to the [Speech SDK](speech-sdk.md).
 * The REST API for short audio returns only final results. It doesn't provide partial results.
-* [Speech translation](speech-translation.md) isn't supported via REST API for short audio. You need to use [Speech SDK](speech-sdk.md).
+* [Speech translation](speech-translation.md) isn't supported via REST API for short audio. You need to use the [Speech SDK](speech-sdk.md).
 * [Batch transcription](batch-transcription.md) and [custom speech](custom-speech-overview.md) aren't supported via REST API for short audio. You should always use the [Speech to text REST API](rest-speech-to-text.md) for batch transcription and custom speech.
 
 Before you use the Speech to text REST API for short audio, understand that you need to complete a token exchange as part of authentication to access the service. For more information, see [Authentication](#authentication).
@@ -49,7 +49,7 @@ Audio is sent in the body of the HTTP `POST` request. It must be in one of the f
 | OGG | OPUS | 256 kbps | 16 kHz, mono |
 
 > [!NOTE]
-> The preceding formats are supported through the REST API for short audio and WebSocket in the Speech service. The [Speech SDK](speech-sdk.md) supports the WAV format with PCM codec as well as [other formats](how-to-use-codec-compressed-audio-input-streams.md).
+> The preceding formats are supported through the REST API for short audio and WebSockets in the Speech service. The [Speech SDK](speech-sdk.md) supports the WAV format with PCM codec as well as [other formats](how-to-use-codec-compressed-audio-input-streams.md).
 
 ## Request headers

@@ -77,7 +77,6 @@ These parameters might be included in the query string of the REST request.
 | `language` | Identifies the spoken language that's being recognized. See [Supported languages](language-support.md?tabs=stt). | Required |
 | `format` | Specifies the result format. Accepted values are `simple` and `detailed`. Simple results include `RecognitionStatus`, `DisplayText`, `Offset`, and `Duration`. Detailed responses include four different representations of display text. The default setting is `simple`. | Optional |
 | `profanity` | Specifies how to handle profanity in recognition results. Accepted values are: <br><br>`masked`, which replaces profanity with asterisks. <br>`removed`, which removes all profanity from the result. <br>`raw`, which includes profanity in the result. <br><br>The default setting is `masked`. | Optional |
-| `cid` | When you're using the [Speech Studio](speech-studio-overview.md) to create [custom models](./custom-speech-overview.md), you can take advantage of the **Endpoint ID** value from the **Deployment** page. Use the **Endpoint ID** value as the argument to the `cid` query string parameter. | Optional |
 
 ### Pronunciation assessment parameters
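Combining the query parameters in this table with the short-audio endpoint shown earlier on the page, a URL builder might look like the following (the region and the default values chosen here are illustrative; `language`, `format`, and `profanity` are the parameters from the table):

```python
from urllib.parse import urlencode

def short_audio_url(region, language, fmt="detailed", profanity="masked"):
    """Compose the short-audio recognition endpoint URL with the query
    parameters from the table: language (required), format and
    profanity (optional)."""
    base = (f"https://{region}.stt.speech.microsoft.com"
            "/speech/recognition/conversation/cognitiveservices/v1")
    query = urlencode({"language": language, "format": fmt,
                       "profanity": profanity})
    return f"{base}?{query}"

print(short_audio_url("eastus", "en-US"))
```

The audio payload and authentication headers would then be supplied on the `POST` request itself, as described in the surrounding sections.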

@@ -360,7 +359,8 @@ using (var fs = new FileStream(audioFile, FileMode.Open, FileAccess.Read))
 
 [!INCLUDE [](includes/cognitive-services-speech-service-rest-auth.md)]
 
-## Next steps
+## Related content
 
+- [Fast transcription API](fast-transcription-create.md)
 - [Customize speech models](./how-to-custom-speech-train-model.md)
 - [Get familiar with batch transcription](batch-transcription.md)

0 commit comments
