Commit d900b6a

Merge pull request #210107 from eric-urban/eur/stt-3-1
[Cog Svcs] STT and batch refresh during 3.1 preview
2 parents ef464c2 + 572ba7c commit d900b6a

37 files changed, +1209 -596 lines changed

articles/cognitive-services/.openpublishing.redirection.cognitive-services.json

Lines changed: 5 additions & 0 deletions
@@ -4960,6 +4960,11 @@
 "redirect_url": "/azure/cognitive-services/speech-service/call-center-overview",
 "redirect_document_id": false
 },
+{
+"source_path_from_root": "/articles/cognitive-services/Speech-Service/rest-speech-to-text-v3-1.md",
+"redirect_url": "/azure/cognitive-services/speech-service/migrate-v3-0-to-v3-1",
+"redirect_document_id": true
+},
 {
 "source_path_from_root": "/articles/cognitive-services/text-analytics/concepts/data-limits.md",
 "redirect_url": "/azure/cognitive-services/language-service/overview",
Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
---
title: Locate audio files for batch transcription - Speech service
titleSuffix: Azure Cognitive Services
description: Batch transcription is used to transcribe a large amount of audio in storage. You should provide multiple files per request or point to an Azure Blob Storage container with the audio files to transcribe.
services: cognitive-services
manager: nitinme
author: eric-urban
ms.author: eur
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: how-to
ms.date: 09/11/2022
ms.devlang: csharp
ms.custom: devx-track-csharp
---

# Locate audio files for batch transcription

Batch transcription is used to transcribe a large amount of audio in storage. Batch transcription can read audio files from a public URI (such as "https://crbn.us/hello.wav") or a [shared access signature (SAS)](../../storage/common/storage-sas-overview.md) URI.

You should provide multiple files per request or point to an Azure Blob Storage container with the audio files to transcribe. The batch transcription service can handle a large number of submitted transcriptions. The service transcribes the files concurrently, which reduces the turnaround time.

## Supported audio formats

The batch transcription API supports the following formats:

| Format | Codec | Bits per sample | Sample rate and channels |
|--------|-------|---------|---------------------------------|
| WAV | PCM | 16-bit | 8 kHz or 16 kHz, mono or stereo |
| MP3 | PCM | 16-bit | 8 kHz or 16 kHz, mono or stereo |
| OGG | OPUS | 16-bit | 8 kHz or 16 kHz, mono or stereo |
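
As an illustration of the WAV row in the table, the following Python sketch inspects a WAV header with the standard-library `wave` module before you upload the file. The function name `wav_format_problems` is hypothetical and isn't part of any Azure SDK. The sketch checks only WAV files, not MP3 or OGG.

```python
import wave

# Values taken from the WAV row of the table above.
SUPPORTED_SAMPLE_RATES_HZ = {8000, 16000}
SUPPORTED_SAMPLE_WIDTH_BYTES = 2   # 16 bits per sample
SUPPORTED_CHANNEL_COUNTS = {1, 2}  # mono or stereo


def wav_format_problems(source):
    """Return a list of mismatches with the table; the list is empty if none are found.

    `source` is a file path or a binary file-like object containing WAV data.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != SUPPORTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"{wav.getsampwidth() * 8}-bit samples (expected 16-bit)")
        if wav.getframerate() not in SUPPORTED_SAMPLE_RATES_HZ:
            problems.append(f"{wav.getframerate()} Hz sample rate (expected 8 kHz or 16 kHz)")
        if wav.getnchannels() not in SUPPORTED_CHANNEL_COUNTS:
            problems.append(f"{wav.getnchannels()} channels (expected mono or stereo)")
    return problems
```

For example, `wav_format_problems("hello.wav")` returns an empty list when the local file `hello.wav` matches the WAV row.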

For stereo audio streams, the left and right channels are split during the transcription. A JSON result file is created for each input audio file. To create an ordered final transcript, use the timestamps that are generated per utterance.
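
The ordering step can be sketched in Python as follows. The `Utterance` record and its fields (`channel`, `offset_seconds`, `text`) are simplified, hypothetical stand-ins for whatever the result file contains. They aren't the actual result schema, so map the real fields onto them before you use this approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """Hypothetical, simplified view of one utterance from a result file."""
    channel: int           # 0 = left, 1 = right (assumed numbering)
    offset_seconds: float  # start time of the utterance
    text: str


def ordered_transcript(utterances):
    """Interleave utterances from both channels by start time."""
    # Ties on the timestamp are broken by channel so the output is deterministic.
    ordered = sorted(utterances, key=lambda u: (u.offset_seconds, u.channel))
    return [f"[{u.offset_seconds:.1f}s] channel {u.channel}: {u.text}" for u in ordered]
```

Sorting on the timestamp alone is enough here because each channel is already transcribed separately. The channel number is used only as a tie-breaker.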

## Azure Blob Storage example

You can provide individual audio files or an entire Azure Blob Storage container. You can also read or write transcription results in a container. This example shows how to transcribe audio files in [Azure Blob Storage](../../storage/blobs/storage-blobs-overview.md).

The [SAS URI](../../storage/common/storage-sas-overview.md) must have `r` (read) and `l` (list) permissions. The storage container can hold at most 5 GB of audio data and at most 10,000 blobs. The maximum size of a single blob is 2.5 GB.
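
The container limits can be checked locally before you upload anything. The following Python sketch is one way to do that. The function name `container_limit_problems` is hypothetical, the sketch assumes that 1 GB means 10^9 bytes, and it treats every blob as audio data. The service might count these values differently.

```python
# Limits stated above; the byte values assume 1 GB = 10**9 bytes.
MAX_CONTAINER_BYTES = 5_000_000_000
MAX_BLOB_BYTES = 2_500_000_000
MAX_BLOB_COUNT = 10_000


def container_limit_problems(blob_sizes):
    """Check planned uploads against the documented container limits.

    `blob_sizes` maps each blob name to its size in bytes.
    Returns a list of problems; the list is empty if every limit is met.
    """
    problems = []
    if len(blob_sizes) > MAX_BLOB_COUNT:
        problems.append(f"{len(blob_sizes)} blobs exceeds the limit of {MAX_BLOB_COUNT}")
    total_bytes = sum(blob_sizes.values())
    if total_bytes > MAX_CONTAINER_BYTES:
        problems.append(f"{total_bytes} bytes in total exceeds the 5 GB limit")
    for name, size in blob_sizes.items():
        if size > MAX_BLOB_BYTES:
            problems.append(f"{name} is {size} bytes, which exceeds the 2.5 GB limit")
    return problems
```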

Follow these steps to create a storage account, upload .wav files from your local directory to a new container, and generate a SAS URL that you can use for batch transcriptions. The commands use Windows Command Prompt syntax (`set` and `%NAME%`), so adapt them if you use a different shell.

1. Set the `RESOURCE_GROUP` environment variable to the name of an existing resource group where the new storage account will be created.

    ```azurecli-interactive
    set RESOURCE_GROUP=<your existing resource group name>
    ```

1. Set the `AZURE_STORAGE_ACCOUNT` environment variable to the name of a storage account that you want to create.

    ```azurecli-interactive
    set AZURE_STORAGE_ACCOUNT=<choose new storage account name>
    ```

1. Create a new storage account with the [`az storage account create`](/cli/azure/storage/account#az-storage-account-create) command. Replace `eastus` with the region of your resource group.

    ```azurecli-interactive
    az storage account create -n %AZURE_STORAGE_ACCOUNT% -g %RESOURCE_GROUP% -l eastus
    ```

    > [!TIP]
    > When you're finished with batch transcriptions and want to delete your storage account, use the [`az storage account delete`](/cli/azure/storage/account#az-storage-account-delete) command.

1. Get your new storage account keys with the [`az storage account keys list`](/cli/azure/storage/account#az-storage-account-keys-list) command.

    ```azurecli-interactive
    az storage account keys list -g %RESOURCE_GROUP% -n %AZURE_STORAGE_ACCOUNT%
    ```

1. Set the `AZURE_STORAGE_KEY` environment variable to one of the key values retrieved in the previous step.

    ```azurecli-interactive
    set AZURE_STORAGE_KEY=<your storage account key>
    ```

    > [!IMPORTANT]
    > The remaining steps use the `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_KEY` environment variables. If you didn't set the environment variables, you can pass the values as parameters to the commands. See the [az storage container create](/cli/azure/storage/) documentation for more information.

1. Create a container with the [`az storage container create`](/cli/azure/storage/container#az-storage-container-create) command. Replace `<mycontainer>` with a name for your container.

    ```azurecli-interactive
    az storage container create -n <mycontainer>
    ```

1. The following [`az storage blob upload-batch`](/cli/azure/storage/blob#az-storage-blob-upload-batch) command uploads all .wav files from the current local directory. Replace `<mycontainer>` with the name of your container. Optionally, you can modify the command to upload files from a different directory.

    ```azurecli-interactive
    az storage blob upload-batch -d <mycontainer> -s . --pattern *.wav
    ```

1. Generate a SAS URL with read (r) and list (l) permissions for the container with the [`az storage container generate-sas`](/cli/azure/storage/container#az-storage-container-generate-sas) command. Replace `<mycontainer>` with the name of your container, and replace `2022-09-09` with an expiry date in the future.

    ```azurecli-interactive
    az storage container generate-sas -n <mycontainer> --expiry 2022-09-09 --permissions rl --https-only
    ```

    The previous command returns a SAS token. Append the SAS token to your container URL to create a SAS URL. For example: `https://<storage_account_name>.blob.core.windows.net/<container_name>?SAS_TOKEN`.
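
Appending the token to the container URL can be sketched in Python as follows. The function name `build_container_sas_url` is hypothetical. Removing surrounding quotation marks and a leading `?` is a precaution in case the token was copied with them, not something the source requires.

```python
def build_container_sas_url(storage_account_name, container_name, sas_token):
    """Join a container URL and a SAS token into a SAS URL."""
    # Tolerate a token pasted with surrounding quotation marks or a leading "?".
    token = sas_token.strip().strip('"').lstrip("?")
    if not token:
        raise ValueError("The SAS token is empty.")
    return (
        f"https://{storage_account_name}.blob.core.windows.net/"
        f"{container_name}?{token}"
    )
```

For example, with the made-up token `sp=rl&sig=abc`, the call `build_container_sas_url("myaccount", "mycontainer", "sp=rl&sig=abc")` returns `https://myaccount.blob.core.windows.net/mycontainer?sp=rl&sig=abc`.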

You use the SAS URL when you [create a batch transcription](batch-transcription-create.md) request. For example:

```json
{
    "contentContainerUrl": "https://<storage_account_name>.blob.core.windows.net/<container_name>?SAS_TOKEN"
}
```
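
The following Python sketch produces the same JSON fragment from a SAS URL and runs a basic check first. The function name `transcription_request_body` is hypothetical. The sketch assumes that the token carries its permissions in an `sp` query parameter, and it skips the permission check when that parameter is absent. A real request likely needs more properties than this fragment shows. See the linked article for them.

```python
import json
from urllib.parse import parse_qs, urlsplit


def transcription_request_body(container_sas_url):
    """Return the JSON fragment shown above for a container SAS URL."""
    parts = urlsplit(container_sas_url)
    if parts.scheme != "https":
        raise ValueError("The SAS URL must use https.")
    if not parts.query:
        raise ValueError("The URL has no SAS token in its query string.")
    # Assumption: permissions appear in the "sp" parameter, for example "sp=rl".
    permissions = parse_qs(parts.query).get("sp")
    if permissions and not {"r", "l"} <= set(permissions[0]):
        raise ValueError("The SAS token needs read (r) and list (l) permissions.")
    return json.dumps({"contentContainerUrl": container_sas_url}, indent=4)
```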

## Next steps

- [Batch transcription overview](batch-transcription.md)
- [Create a batch transcription](batch-transcription-create.md)
- [Get batch transcription results](batch-transcription-get.md)
