Commit 66aba2a

Merge pull request #242333 from alexeyo26/alexeyo/byos-1

[Azure AI Svcs] Speech. BYOS article(s)

2 parents fdca4f8 + a822f5f

---
title: Use Bring your own storage (BYOS) Speech resource for Speech to text
titleSuffix: Azure AI services
description: Learn how to use Bring your own storage (BYOS) Speech resource with Speech to text.
services: cognitive-services
author: alexeyo26
manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: how-to
ms.date: 03/28/2023
ms.author: alexeyo
---

# Use the Bring your own storage (BYOS) Speech resource for Speech to text

Bring your own storage (BYOS) can be used in the following Speech to text scenarios:

- Batch transcription
- Real-time transcription with audio and transcription result logging enabled
- Custom Speech

A single Speech resource to Storage account pairing can be used for all scenarios simultaneously.

This article explains in depth how to use a BYOS-enabled Speech resource in all Speech to text scenarios. The article assumes that you have [a fully configured BYOS-enabled Speech resource and associated Storage account](bring-your-own-storage-speech-resource.md).

## Data storage

When using BYOS, the Speech service doesn't keep any customer artifacts after the data processing (transcription, model training, model testing) is complete. However, some metadata that isn't derived from the user content is stored within the Speech service premises. For example, in the Custom Speech scenario, the service keeps certain information about the custom endpoints, like which models they use.

The BYOS-associated Storage account stores the following data:

> [!NOTE]
> *Optional* in this section means that it's possible, but not required, to store the particular artifacts in the BYOS-associated Storage account. If needed, they can be stored elsewhere.

**Batch transcription**
- Source audio (optional)
- Batch transcription results

**Real-time transcription with audio and transcription result logging enabled**
- Audio and transcription result logs

**Custom Speech**
- Source files of datasets for model training and testing (optional)
- All data and metadata related to Custom models hosted by the BYOS-enabled Speech resource

## Batch transcription

Batch transcription is used to transcribe a large amount of audio data in storage. If you're unfamiliar with Batch transcription, see [this article](batch-transcription.md) first.

Perform these steps to execute Batch transcription with a BYOS-enabled Speech resource (a minimal code sketch of the steps follows the list):

1. Start Batch transcription as described in [this guide](batch-transcription-create.md).

    > [!IMPORTANT]
    > Don't use the `destinationContainerUrl` parameter in your transcription request. If you use BYOS, the transcription results are stored in the BYOS-associated Storage account automatically.
    >
    > If you use the `destinationContainerUrl` parameter, it works, but provides significantly less security for your data because of ad hoc SAS usage. See details [here](batch-transcription-create.md#destination-container-url).

1. When transcription is complete, get the transcription results according to [this guide](batch-transcription-get.md), or directly in the `TranscriptionData` folder of the `customspeech-artifacts` Blob container in the BYOS-associated Storage account.
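
For illustration, here's a minimal sketch of these two steps, assuming Python with the `requests` package and the [Speech to text REST API](rest-speech-to-text.md) v3.1. The key, region, and audio URL are placeholders you'd replace for your environment.

```python
import time
import requests

# Placeholders: your BYOS-enabled Speech resource key, region, and audio location.
SPEECH_KEY = "<your-speech-resource-key>"
REGION = "eastus"
BASE_URL = f"https://{REGION}.api.cognitive.microsoft.com/speechtotext/v3.1"
HEADERS = {"Ocp-Apim-Subscription-Key": SPEECH_KEY, "Content-Type": "application/json"}

# Step 1: create the transcription. destinationContainerUrl is intentionally omitted:
# with BYOS, the results land in the BYOS-associated Storage account automatically.
create = requests.post(
    f"{BASE_URL}/transcriptions",
    headers=HEADERS,
    json={
        "displayName": "BYOS sample transcription",
        "locale": "en-US",
        "contentUrls": ["https://<your-audio-source>/sample.wav"],
    },
)
create.raise_for_status()
transcription_url = create.json()["self"]

# Step 2: wait for completion, then list the result files.
while True:
    status = requests.get(transcription_url, headers=HEADERS).json()["status"]
    if status in ("Succeeded", "Failed"):
        break
    time.sleep(30)

files = requests.get(
    f"{transcription_url}/files", headers=HEADERS, params={"sasValidityInSeconds": 0}
).json()
for f in files["values"]:
    print(f["kind"], f["links"]["contentUrl"])
```

Because `sasValidityInSeconds=0` is passed when listing files, the returned `contentUrl` values are plain Storage account URLs, as described in the next section.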

### Get Batch transcription results via REST API

The [Speech to text REST API](rest-speech-to-text.md) fully supports BYOS-enabled Speech resources. However, because the data is now stored within the BYOS-associated Storage account, requests like [Get Transcription Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_ListFiles) interact with the BYOS-associated Storage account Blob storage instead of Speech service internal resources. This allows you to use the same REST API-based code for both "regular" and BYOS-enabled Speech resources.

For maximum security, use the `sasValidityInSeconds` parameter with the value set to `0` in requests that return data file URLs, like the [Get Transcription Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_ListFiles) request. Here's an example request URL:

```https
https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/3b24ca19-2eb1-4a2a-b964-35d89eca486b/files?sasValidityInSeconds=0
```

Such a request returns direct Storage account URLs to data files (without SAS or other additions). For example:

```json
"links": {
  "contentUrl": "https://<BYOS_storage_account_name>.blob.core.windows.net/customspeech-artifacts/TranscriptionData/3b24ca19-2eb1-4a2a-b964-35d89eca486b_0_0.json"
}
```

A URL of this format ensures that only Azure Active Directory identities (users, service principals, managed identities) with sufficient access rights (like the *Storage Blob Data Reader* role) can access the data from the URL.
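
As an illustration, here's a minimal sketch of reading a result file from such a URL, assuming Python with the `azure-identity` and `azure-storage-blob` packages. The URL is the placeholder from the example above, and the identity resolved by `DefaultAzureCredential` is assumed to hold the *Storage Blob Data Reader* role (or higher) on the BYOS-associated Storage account.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobClient

# Placeholder: a contentUrl returned by Get Transcription Files with sasValidityInSeconds=0.
content_url = (
    "https://<BYOS_storage_account_name>.blob.core.windows.net/"
    "customspeech-artifacts/TranscriptionData/3b24ca19-2eb1-4a2a-b964-35d89eca486b_0_0.json"
)

# Authenticate with an Azure AD identity that has the Storage Blob Data Reader role.
blob = BlobClient.from_blob_url(content_url, credential=DefaultAzureCredential())

# Download and print the transcription result JSON.
result_json = blob.download_blob().readall().decode("utf-8")
print(result_json)
```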

> [!WARNING]
> If the `sasValidityInSeconds` parameter is omitted in a [Get Transcription Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Transcriptions_ListFiles) request or similar ones, then a [User delegation SAS](../../storage/common/storage-sas-overview.md) with a validity of 30 days is generated for each data file URL returned. This SAS is signed by the system assigned managed identity of your BYOS-enabled Speech resource. Because of that, the SAS allows access to the data even if storage account key access is disabled. See details [here](../../storage/common/shared-key-authorization-prevent.md#understand-how-disallowing-shared-key-affects-sas-tokens).

## Real-time transcription with audio and transcription result logging enabled

You can enable logging for both audio input and recognized speech when using speech to text or speech translation. See the complete description [in this article](logging-audio-transcription.md).
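
As a quick illustration, here's a minimal sketch of turning on logging with the Speech SDK for Python (`azure-cognitiveservices-speech`); the key and region are placeholders, and the linked article is the authoritative guide.

```python
import azure.cognitiveservices.speech as speechsdk

# Placeholders: your BYOS-enabled Speech resource key and region.
speech_config = speechsdk.SpeechConfig(subscription="<your-speech-resource-key>", region="eastus")

# Turn on logging of the audio input and the recognition results for this session.
speech_config.enable_audio_logging()

# Recognize once from the default microphone; the session's audio and results are logged.
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
result = recognizer.recognize_once()
print(result.text)
```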

If you use BYOS, you find the logs in the `customspeech-audiologs` Blob container in the BYOS-associated Storage account.

> [!WARNING]
> Logging data is kept for 30 days. After this period the logs are automatically deleted. This is valid for BYOS-enabled Speech resources as well. If you want to keep the logs longer, copy the corresponding files and folders from the `customspeech-audiologs` Blob container directly, or use the REST API.
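
If you copy the files directly, a minimal sketch might look like the following, assuming Python with the `azure-identity` and `azure-storage-blob` packages and a hypothetical local folder named `speech-logs-archive`.

```python
import os

from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

# Placeholder: the BYOS-associated Storage account URL.
container = ContainerClient(
    account_url="https://<BYOS_storage_account_name>.blob.core.windows.net",
    container_name="customspeech-audiologs",
    credential=DefaultAzureCredential(),  # needs the Storage Blob Data Reader role
)
local_root = "speech-logs-archive"  # hypothetical local folder for the copies

# Download every log blob before the 30-day retention window deletes it.
for blob in container.list_blobs():
    target = os.path.join(local_root, blob.name)
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    with open(target, "wb") as f:
        f.write(container.download_blob(blob.name).readall())
```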

### Get real-time transcription logs via REST API

The [Speech to text REST API](rest-speech-to-text.md) fully supports BYOS-enabled Speech resources. However, because the data is now stored within the BYOS-associated Storage account, requests like [Get Base Model Logs](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Endpoints_ListBaseModelLogs) interact with the BYOS-associated Storage account Blob storage instead of Speech service internal resources. This allows you to use the same REST API-based code for both "regular" and BYOS-enabled Speech resources.

For maximum security, use the `sasValidityInSeconds` parameter with the value set to `0` in requests that return data file URLs, like the [Get Base Model Logs](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Endpoints_ListBaseModelLogs) request. Here's an example request URL:

```https
https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/endpoints/base/en-US/files/logs?sasValidityInSeconds=0
```

Such a request returns direct Storage account URLs to data files (without SAS or other additions). For example:

```json
"links": {
  "contentUrl": "https://<BYOS_storage_account_name>.blob.core.windows.net/customspeech-audiologs/be172190e1334399852185c0addee9d6/en-US/2023-07-06/152339_fcf52189-0d3f-4415-becd-5f639fd7fd6b.v2.json"
}
```

A URL of this format ensures that only Azure Active Directory identities (users, service principals, managed identities) with sufficient access rights (like the *Storage Blob Data Reader* role) can access the data from the URL.

> [!WARNING]
> If the `sasValidityInSeconds` parameter is omitted in a [Get Base Model Logs](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Endpoints_ListBaseModelLogs) request or similar ones, then a [User delegation SAS](../../storage/common/storage-sas-overview.md) with a validity of 30 days is generated for each data file URL returned. This SAS is signed by the system assigned managed identity of your BYOS-enabled Speech resource. Because of that, the SAS allows access to the data even if storage account key access is disabled. See details [here](../../storage/common/shared-key-authorization-prevent.md#understand-how-disallowing-shared-key-affects-sas-tokens).

## Custom Speech

With Custom Speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for real-time speech to text, speech translation, and batch transcription. For more information, see the [Custom Speech overview](custom-speech-overview.md).

There's nothing specific about how you use Custom Speech with a BYOS-enabled Speech resource. The only difference is where all custom model related data, which the Speech service collects and produces for you, is stored. The data is stored in the following Blob containers of the BYOS-associated Storage account:

- `customspeech-models` - Location of Custom Speech models
- `customspeech-artifacts` - Location of all other Custom Speech related data
  - Custom Speech data is located in all subfolders of the container, except for `TranscriptionData`. This subfolder contains Batch transcription results.

> [!CAUTION]
> The Speech service relies on pre-defined Blob container paths and file names for the Custom Speech module to function correctly. Don't move, rename, or in any way alter the contents of the `customspeech-models` container or the Custom Speech related folders of the `customspeech-artifacts` container.
>
> Failure to follow this guidance very likely results in hard to debug errors and may lead to the necessity of custom model retraining.
>
> Use standard tools, like the REST API and Speech Studio, to interact with the Custom Speech related data. See details in the [Custom Speech section](custom-speech-overview.md).
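
For example, here's a minimal sketch of such a read-only interaction, assuming Python with the `requests` package and the Speech to text REST API v3.1 (key and region are placeholders): it lists the custom models of the resource without touching the underlying containers.

```python
import requests

# Placeholders: your BYOS-enabled Speech resource key and region.
SPEECH_KEY = "<your-speech-resource-key>"
REGION = "eastus"

# List the resource's custom models via the REST API instead of
# reading the customspeech-models container directly.
response = requests.get(
    f"https://{REGION}.api.cognitive.microsoft.com/speechtotext/v3.1/models",
    headers={"Ocp-Apim-Subscription-Key": SPEECH_KEY},
)
response.raise_for_status()

for model in response.json().get("values", []):
    print(model["displayName"], model["self"])
```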

### Use of REST API with Custom Speech

The [Speech to text REST API](rest-speech-to-text.md) fully supports BYOS-enabled Speech resources. However, because the data is now stored within the BYOS-associated Storage account, requests like [Get Dataset Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Datasets_ListFiles) interact with the BYOS-associated Storage account Blob storage instead of Speech service internal resources. This allows you to use the same REST API-based code for both "regular" and BYOS-enabled Speech resources.

For maximum security, use the `sasValidityInSeconds` parameter with the value set to `0` in requests that return data file URLs, like the [Get Dataset Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Datasets_ListFiles) request. Here's an example request URL:

```https
https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/datasets/8427b92a-cb50-4cda-bf04-964ea1b1781b/files?sasValidityInSeconds=0
```

Such a request returns direct Storage account URLs to data files (without SAS or other additions). For example:

```json
"links": {
  "contentUrl": "https://<BYOS_storage_account_name>.blob.core.windows.net/customspeech-artifacts/AcousticData/8427b92a-cb50-4cda-bf04-964ea1b1781b/4a61ddac-5b1c-4c21-b87d-22001b0f18ab.zip"
}
```

A URL of this format ensures that only Azure Active Directory identities (users, service principals, managed identities) with sufficient access rights (like the *Storage Blob Data Reader* role) can access the data from the URL.

> [!WARNING]
> If the `sasValidityInSeconds` parameter is omitted in a [Get Dataset Files](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1/operations/Datasets_ListFiles) request or similar ones, then a [User delegation SAS](../../storage/common/storage-sas-overview.md) with a validity of 30 days is generated for each data file URL returned. This SAS is signed by the system assigned managed identity of your BYOS-enabled Speech resource. Because of that, the SAS allows access to the data even if storage account key access is disabled. See details [here](../../storage/common/shared-key-authorization-prevent.md#understand-how-disallowing-shared-key-affects-sas-tokens).

## Next steps

- [Set up the Bring your own storage (BYOS) Speech resource](bring-your-own-storage-speech-resource.md)
- [Batch transcription overview](batch-transcription.md)
- [How to log audio and transcriptions for speech recognition](logging-audio-transcription.md)
- [Custom Speech overview](custom-speech-overview.md)
