---
title: Azure Batch transcription API
description: Samples
services: cognitive-services
author: PanosPeriorellis
ms.author: panosper
---
# Batch transcription
Batch transcription is ideal if you have large amounts of audio. You can point to audio files and get back transcriptions in asynchronous mode.
## Batch transcription API
The Batch transcription API offers asynchronous speech to text transcription, along with additional features.
> [!NOTE]
23
-
> The Batch transcription API is ideal for Call Centers which typically accumulate thousands of hours of audio. The Fire & Forget philosophy of the API makes it easy to transcribe large volume of audio recordings.
23
+
> The Batch transcription API is ideal for call centers, which typically accumulate thousands of hours of audio. The API is guided by a "fire and forget" philosophy, which makes it easy to transcribe large volume of audio recordings.
### Supported formats
The Batch transcription API supports the following formats:
Name| Channel |
----|----------|
mp3 | Stereo |
wav | Mono |
wav | Stereo |
For stereo audio streams, Batch transcription splits the left and right channels during transcription. Each of the two JSON result files is created from a single channel. The timestamps per utterance enable you to create an ordered final transcript. The following JSON sample shows the output of a channel.
```json
{
    ...
}
```
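Because each channel yields its own result file, you can interleave the per-channel utterances by their timestamps to build the ordered final transcript. The following is a minimal sketch; the `Utterance` property names are illustrative assumptions, not the service's exact JSON schema:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative utterance shape. The property names (Offset, Duration,
// Transcript, Channel) are assumptions for this sketch, not the
// service's exact JSON schema.
public record Utterance(TimeSpan Offset, TimeSpan Duration, string Transcript, string Channel);

public static class TranscriptMerger
{
    // Interleave the two per-channel utterance lists by start time to
    // produce one ordered transcript.
    public static List<Utterance> Merge(
        IEnumerable<Utterance> left, IEnumerable<Utterance> right) =>
        left.Concat(right).OrderBy(u => u.Offset).ToList();
}
```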
> [!NOTE]
> The Batch transcription API uses a REST service for requesting transcriptions, their status, and associated results. You can use the API from any language. The next section describes how it is used.
## Authorization token
As with all features of the Unified Speech Service, you create a subscription key from the [Azure portal](https://portal.azure.com). In addition, you acquire an API key from the Speech portal:
1. Sign in to [Custom Speech](https://customspeech.ai).

2. Select **Subscriptions**.

3. Select **Generate API Key**.
    
4. Copy and paste that key into the client code in the following sample.
> [!NOTE]
> If you plan to use a custom model, you will need the ID of that model too. Note that this is not the deployment or endpoint ID that you find on the Endpoint Details view. It is the model ID, which you can retrieve when you select the details of that model.
## Sample code
Customize the following sample code with a subscription key and an API key. This allows you to obtain a bearer token.
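The token exchange can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's exact sample: the `issueToken` endpoint shown is the common Cognitive Services token endpoint, so verify the correct endpoint for your subscription's region.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class TokenClient
{
    // Assumed token endpoint for this sketch; confirm the endpoint for
    // your resource's region in the Azure portal.
    private const string TokenEndpoint =
        "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";

    // Exchanges the subscription key for a short-lived bearer token.
    public static async Task<string> GetBearerTokenAsync(string subscriptionKey)
    {
        using (var client = new HttpClient())
        using (var request = new HttpRequestMessage(HttpMethod.Post, TokenEndpoint))
        {
            request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
            var response = await client.SendAsync(request);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```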
After you obtain the token, you must specify the SAS URI pointing to the audio file requiring transcription. The rest of the code iterates through the status and displays results.
```cs
static async Task TranscribeAsync()
{
    // ...
}
```
> [!NOTE]
> In the preceding code, the subscription key is from the Speech (Preview) resource that you create on the Azure portal. Keys obtained from the Custom Speech Service resource do not work.
Notice the asynchronous setup for posting audio and receiving transcription status. The client is a .NET `HttpClient`. There is a `PostTranscriptions` method for sending the audio file details, and a `GetTranscriptions` method to receive the results. `PostTranscriptions` returns a handle, and `GetTranscriptions` uses this handle to obtain the transcription status.
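The post-then-poll flow can be sketched with a hypothetical client interface. The method shapes below mirror the description of `PostTranscriptions` and `GetTranscriptions`; they are assumptions for illustration, not a published library:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical shapes matching the sample's description; not a published API.
public interface ITranscriptionClient
{
    Task<Guid> PostTranscriptions(Uri audioSasUri);           // returns a handle
    Task<TranscriptionStatus> GetTranscriptions(Guid handle); // poll by handle
}

public record TranscriptionStatus(bool IsCompleted, string[] ResultUrls);

public static class Poller
{
    // Submit the audio, then poll with the returned handle until done.
    public static async Task<string[]> TranscribeAndWaitAsync(
        ITranscriptionClient client, Uri audioSasUri, TimeSpan? pollInterval = null)
    {
        var interval = pollInterval ?? TimeSpan.FromSeconds(30);
        var handle = await client.PostTranscriptions(audioSasUri);
        while (true)
        {
            var status = await client.GetTranscriptions(handle);
            if (status.IsCompleted) return status.ResultUrls;
            // Expect roughly the audio duration plus a 2-3 minute overhead overall.
            await Task.Delay(interval);
        }
    }
}
```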
The current sample code does not specify any custom models. The service uses the baseline models for transcribing the file or files. To specify custom models, you can pass the model IDs for the acoustic and the language model to the same method.
If you don't want to use the baseline models, you must pass model IDs for both acoustic and language models.
> [!NOTE]
> For baseline transcription, you don't have to declare the endpoints of the baseline models. If you want to use custom models, you provide their endpoint IDs, as shown in the [sample](https://github.com/PanosPeriorellis/Speech_Service-BatchTranscriptionAPI). If you want to use an acoustic baseline with a baseline language model, you only have to declare the custom model's endpoint ID. Microsoft detects the partner baseline model (be it acoustic or language), and uses it to fulfill the transcription request.
### Supported storage
Currently the only storage supported is Azure Blob storage.
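One way to produce the SAS URI the service needs is with the Azure Storage SDK. The sketch below assumes the newer `Azure.Storage.Blobs` package (this article's original sample predates it); the account, container, and blob names are placeholders:

```csharp
using System;
using Azure.Storage;
using Azure.Storage.Sas;

class SasExample
{
    // Builds a read-only SAS URI for a blob. Account name/key, container,
    // and blob names are placeholders for illustration.
    public static Uri MakeReadSasUri(string accountName, string accountKey)
    {
        var sasBuilder = new BlobSasBuilder
        {
            BlobContainerName = "audio",
            BlobName = "recording.wav",
            Resource = "b", // "b" = blob-level SAS
            ExpiresOn = DateTimeOffset.UtcNow.AddHours(2)
        };
        sasBuilder.SetPermissions(BlobSasPermissions.Read);

        var sasToken = sasBuilder
            .ToSasQueryParameters(new StorageSharedKeyCredential(accountName, accountKey))
            .ToString();

        return new Uri(
            $"https://{accountName}.blob.core.windows.net/audio/recording.wav?{sasToken}");
    }
}
```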
## Downloading the sample
The sample shown here is on [GitHub](https://github.com/PanosPeriorellis/Speech_Service-BatchTranscriptionAPI).
> [!NOTE]
> Typically, an audio transcription requires a time span equal to the duration of the audio file, plus a 2-3 minute overhead.