articles/ai-services/speech-service/fast-transcription-create.md
Lines changed: 5 additions & 4 deletions
@@ -25,11 +25,13 @@ Fast transcription API is used to transcribe audio files with returning results
## Prerequisites
-
- A Speech resource in one of the regions where the fast transcription API is available. The supported regions are: Australia East, Brazil South, Central India, East US, East US 2, Japan East, North Central US, North Europe, South Central US, Southeast Asia, Sweden Central, West Europe, West US, and West US 2. For more information about regions supported for other Speech service features, see [Speech service regions](./regions.md).
+
- An Azure AI Speech resource in one of the regions where the fast transcription API is available. The supported regions are: Australia East, Brazil South, Central India, East US, East US 2, Japan East, North Central US, North Europe, South Central US, Southeast Asia, Sweden Central, West Europe, West US, and West US 2. For more information about regions supported for other Speech service features, see [Speech service regions](./regions.md).
- An audio file (less than 2 hours long and less than 200 MB in size) in one of the supported formats and codecs: WAV, MP3, OPUS/OGG, FLAC, WMA, AAC, ALAW in WAV container, MULAW in WAV container, AMR, WebM, M4A, and SPEEX.
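
The prerequisites above can be checked locally before any upload. The following is a minimal Python sketch, not part of the committed article. The function name `preflight_problems`, the reading of "200 MB" as 200 MiB, and the mapping from the listed formats to file extensions are all assumptions made for illustration.

```python
from pathlib import PurePath

# Limits taken from the prerequisites list.
MAX_BYTES = 200 * 1024 * 1024   # assumption: "200 MB" is read as 200 MiB
MAX_SECONDS = 2 * 60 * 60       # "less than 2 hours long"

# Region display names exactly as listed in the prerequisites.
SUPPORTED_REGIONS = frozenset({
    "Australia East", "Brazil South", "Central India", "East US", "East US 2",
    "Japan East", "North Central US", "North Europe", "South Central US",
    "Southeast Asia", "Sweden Central", "West Europe", "West US", "West US 2",
})

# Assumption: extensions mapped by hand from the listed formats and codecs.
# The article lists formats, not extensions, so this set is only a heuristic.
ASSUMED_EXTENSIONS = frozenset({
    ".wav", ".mp3", ".opus", ".ogg", ".flac", ".wma",
    ".aac", ".amr", ".webm", ".m4a", ".spx",
})


def preflight_problems(filename, size_bytes, duration_seconds, region):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if region not in SUPPORTED_REGIONS:
        problems.append(f"region {region!r} is not in the supported list")
    if PurePath(filename).suffix.lower() not in ASSUMED_EXTENSIONS:
        problems.append(f"extension of {filename!r} is not a recognized audio format")
    if size_bytes >= MAX_BYTES:
        problems.append("file is not smaller than 200 MB")
    if duration_seconds >= MAX_SECONDS:
        problems.append("audio is not shorter than 2 hours")
    return problems
```

The caller supplies the size and duration, so the check stays independent of any particular audio library.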
## Use the fast transcription API
+
The fast transcription API is a REST API that uses multipart/form-data to submit audio files for transcription. The API returns the transcription results synchronously.
+

Construct the request body according to the following instructions:
- Set the required `inputLocales` property. This value should match the expected locale of the audio data to transcribe. The supported locales are: en-US, es-ES, es-MX, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, pt-BR, and zh-CN.
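
As a rough illustration of the instruction above, the following Python sketch builds the JSON part of the request and rejects locales outside the listed set. It is not taken from the article. The helper name `build_definition` is hypothetical, and treating `inputLocales` as a JSON array is an assumption based only on the plural property name. The endpoint URL and form field names are not shown in this excerpt, so the sketch stops short of sending a request.

```python
import json

# Locales copied from the list in the article.
SUPPORTED_LOCALES = (
    "en-US", "es-ES", "es-MX", "fr-FR", "hi-IN",
    "it-IT", "ja-JP", "ko-KR", "pt-BR", "zh-CN",
)


def build_definition(locale, word_level_timestamps=False):
    """Return the request definition as a JSON string (hypothetical helper)."""
    if locale not in SUPPORTED_LOCALES:
        raise ValueError(f"unsupported locale: {locale!r}")
    # Assumption: the service expects a list under `inputLocales`.
    definition = {"inputLocales": [locale]}
    if word_level_timestamps:
        # Property name as used elsewhere in the article.
        definition["wordLevelTimestampsEnabled"] = True
    return json.dumps(definition)
```

Validating the locale on the client gives an immediate error instead of a failed round trip to the service.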
The response will include `timestamp`, `durationInTicks`, `duration`, and more.
- The `combinedRecognizedPhrases` property contains the full transcriptions for each channel separately. For example, everything the first speaker said is in the first element of the `combinedRecognizedPhrases` array, and everything the second speaker said is in the second element of the array.
- Since we specified `wordLevelTimestampsEnabled` as `true`, the response will include word-level timestamps.
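
The per-channel layout described above could be read as follows. This is a hedged Python sketch, not the article's code. The function name `transcript_by_channel`, the sample payload, and the `text` field on each array element are assumptions, because the full response body is not visible in this excerpt.

```python
def transcript_by_channel(response):
    """Return one transcript string per channel, in array order."""
    phrases = response.get("combinedRecognizedPhrases", [])
    # Assumption: each element carries its transcript in a "text" field.
    return [phrase.get("text", "") for phrase in phrases]


# Hypothetical two-channel response, invented purely for illustration.
sample_response = {
    "combinedRecognizedPhrases": [
        {"text": "Hello, how can I help?"},
        {"text": "My order is late."},
    ]
}

for index, text in enumerate(transcript_by_channel(sample_response)):
    print(f"channel {index}: {text}")
```

Index 0 corresponds to the first speaker's channel and index 1 to the second, matching the description of the `combinedRecognizedPhrases` array.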
-
-
```json
{
@@ -204,7 +205,7 @@ You can compare transcription results with the [speech to text real-time API](./
- The real-time API is limited to 60 seconds of audio. The fast transcription API is designed for longer audio files and returns results much faster than real-time audio.
- The real-time API doesn't support channel separation. The fast transcription API supports channel separation and returns results for each channel separately.
-
Here's an example request:
+
Here's an example transcription request using the [speech to text real-time API](./rest-speech-to-text-short.md).
- Replace `YourSubscriptionKey` with your Speech resource key.
- Replace `YourServiceRegion` with your Speech resource region.
@@ -218,7 +219,7 @@ curl --location --request POST \
--data-binary YourAudioFile
```
-
Here's an example transcription response using the [speech to text real-time API](./rest-speech-to-text-short.md). Only the first 60 seconds of the provided audio file is transcribed to text.
+
Here's an example response. Only the first 60 seconds of the provided audio file is transcribed to text.
articles/ai-services/speech-service/overview.md
Lines changed: 12 additions & 0 deletions
@@ -58,6 +58,18 @@ With [real-time speech to text](get-started-speech-to-text.md), the audio is tra
- Dictation
- Voice agents
+
## Fast transcription API (Preview)
+

+
Fast transcription API transcribes audio files and returns results synchronously, much faster than real-time audio. Use fast transcription in scenarios where you need the transcript of an audio recording as quickly as possible with predictable latency, such as:
+

+
- Quick audio or video transcription, subtitles, and editing
+
- Video dubbing
+

+
> [!NOTE]
+
> Fast transcription API is only available via the speech to text REST API version 3.3.
+

+
To get started with fast transcription, see [use the fast transcription API (preview)](fast-transcription-create.md).
+

### Batch transcription
[Batch transcription](batch-transcription.md) is used to transcribe a large amount of audio in storage. You can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcription results. Use batch transcription for applications that need to transcribe audio in bulk such as: