Commit b1c862c

new STT API version and fast transcription
1 parent 1a7f885 commit b1c862c

9 files changed

+1073
-124
lines changed

articles/ai-services/speech-service/fast-transcription-create.md

Lines changed: 931 additions & 92 deletions
Large diffs are not rendered by default.

articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md

Lines changed: 17 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -2,10 +2,24 @@
22
author: eric-urban
33
ms.service: azure-ai-speech
44
ms.topic: include
5-
ms.date: 7/12/2024
5+
ms.date: 11/12/2024
66
ms.author: eur
77
---
88

9+
10+
### November 2024 release
11+
12+
#### Speech to text REST API version 2024-11-15
13+
14+
The speech to text REST API version 2024-11-15 is released for general availability. For more information, see the [speech to text REST API reference documentation](https://go.microsoft.com/fwlink/?linkid=2296107) and the [Speech to text REST API guide](../../rest-speech-to-text.md).
15+
16+
> [!NOTE]
17+
> The speech to text REST API version 2024-05-15-preview is deprecated.
18+
19+
#### Fast transcription (GA)
20+
21+
Fast transcription is now generally available via [speech to text REST API version 2024-11-15](https://go.microsoft.com/fwlink/?linkid=2296107). Fast transcription allows you to transcribe audio files to text accurately and synchronously, with a high speed factor. It can transcribe audio much faster than the actual audio duration. For more information, see the [fast transcription API guide](../../fast-transcription-create.md).
22+
923
### October 2024 release
1024

1125
#### Video translation (Preview)
@@ -14,7 +28,6 @@ The video translation API is now available in public preview. For more informati
1428

1529
### September 2024 release
1630

17-
1831
#### Real-time speech to text
1932

2033
[Real-time speech to text](../../how-to-recognize-speech.md) has released new models, with better quality, for the following languages.
@@ -76,7 +89,7 @@ Speech [pronunciation assessment](../../how-to-pronunciation-assessment.md) now
7689

7790
#### Fast Transcription API (Preview)
7891

79-
Fast transcription is now available in public preview. Fast transcription allows you to transcribe audio file to text accurately and synchronously, with a high speed factor. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](../../fast-transcription-create.md).
92+
Fast transcription is now available in public preview. Fast transcription allows you to transcribe audio files to text accurately and synchronously, with a high speed factor. It can transcribe audio much faster than the actual audio duration. For more information, see the [fast transcription API guide](../../fast-transcription-create.md).
8093

8194
> [!TIP]
8295
> Try out fast transcription in [Azure AI Studio](https://aka.ms/fasttranscription/studio).
@@ -88,7 +101,7 @@ Fast transcription is now available in public preview. Fast transcription allows
88101
The Speech to text REST API version 3.2 is now generally available. For more information about speech to text REST API v3.2, see the [Speech to text REST API v3.2 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.2&preserve-view=true) and the [Speech to text REST API guide](../../rest-speech-to-text.md).
89102

90103
> [!NOTE]
91-
> Preview versions *3.2-preview.1* and *3.2-preview.2* will be removed in September 2024.
104+
> Preview versions *3.2-preview.1* and *3.2-preview.2* are retired as of September 2024.
92105
93106
[Speech to text REST API](../../rest-speech-to-text.md) v3.1 will be retired on a date to be announced. Speech to text REST API v3.0 will be retired on April 1st, 2026. For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](../../migrate-v3-0-to-v3-1.md) and [v3.1 to v3.2](../../migrate-v3-1-to-v3-2.md) migration guides.
94107

articles/ai-services/speech-service/index.yml

Lines changed: 3 additions & 3 deletions
@@ -12,7 +12,7 @@ metadata:
1212
manager: nitinme
1313
ms.service: azure-ai-speech
1414
ms.topic: hub-page
15-
ms.date: 8/20/2024
15+
ms.date: 11/12/2024
1616
ms.author: eur
1717

1818
highlightedContent:
@@ -116,8 +116,8 @@ conceptualContent:
116116
text: Migrate to neural voice
117117
url: migration-overview-neural-voice.md
118118
- itemType: how-to-guide
119-
text: Migrate to the v3.2 REST API
120-
url: migrate-v3-1-to-v3-2.md
119+
text: Migrate to speech to text REST API version 2024-11-15
120+
url: migrate-2024-11-15.md
121121
- itemType: how-to-guide
122122
text: Migrate to Batch synthesis REST API
123123
url: migrate-to-batch-synthesis.md

articles/ai-services/speech-service/migrate-2024-11-15.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
1+
---
2+
title: Migrate code from v3.2 to version 2024-11-15 - Speech service
3+
titleSuffix: Azure AI services
4+
description: This document helps developers migrate code from v3.2 to version 2024-11-15 of the Speech to text REST API.
5+
author: eric-urban
6+
ms.author: eur
7+
manager: nitinme
8+
ms.service: azure-ai-speech
9+
ms.topic: how-to
10+
ms.date: 11/12/2024
11+
#Customer intent: As a developer, I want to migrate code from v3.2 to version 2024-11-15 of the Speech to text REST API.
12+
---
13+
14+
# Migrate code from v3.2 to version 2024-11-15
15+
16+
The Speech to text REST API is used for [fast transcription](./fast-transcription-create.md), [batch transcription](batch-transcription.md), and [custom speech](custom-speech-overview.md). This article describes changes from version 3.2 to version 2024-11-15.
17+
18+
> [!IMPORTANT]
19+
> Speech to text REST API version `2024-11-15` is the latest version that's generally available.
20+
> - [Speech to text REST API](rest-speech-to-text.md) version `2024-05-15-preview` will be retired on a date to be announced.
21+
> - Speech to text REST API `v3.0`, `v3.1`, `v3.2`, `3.2-preview.1`, and `3.2-preview.2` will be retired on April 1st, 2026.
22+
>
23+
> For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](migrate-v3-0-to-v3-1.md), [v3.1 to v3.2](migrate-v3-1-to-v3-2.md), and [v3.2 to 2024-11-15](migrate-2024-11-15.md) migration guides.
24+
25+
## Base path
26+
27+
The custom speech API switched from a path-based versioning scheme to a query-parameter-based scheme, in alignment with general Azure API versioning schemes. This change requires an update to the base path: change the path from `/speechtotext/v3.2` to `/speechtotext`, and append the API version as `?api-version=2024-11-15` to all requests.
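
A minimal sketch of the base path change, using only the Python standard library. The helper name `migrate_url` and the host in the usage note are hypothetical and chosen for illustration; only the `/speechtotext/v3.2` prefix, the `/speechtotext` prefix, and the `api-version=2024-11-15` query parameter come from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

OLD_PREFIX = "/speechtotext/v3.2"
NEW_PREFIX = "/speechtotext"
API_VERSION = "2024-11-15"


def migrate_url(url: str) -> str:
    """Rewrite a v3.2 request URL to the query-parameter versioning scheme."""
    parts = urlsplit(url)
    if parts.path != OLD_PREFIX and not parts.path.startswith(OLD_PREFIX + "/"):
        raise ValueError(f"not a v3.2 speech to text URL: {url}")

    # Drop the version segment from the path.
    path = NEW_PREFIX + parts.path[len(OLD_PREFIX):]

    # Keep existing query parameters, replacing any stale api-version value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "api-version"
    ]
    query.append(("api-version", API_VERSION))

    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query), parts.fragment)
    )
```

For example, `migrate_url("https://example.invalid/speechtotext/v3.2/transcriptions?top=10")` returns `https://example.invalid/speechtotext/transcriptions?top=10&api-version=2024-11-15`.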
28+
29+
## Datasets
30+
31+
The `email` property and the connected email notification process are removed from the API.
32+
33+
The `duration` property in dataset responses is renamed to `durationMilliseconds` and is now a plain number instead of an ISO8601 formatted string (P1D2H3M4S…) to further simplify processing.
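
Client code that has to handle both response shapes during a migration could normalize them to milliseconds. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the parser covers only day and time components of standard ISO 8601 durations (for example `PT1M30.5S`), and the caller is expected to pass whichever object in the response holds the `duration` or `durationMilliseconds` field.

```python
import re

# Day/time ISO 8601 durations only, e.g. "PT1M30.5S" or "P1DT2H".
_ISO_DURATION = re.compile(
    r"^P(?:(?P<d>\d+)D)?"
    r"(?:T(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+(?:\.\d+)?)S)?)?$"
)


def iso_duration_to_ms(value: str) -> int:
    """Convert a v3.2-style ISO 8601 duration string to milliseconds."""
    match = _ISO_DURATION.match(value)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = match.groupdict(default="0")
    seconds = (
        int(parts["d"]) * 86400
        + int(parts["h"]) * 3600
        + int(parts["m"]) * 60
        + float(parts["s"])
    )
    return round(seconds * 1000)


def duration_ms(fields: dict) -> int:
    """Read a duration from either the new or the old response shape."""
    if "durationMilliseconds" in fields:
        return int(fields["durationMilliseconds"])
    return iso_duration_to_ms(fields["duration"])
```

Once every caller targets version 2024-11-15, the fallback branch can be deleted and `durationMilliseconds` read directly.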
34+
35+
The query parameter `sasValidityInSeconds` is renamed to `sasLifetimeMinutes` for getting files. Usage is only allowed for an account with BYOS (bring your own storage) disabled. For BYOS enabled accounts, SAS URLs aren't returned.
36+
37+
The `project` property is removed in creation requests.
38+
39+
## Models
40+
41+
Removed the `text` property in a model creation request. The alternative is to first create a dataset with the text content, and then use that dataset for model creation.
42+
43+
The `email` property and the connected email notification process are removed from the API.
44+
45+
The query parameter `sasValidityInSeconds` is renamed to `sasLifetimeMinutes` for getting files. Usage is only allowed for an account with BYOS (bring your own storage) disabled. For BYOS enabled accounts, SAS URLs aren't returned.
46+
47+
The `GET models/id/manifest` operation now always requires a nonzero SAS lifetime. The corresponding `sasValidityInSeconds` property is renamed to `sasLifetimeMinutes`.
48+
49+
The `project` property is removed in creation requests.
50+
51+
## Evaluations
52+
53+
The query parameter `sasValidityInSeconds` is renamed to `sasLifetimeMinutes` for getting files. Usage is only allowed for an account with BYOS disabled. For BYOS enabled accounts, SAS URLs aren't returned.
54+
55+
The `project` property is removed in creation requests.
56+
57+
The `email` property and the connected email notification process are removed from the API.
58+
59+
## Endpoints
60+
61+
The API to retrieve and delete log files of endpoint logs is removed. Custom speech now supports BYOS (bring your own storage). Only accounts with BYOS enabled can enable logging on model endpoints. This offers full manageability of log files on customer storage instead of a proxy API.
62+
63+
Removed support for `timeToLive` in endpoint creations.
64+
65+
Removed the `text` property in an endpoint creation request. The alternative is to first create a dataset with the text content, and then use that dataset for model creation. This model can then be used to create an endpoint.
66+
67+
Endpoint links now return only the WebSocket connection endpoint, which is used by the SDK.
68+
69+
The `project` property is removed in creation requests.
70+
71+
The `email` property and the connected email notification process are removed from the API.
72+
73+
## Transcriptions
74+
75+
Removed the top-level `diarizationEnabled` property of a transcription. The diarization configuration is simplified to `"diarization": {"maxSpeakers": 2,"enabled": true}`. The `maxSpeakers` property is optional and defaults to 2. The `enabled` property is required for diarization.
76+
77+
In transcription creation requests, `timeToLive` is renamed to `timeToLiveHours`, and its format changes from an ISO8601 formatted string to a simple integer (number of hours).
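
The diarization and `timeToLive` changes could be applied to an existing request payload roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the input is assumed to be the dictionary that holds `diarizationEnabled`, `timeToLive`, and `email` in the caller's v3.2 request, any existing v3.2 `diarization` settings are overwritten, and rounding partial hours up is a choice made here rather than documented API behavior.

```python
import math
import re

# Day/time ISO 8601 durations only, e.g. "PT12H" or "P2DT6H".
_ISO_DURATION = re.compile(
    r"^P(?:(?P<d>\d+)D)?"
    r"(?:T(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+(?:\.\d+)?)S)?)?$"
)


def iso_duration_to_hours(value: str) -> int:
    """Convert an ISO 8601 duration to whole hours, rounding up (assumption)."""
    match = _ISO_DURATION.match(value)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = match.groupdict(default="0")
    seconds = (
        int(parts["d"]) * 86400
        + int(parts["h"]) * 3600
        + int(parts["m"]) * 60
        + float(parts["s"])
    )
    return math.ceil(seconds / 3600)


def migrate_transcription_properties(props: dict) -> dict:
    """Return a copy of v3.2-style properties adapted for version 2024-11-15."""
    out = dict(props)

    # Email notifications are removed from the API.
    out.pop("email", None)

    # `diarizationEnabled` is replaced by the `enabled` flag inside `diarization`.
    # `maxSpeakers` is optional and defaults to 2, so it is omitted here.
    if out.pop("diarizationEnabled", False):
        out["diarization"] = {"enabled": True}

    # `timeToLive` (ISO 8601 string) becomes `timeToLiveHours` (integer).
    if "timeToLive" in out:
        out["timeToLiveHours"] = iso_duration_to_hours(out.pop("timeToLive"))

    return out
```

Rounding up keeps a transcription at least as long as the original `timeToLive` requested; callers that prefer the shorter retention can round down instead.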
78+
79+
The `duration` property in transcription responses is renamed to `durationMilliseconds` and is now a plain number instead of an ISO8601 formatted string (P1D2H3M4S…) to further simplify processing. Transcription result files have this property added for consistency with the API.
80+
81+
The query parameter `sasValidityInSeconds` is renamed to `sasLifetimeMinutes` for getting files. Usage is only allowed for an account with BYOS disabled. For BYOS enabled accounts, SAS URLs aren't returned.
82+
83+
The `project` property is removed in creation requests.
84+
85+
The `email` property and the connected email notification process are removed from the API.
86+
87+
## Projects
88+
89+
The projects API is removed.
90+
91+
## Next steps
92+
93+
* [Speech to text REST API](rest-speech-to-text.md)
94+
* [Speech to text REST API 2024-11-15 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-2024-11-15&preserve-view=true)

articles/ai-services/speech-service/migrate-v3-0-to-v3-1.md

Lines changed: 7 additions & 5 deletions
@@ -7,7 +7,7 @@ ms.author: eur
77
manager: nitinme
88
ms.service: azure-ai-speech
99
ms.topic: how-to
10-
ms.date: 9/20/2024
10+
ms.date: 11/12/2024
1111
ms.reviewer: heikora
1212
ms.devlang: csharp
1313
ms.custom: devx-track-csharp
@@ -16,12 +16,14 @@ ms.custom: devx-track-csharp
1616

1717
# Migrate code from v3.0 to v3.1 of the REST API
1818

19-
The Speech to text REST API is used for [Batch transcription](batch-transcription.md) and [custom speech](custom-speech-overview.md). Changes from version 3.0 to 3.1 are described in the sections below.
19+
The Speech to text REST API is used for [fast transcription](./fast-transcription-create.md), [batch transcription](batch-transcription.md), and [custom speech](custom-speech-overview.md). Changes from version 3.0 to 3.1 are described in the sections below.
2020

2121
> [!IMPORTANT]
22-
> Speech to text REST API v3.2 is the latest version that's generally available. Preview versions *3.2-preview.1* and *3*.2-preview.2* will be removed in September 2024.
23-
> [Speech to text REST API](rest-speech-to-text.md) v3.1 will be retired on a date to be announced.
24-
> Speech to text REST API v3.0 will be retired on April 1st, 2026.
22+
> Speech to text REST API version `2024-11-15` is the latest version that's generally available.
23+
> - [Speech to text REST API](rest-speech-to-text.md) version `2024-05-15-preview` will be retired on a date to be announced.
24+
> - Speech to text REST API `v3.0`, `v3.1`, `v3.2`, `3.2-preview.1`, and `3.2-preview.2` will be retired on April 1st, 2026.
25+
>
26+
> For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](migrate-v3-0-to-v3-1.md), [v3.1 to v3.2](migrate-v3-1-to-v3-2.md), and [v3.2 to 2024-11-15](migrate-2024-11-15.md) migration guides.
2527
2628
## Base path
2729

articles/ai-services/speech-service/migrate-v3-1-to-v3-2.md

Lines changed: 7 additions & 5 deletions
@@ -6,7 +6,7 @@ author: eric-urban
66
manager: nitinme
77
ms.service: azure-ai-speech
88
ms.topic: how-to
9-
ms.date: 9/20/2024
9+
ms.date: 11/12/2024
1010
ms.author: eur
1111
ms.devlang: csharp
1212
ms.custom: devx-track-csharp
@@ -15,12 +15,14 @@ ms.custom: devx-track-csharp
1515

1616
# Migrate code from v3.1 to v3.2 of the REST API
1717

18-
The Speech to text REST API is used for [Batch transcription](batch-transcription.md) and [custom speech](custom-speech-overview.md). This article describes changes from version 3.1 to 3.2.
18+
The Speech to text REST API is used for [fast transcription](./fast-transcription-create.md), [batch transcription](batch-transcription.md), and [custom speech](custom-speech-overview.md). This article describes changes from version 3.1 to 3.2.
1919

2020
> [!IMPORTANT]
21-
> Speech to text REST API v3.2 is the latest version that's generally available. Preview versions *3.2-preview.1* and *3*.2-preview.2* will be removed in September 2024.
22-
> [Speech to text REST API](rest-speech-to-text.md) v3.1 will be retired on a date to be announced.
23-
> Speech to text REST API v3.0 will be retired on April 1st, 2026.
21+
> Speech to text REST API version `2024-11-15` is the latest version that's generally available.
22+
> - [Speech to text REST API](rest-speech-to-text.md) version `2024-05-15-preview` will be retired on a date to be announced.
23+
> - Speech to text REST API `v3.0`, `v3.1`, `v3.2`, `3.2-preview.1`, and `3.2-preview.2` will be retired on April 1st, 2026.
24+
>
25+
> For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](migrate-v3-0-to-v3-1.md), [v3.1 to v3.2](migrate-v3-1-to-v3-2.md), and [v3.2 to 2024-11-15](migrate-2024-11-15.md) migration guides.
2426
2527
## Base path
2628

articles/ai-services/speech-service/releasenotes.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ author: eric-urban
77
ms.author: eur
88
ms.service: azure-ai-speech
99
ms.topic: release-notes
10-
ms.date: 10/9/2024
10+
ms.date: 11/12/2024
1111
ms.custom: references_regions
1212
# Customer intent: As a developer, I want to learn about new releases and features for Azure AI Speech.
1313
---
@@ -18,9 +18,9 @@ Azure AI Speech is updated on an ongoing basis. To stay up-to-date with recent d
1818

1919
## Recent highlights
2020

21+
* Fast transcription is now generally available. It can transcribe audio much faster than the actual audio duration. For more information, see the [fast transcription API guide](fast-transcription-create.md).
2122
* Azure AI Speech Toolkit extension is now available for Visual Studio Code users. It contains a list of speech quick-starts and scenario samples that can be easily built and run with simple clicks. For more information, see [Azure AI Speech Toolkit in Visual Studio Code Marketplace](https://aka.ms/speech-toolkit-vscode).
2223
* Azure AI speech high definition (HD) voices are available in public preview. The HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real-time to match the sentiment. For more information, see [What are Azure AI Speech high definition (HD) voices?](high-definition-voices.md).
23-
* Fast transcription is now available in public preview. It can transcribe audio much faster than the actual audio length. For more information, see the [fast transcription API guide](fast-transcription-create.md).
2424
* Video translation is now available in the Azure AI Speech service. For more information, see [What is video translation?](./video-translation-overview.md).
2525
* The Azure AI Speech service supports OpenAI text to speech voices. For more information, see [What are OpenAI text to speech voices?](./openai-voices.md).
2626
* The custom voice API is available for creating and managing [professional](./professional-voice-create-project.md) and [personal](./personal-voice-create-project.md) custom neural voice models.

articles/ai-services/speech-service/rest-speech-to-text.md

Lines changed: 8 additions & 12 deletions
@@ -5,7 +5,7 @@ description: Get reference documentation for Speech to text REST API.
55
manager: nitinme
66
ms.service: azure-ai-speech
77
ms.topic: reference
8-
ms.date: 9/23/2024
8+
ms.date: 11/12/2024
99
ms.reviewer: eur
1010
author: eric-urban
1111
ms.author: eur
@@ -17,22 +17,18 @@ ms.author: eur
1717
Speech to text REST API is used for [batch transcription](batch-transcription.md) and [custom speech](custom-speech-overview.md).
1818

1919
> [!IMPORTANT]
20-
> Speech to text REST API v3.2 is the latest version that's generally available. Preview versions *3.2-preview.1* and *3*.2-preview.2* will be removed in September 2024.
21-
> [Speech to text REST API](rest-speech-to-text.md) v3.1 will be retired on a date to be announced. For more information about upgrading, see the Speech to text REST API [v3.1 to v3.2](migrate-v3-1-to-v3-2.md) migration guide.
22-
> Speech to text REST API v3.0 will be retired on April 1st, 2026. For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](migrate-v3-0-to-v3-1.md) and [v3.1 to v3.2](migrate-v3-1-to-v3-2.md) migration guides.
20+
> Speech to text REST API version `2024-11-15` is the latest version that's generally available.
21+
> - [Speech to text REST API](rest-speech-to-text.md) version `2024-05-15-preview` will be retired on a date to be announced.
22+
> - Speech to text REST API `v3.0`, `v3.1`, `v3.2`, `3.2-preview.1`, and `3.2-preview.2` will be retired on April 1st, 2026.
23+
>
24+
> For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](migrate-v3-0-to-v3-1.md), [v3.1 to v3.2](migrate-v3-1-to-v3-2.md), and [v3.2 to 2024-11-15](migrate-2024-11-15.md) migration guides.
2325
2426
> [!div class="nextstepaction"]
25-
> [See the Speech to text REST API 2024-05-15 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-2024-05-15-preview&preserve-view=true)
26-
27-
> [!div class="nextstepaction"]
28-
> [See the Speech to text REST API v3.2 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.2&preserve-view=true)
29-
30-
> [!div class="nextstepaction"]
31-
> [See the Speech to text REST API v3.1 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.1&preserve-view=true)
27+
> [See the Speech to text REST API 2024-11-15 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-2024-11-15&preserve-view=true)
3228
3329
Use Speech to text REST API to:
3430

35-
- [Fast transcription](fast-transcription-create.md): Transcribe audio files with returning results synchronously and much faster than real-time audio. Use the fast transcription API ([/speechtotext/transcriptions:transcribe](/rest/api/speechtotext/transcriptions/transcribe)) in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as quick audio or video transcription or video translation.
31+
- [Fast transcription](fast-transcription-create.md): Transcribe audio files, with results returned synchronously and much faster than real-time audio. Use the fast transcription API ([/speechtotext/transcriptions:transcribe](https://go.microsoft.com/fwlink/?linkid=2296107)) in scenarios where you need the transcript of an audio recording as quickly as possible with predictable latency, such as quick audio or video transcription or video translation.
3632
- [Custom speech](custom-speech-overview.md): Upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint. Copy models to other subscriptions if you want colleagues to have access to a model that you built, or if you want to deploy a model to more than one region.
3733
- [Batch transcription](batch-transcription.md): Transcribe audio files as a batch from multiple URLs or an Azure container.
3834

articles/ai-services/speech-service/toc.yml

Lines changed: 4 additions & 1 deletion
@@ -40,7 +40,7 @@ items:
4040
href: get-speech-recognition-results.md
4141
- name: Real-time diarization quickstart
4242
href: get-started-stt-diarization.md
43-
- name: Fast transcription API (Preview)
43+
- name: Fast transcription API
4444
href: fast-transcription-create.md
4545
- name: Batch transcription API
4646
items:
@@ -452,6 +452,9 @@ items:
452452
href: migrate-to-custom-voice-api.md
453453
- name: Speech to text REST API migration
454454
items:
455+
- name: From Speech to text v3.2 to 2024-11-15
456+
href: migrate-2024-11-15.md
457+
displayName: migrate,migration,deprecate,retire,sunset
455458
- name: From Speech to text v3.1 to v3.2
456459
href: migrate-v3-1-to-v3-2.md
457460
displayName: migrate,migration,deprecate,retire,sunset
