Commit 13f0b1b

committed
dev lead feedback
1 parent b48d01c commit 13f0b1b

File tree

1 file changed

+73
-62
lines changed

articles/cognitive-services/Speech-Service/migrate-v3-0-to-v3-1.md

Lines changed: 73 additions & 62 deletions
@@ -21,7 +21,79 @@ The Speech-to-text REST API is used for [Batch transcription](batch-transcriptio
2121
> [!IMPORTANT]
2222
> Speech-to-text REST API v3.1 is currently in public preview. Once it's generally available, version 3.0 of the [Speech to Text REST API](rest-speech-to-text.md) will be deprecated.
2323
24-
## Path and operation IDs
24+
## Base path
25+
26+
You must update the base path in your code from `/speechtotext/v3.0` to `/speechtotext/v3.1-preview.1`. For example, to get base models in the `eastus` region, use `https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/models/base` instead of `https://eastus.api.cognitive.microsoft.com/speechtotext/v3.0/models/base`.
27+
28+
Please note these additional changes:
29+
- The `/models/{id}/copyto` operation (includes '/') in version 3.0 is replaced by the `/models/{id}:copyto` operation (includes ':') in version 3.1.
30+
- The `/webhooks/{id}/ping` operation (includes '/') in version 3.0 is replaced by the `/webhooks/{id}:ping` operation (includes ':') in version 3.1.
31+
- The `/webhooks/{id}/test` operation (includes '/') in version 3.0 is replaced by the `/webhooks/{id}:test` operation (includes ':') in version 3.1.
32+
33+
For more details, see [Operation IDs](#operation-ids) later in this guide.
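The path changes listed above could be applied mechanically to existing request URLs. The following is a minimal sketch, not part of the Speech service or any SDK: `migrate_url` is a hypothetical helper name, and it assumes the URL carries no query string after the operation segment.

```python
import re

OLD_BASE = "/speechtotext/v3.0"
NEW_BASE = "/speechtotext/v3.1-preview.1"

# Operations whose final '/' separator becomes ':' in version 3.1.
_COLON_OPERATIONS = [
    re.compile(r"(/models/[^/]+)/(copyto)$"),
    re.compile(r"(/webhooks/[^/]+)/(ping|test)$"),
]


def migrate_url(url: str) -> str:
    """Rewrite a v3.0 request URL to its v3.1-preview.1 form (hypothetical helper)."""
    if OLD_BASE not in url:
        return url  # Not a v3.0 URL; leave it untouched.
    url = url.replace(OLD_BASE, NEW_BASE, 1)
    for pattern in _COLON_OPERATIONS:
        url = pattern.sub(r"\1:\2", url)
    return url


print(migrate_url("https://eastus.api.cognitive.microsoft.com/speechtotext/v3.0/models/base"))
```

URLs that don't contain the v3.0 base path are returned unchanged, so the helper can be run over a mixed list of endpoints.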
34+
35+
## Batch transcription
36+
37+
> [!NOTE]
38+
> Don't use Speech-to-text REST API v3.0 to retrieve a transcription created via Speech-to-text REST API v3.1. You'll see an error message such as the following: "The API version cannot be used to access this transcription. Please use API version v3.1 or higher."
39+
40+
In the [Transcriptions_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Transcriptions_Create) operation the following three properties are added:
41+
- The `displayFormWordLevelTimestampsEnabled` property can be used to enable the reporting of word-level timestamps on the display form of the transcription results. The results are returned in the `displayPhraseElements` property of the transcription file.
42+
- The `diarization` property can be used to specify hints for the minimum and maximum number of speaker labels to generate when performing optional diarization (speaker separation). With this feature, the service is now able to generate speaker labels for more than two speakers. The `diarizationEnabled` property is deprecated and will be removed in the next major version of the API.
43+
- The `languageIdentification` property can be used to specify settings for language identification on the input prior to transcription. Up to 10 candidate locales are supported for language identification. The returned transcription will include a new `locale` property for the recognized language or the locale that you provided.
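As an illustration of the three properties above, the following sketch assembles a `properties` object for a transcription request and enforces the 10-locale limit on the client side. The top-level property names come from this guide. The nested field names (`speakers`, `minCount`, `maxCount`, `candidateLocales`) and the helper name are assumptions and should be checked against the Transcriptions_Create reference.

```python
from typing import Any, Dict, Sequence

MAX_CANDIDATE_LOCALES = 10  # Limit stated in this guide.


def build_transcription_properties(
    candidate_locales: Sequence[str],
    min_speakers: int,
    max_speakers: int,
    word_timestamps: bool = True,
) -> Dict[str, Any]:
    """Build a hypothetical `properties` payload for Transcriptions_Create."""
    if not 1 <= len(candidate_locales) <= MAX_CANDIDATE_LOCALES:
        raise ValueError("Provide between 1 and 10 candidate locales.")
    if min_speakers < 1 or min_speakers > max_speakers:
        raise ValueError("Speaker hints must satisfy 1 <= min <= max.")
    return {
        "displayFormWordLevelTimestampsEnabled": word_timestamps,
        # Nested field names below are assumptions, not taken from this guide.
        "diarization": {
            "speakers": {"minCount": min_speakers, "maxCount": max_speakers}
        },
        "languageIdentification": {"candidateLocales": list(candidate_locales)},
    }


props = build_transcription_properties(["en-US", "de-DE"], 1, 4)
```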
44+
45+
The `filter` property is added to the [Transcriptions_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Transcriptions_List), [Transcriptions_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Transcriptions_ListFiles), and [Projects_ListTranscriptions](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListTranscriptions) operations. The `filter` expression can be used to select a subset of the available resources. You can filter by `displayName`, `description`, `createdDateTime`, `lastActionDateTime`, `status`, and `locale`. For example: `filter=createdDateTime gt 2022-02-01T11:00:00Z`
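A `filter` expression contains spaces and colons, so it generally needs percent-encoding before it goes into a query string. The sketch below shows one way to do that with the Python standard library. `with_filter` is a hypothetical helper, and the `/transcriptions` path in the usage line is an assumption about where Transcriptions_List is served.

```python
from urllib.parse import quote


def with_filter(url: str, expression: str) -> str:
    """Append a percent-encoded `filter` query parameter to a URL without one."""
    return f"{url}?filter={quote(expression, safe='')}"


# Assumed list endpoint; only the base path is taken from this guide.
base = "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1"
print(with_filter(f"{base}/transcriptions", "createdDateTime gt 2022-02-01T11:00:00Z"))
```

HTTP client libraries often perform this encoding themselves when given a parameter dictionary, in which case the raw expression can be passed as-is.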
46+
47+
## Custom Speech
48+
49+
### Datasets
50+
51+
The following operations are added for uploading and managing multiple data blocks for a dataset:
52+
- [Datasets_UploadBlock](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_UploadBlock) - Upload a block of data for the dataset. The maximum size of the block is 8MiB.
53+
- [Datasets_GetDatasetBlocks](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_GetDatasetBlocks) - Get the list of uploaded blocks for this dataset.
54+
- [Datasets_CommitBlocks](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_CommitBlocks) - Commit block list to complete the upload of the dataset.
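Because each uploaded block is limited to 8 MiB, a client has to split larger datasets before calling Datasets_UploadBlock. The following sketch shows only the splitting step and makes no network calls. `iter_blocks` is a hypothetical helper name.

```python
from typing import Iterator

MAX_BLOCK_SIZE = 8 * 1024 * 1024  # 8 MiB, the block limit stated in this guide.


def iter_blocks(data: bytes, block_size: int = MAX_BLOCK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `block_size` bytes."""
    if not 0 < block_size <= MAX_BLOCK_SIZE:
        raise ValueError("block_size must be between 1 byte and 8 MiB.")
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


# Small block size used here purely to keep the demonstration readable.
blocks = list(iter_blocks(b"abcdefg", block_size=3))
```

Each yielded block would then be sent in its own upload request, followed by a single commit once all blocks are uploaded.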
55+
56+
To support model adaptation with [structured text in markdown](how-to-custom-speech-test-and-train.md#structured-text-data-for-training) data, the [Datasets_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_Create) operation now supports the **LanguageMarkdown** data kind. For more information, see [upload datasets](how-to-custom-speech-upload-data.md#upload-datasets).
57+
58+
### Models
59+
60+
The [Models_ListBaseModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListBaseModels) and [Models_ListBaseModel](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListBaseModel) operations return information on the type of adaptation supported by each base model.
61+
62+
```json
63+
"features": {
64+
"supportsAdaptationsWith": [
65+
"Acoustic",
66+
"Language",
67+
"LanguageMarkdown",
68+
"Pronunciation"
69+
]
70+
}
71+
```
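A client might use this information to confirm that a base model accepts a given data kind before creating a dataset. The sketch below assumes the base model response has already been parsed into a dictionary with the `features` shape shown above. `supports_adaptation` is a hypothetical helper name.

```python
from typing import Any, Dict


def supports_adaptation(base_model: Dict[str, Any], kind: str) -> bool:
    """Return True if `kind` is listed under features.supportsAdaptationsWith."""
    features = base_model.get("features", {})
    return kind in features.get("supportsAdaptationsWith", [])


# Sample data mirroring the JSON fragment above.
model = {
    "features": {
        "supportsAdaptationsWith": [
            "Acoustic",
            "Language",
            "LanguageMarkdown",
            "Pronunciation",
        ]
    }
}
print(supports_adaptation(model, "LanguageMarkdown"))
```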
72+
73+
The [Models_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_Create) operation has a new `customModelWeightPercent` property where you can specify the weight used when the Custom Language Model (trained from plain or structured text data) is combined with the Base Language Model. Valid values are integers between 1 and 100. The default value is currently 30.
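The stated range can be checked on the client before a request is sent. This is a minimal sketch with a hypothetical helper name. The default of 30 is taken from the sentence above, which notes that it may change.

```python
DEFAULT_WEIGHT_PERCENT = 30  # Current default per this guide; subject to change.


def validate_weight_percent(value: int = DEFAULT_WEIGHT_PERCENT) -> int:
    """Return `value` if it is an integer from 1 to 100, otherwise raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("customModelWeightPercent must be an integer.")
    if not 1 <= value <= 100:
        raise ValueError("customModelWeightPercent must be between 1 and 100.")
    return value


weight = validate_weight_percent(45)
```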
74+
75+
The `filter` property is added to the following operations:
76+
77+
- [Datasets_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_List)
78+
- [Datasets_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_ListFiles)
79+
- [Endpoints_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Endpoints_List)
80+
- [Evaluations_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Evaluations_List)
81+
- [Evaluations_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Evaluations_ListFiles)
82+
- [Models_ListBaseModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListBaseModels)
83+
- [Models_ListCustomModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListCustomModels)
84+
- [Projects_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_List)
85+
- [Projects_ListDatasets](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListDatasets)
86+
- [Projects_ListEndpoints](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListEndpoints)
87+
- [Projects_ListEvaluations](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListEvaluations)
88+
- [Projects_ListModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListModels)
89+
90+
The `filter` expression can be used to select a subset of the available resources. You can filter by `displayName`, `description`, `createdDateTime`, `lastActionDateTime`, `status`, `locale`, and `kind`. For example: `filter=locale eq 'en-US'`
91+
92+
Added the [Models_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListFiles) operation to get the files of the model identified by the given ID.
93+
94+
Added the [Models_GetFile](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_GetFile) operation to get one specific file (identified with fileId) from a model (identified with id). This lets you retrieve a **ModelReport** file that provides information on the data processed during training.
95+
96+
## Operation IDs
2597

2698
You must update the base path in your code from `/speechtotext/v3.0` to `/speechtotext/v3.1-preview.1`. For example, to get base models in the `eastus` region, use `https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/models/base` instead of `https://eastus.api.cognitive.microsoft.com/speechtotext/v3.0/models/base`.
2799

@@ -110,67 +182,6 @@ The name of each `operationId` in version 3.1 is prefixed with the object name.
110182

111183
<sup>3</sup> The `/webhooks/{id}/test` operation (includes '/') in version 3.0 is replaced by the `/webhooks/{id}:test` operation (includes ':') in version 3.1.
112184

113-
## Batch transcription
114-
115-
> [!WARNING]
116-
> Don't use Speech-to-text REST API v3.0 to retrieve a transcription created via Speech-to-text REST API v3.1. You'll see an error message such as the following: "The API version cannot be used to access this transcription. Please use API version v3.1 or higher."
117-
118-
In the [Transcriptions_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Transcriptions_Create) operation the following three properties are added:
119-
- The `displayFormWordLevelTimestampsEnabled` property can be used to enable the reporting of word-level timestamps on the display form of the transcription results. The results are returned in the `displayPhraseElements` property of the transcription file.
120-
- The `diarization` property can be used to specify hints for the minimum and maximum number of speaker labels to generate when performing optional diarization (speaker separation). With this feature, the service is now able to generate speaker labels for more than two speakers. The `diarizationEnabled` property is deprecated and will be removed in the next major version of the API.
121-
- The `languageIdentification` property can be used specify settings for language identification on the input prior to transcription. Up to 10 candidate locales are supported for language identification. The returned transcription will include a new `locale` property for the recognized language or the locale that you provided.
122-
123-
The `filter` property is added to the [Transcriptions_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Transcriptions_List), [Transcriptions_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Transcriptions_ListFiles), and [Projects_ListTranscriptions](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListTranscriptions) operations. The `filter` expression can be used to select a subset of the available resources. You can filter by `displayName`, `description`, `createdDateTime`, `lastActionDateTime`, `status`, and `locale`. For example: `filter=createdDateTime gt 2022-02-01T11:00:00Z`
124-
125-
## Custom Speech
126-
127-
### Datasets
128-
129-
The following operations are added for uploading and managing multiple data blocks for a dataset:
130-
- [Datasets_UploadBlock](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_UploadBlock) - Upload a block of data for the dataset. The maximum size of the block is 8MiB.
131-
- [Datasets_GetDatasetBlocks](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_GetDatasetBlocks) - Get the list of uploaded blocks for this dataset.
132-
- [Datasets_CommitBlocks](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_CommitBlocks) - Commit block list to complete the upload of the dataset.
133-
134-
To support model adaptation with [structured text in markdown](how-to-custom-speech-test-and-train.md#structured-text-data-for-training) data, the [Datasets_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_Create) operation now supports the **LanguageMarkdown** data kind. For more information, see [upload datasets](how-to-custom-speech-upload-data.md#upload-datasets).
135-
136-
### Models
137-
138-
The [Models_ListBaseModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListBaseModels) and [Models_ListBaseModel](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListBaseModel) operations return information on the type of adaptation supported by each base model.
139-
140-
```json
141-
"features": {
142-
"supportsAdaptationsWith": [
143-
"Acoustic",
144-
"Language",
145-
"LanguageMarkdown",
146-
"Pronunciation"
147-
]
148-
}
149-
```
150-
151-
The [Models_Create](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_Create) operation has a new `customModelWeightPercent` property where you can specify the weight used when the Custom Language Model (trained from plain or structured text data) is combined with the Base Language Model. Valid values are integers between 1 and 100. The default value is currently 30.
152-
153-
The `filter` property is added to the following operations:
154-
155-
- [Datasets_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_List)
156-
- [Datasets_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Datasets_ListFiles)
157-
- [Endpoints_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Endpoints_List)
158-
- [Evaluations_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Evaluations_List)
159-
- [Evaluations_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Evaluations_ListFiles)
160-
- [Models_ListBaseModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListBaseModels)
161-
- [Models_ListCustomModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListCustomModels)
162-
- [Projects_List](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_List)
163-
- [Projects_ListDatasets](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListDatasets)
164-
- [Projects_ListEndpoints](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListEndpoints)
165-
- [Projects_ListEvaluations](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListEvaluations)
166-
- [Projects_ListModels](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Projects_ListModels)
167-
168-
The `filter` expression can be used to select a subset of the available resources. You can filter by `displayName`, `description`, `createdDateTime`, `lastActionDateTime`, `status`, `locale`, and `kind`. For example: `filter=locale eq 'en-US'`
169-
170-
Added the [Models_ListFiles](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_ListFiles) operation to get the files of the model identified by the given ID.
171-
172-
Added the [Models_GetFile](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-1-preview1/operations/Models_GetFile) operation to get one specific file (identified with fileId) from a model (identified with id). This lets you retrieve a **ModelReport** file that provides information on the data processed during training.
173-
174185
## Next steps
175186

176187
* [Speech-to-text REST API](rest-speech-to-text.md)
