Commit 5dab438 (parent 7341d04) by Baolian Zou (Shanghai Wicresoft Co Ltd): Update video translation docs again

---
title: include file
description: include file
author: sally-baolian
ms.author: v-baolianzou
ms.service: azure-ai-speech
ms.topic: include
ms.date: 10/11/2024
ms.custom: include
---

The video translation REST API makes it easy to integrate video translation into your applications. It supports uploading, managing, and refining video translations, with multiple iterations for continuous improvement. In this article, you learn how to use video translation through the REST API.

This diagram provides a high-level overview of the workflow.

![Diagram of the video translation API workflow.](../../../media/video-translation/video-translation-api-workflow.png)

You can use the following REST API operations for video translation:

| Operation | Method | REST API call |
| ----------------------------------------------------- | ------ | ------------------------------------------------- |
| [Create a translation](/rest/api/aiservices/videotranslation/translation-operations/create-translation) | `PUT` | `/translations/{translationId}` |
| [List translations](/rest/api/aiservices/videotranslation/translation-operations/list-translation) | `GET` | `/translations` |
| [Get a translation by translation ID](/rest/api/aiservices/videotranslation/translation-operations/get-translation) | `GET` | `/translations/{translationId}` |
| [Create an iteration](/rest/api/aiservices/videotranslation/iteration-operations/create-iteration) | `PUT` | `/translations/{translationId}/iterations/{iterationId}` |
| [List iterations](/rest/api/aiservices/videotranslation/iteration-operations/list-iteration) | `GET` | `/translations/{translationId}/iterations` |
| [Get an iteration by iteration ID](/rest/api/aiservices/videotranslation/iteration-operations/get-iteration) | `GET` | `/translations/{translationId}/iterations/{iterationId}` |
| [Get operation by operation ID](/rest/api/aiservices/videotranslation/operation-operations/get-operation) | `GET` | `/operations/{operationId}` |
| [Delete a translation by translation ID](/rest/api/aiservices/videotranslation/translation-operations/delete-translation) | `DELETE` | `/translations/{translationId}` |

For code samples, see [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/video-translation/csharp).

This article outlines the primary steps of the API process, including creating a translation, creating an iteration, checking the status of each operation, getting an iteration by iteration ID, and deleting a translation by translation ID. For complete details, refer to the links provided for each API in the table.

## Create a translation

To submit a video translation request, construct an HTTP PUT request path and body according to the following instructions:

- Specify `Operation-Id`: The `Operation-Id` must be unique for each operation. It ensures that each operation is tracked separately. Replace `[operationId]` with an operation ID.
- Specify `translationId`: The `translationId` must be unique. Replace `[translationId]` with a translation ID.
- Set the required input: Include details like `sourceLocale`, `targetLocale`, `voiceKind`, and `videoFileUrl`. Ensure that you have the video URL from Azure Blob Storage. For the languages supported for video translation, refer to the [supported source and target languages](../../../language-support.md?tabs=speech-translation#video-translation). You can set the `voiceKind` parameter to either `PlatformVoice` or `PersonalVoice`. For `PlatformVoice`, the system automatically selects the most suitable prebuilt voice by matching the speaker's voice in the video with prebuilt voices. For `PersonalVoice`, the system offers a model that generates high-quality voice replication in a few seconds.

  >[!NOTE]
  > To use personal voice, you need to apply for [access](https://aka.ms/customneural).

- Replace `[YourResourceKey]` with your Speech resource key and replace `[YourSpeechRegion]` with your Speech resource region.

Creating a translation doesn't initiate the translation process. You can start translating the video by [creating an iteration](#create-an-iteration). The following example is for a Windows shell. Make sure to escape `&` with `^&` if the URL contains `&`.

```azurecli-interactive
curl -v -X PUT -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" -H "Operation-Id: [operationId]" -H "Content-Type: application/json" -d "{\"displayName\": \"[YourDisplayName]\",\"description\": \"[OptionalYourDescription]\",\"input\": {\"sourceLocale\": \"[VideoSourceLocale]\",\"targetLocale\": \"[TranslationTargetLocale]\",\"voiceKind\": \"[PlatformVoice/PersonalVoice]\",\"speakerCount\": [OptionalVideoSpeakerCount],\"subtitleMaxCharCountPerSegment\": [OptionalYourPreferredSubtitleMaxCharCountPerSegment],\"exportSubtitleInVideo\": [Optional true/false],\"videoFileUrl\": \"[AzureBlobUrlWithSas]\"}}" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations/[translationId]?api-version=2024-05-20-preview"
```
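If you prefer to compose the request programmatically, you can build the same URL and JSON body in code. The following is a minimal Python sketch, not part of the official samples; the helper name and its keyword parameters are illustrative, but the path and payload mirror the curl command above.

```python
import json

API_VERSION = "2024-05-20-preview"

def build_create_translation_request(region, translation_id, *, display_name,
                                     source_locale, target_locale, voice_kind,
                                     video_file_url, description=None):
    """Build the URL and JSON body for the create-translation PUT call."""
    url = (
        f"https://{region}.api.cognitive.microsoft.com/videotranslation"
        f"/translations/{translation_id}?api-version={API_VERSION}"
    )
    body = {
        "displayName": display_name,
        "input": {
            "sourceLocale": source_locale,
            "targetLocale": target_locale,
            "voiceKind": voice_kind,
            "videoFileUrl": video_file_url,
        },
    }
    if description is not None:
        body["description"] = description
    return url, json.dumps(body)

# Hypothetical values for illustration only.
url, payload = build_create_translation_request(
    "eastus", "mytranslation0920",
    display_name="demo",
    source_locale="zh-CN", target_locale="en-US",
    voice_kind="PlatformVoice",
    video_file_url="https://example.blob.core.windows.net/videos/video.mp4?sv=...",
)
```

Send `url` and `payload` with any HTTP client, together with the `Ocp-Apim-Subscription-Key`, `Operation-Id`, and `Content-Type: application/json` headers shown in the curl command.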
54+
>[!IMPORTANT]
55+
> Data created through the API won't appear in Speech Studio, and the data between the API and Speech Studio isn't synchronized.
56+
57+
You should receive a response body in the following format:
58+
59+
```json
60+
{
61+
"input": {
62+
"sourceLocale": "zh-CN",
63+
"targetLocale": "en-US",
64+
"voiceKind": "PlatformVoice",
65+
"speakerCount": 1,
66+
"subtitleMaxCharCountPerSegment": 30,
67+
"exportSubtitleInVideo": true
68+
},
69+
"status": "NotStarted",
70+
"lastActionDateTime": "2024-09-20T06:25:05.058Z",
71+
"id": "mytranslation0920",
72+
"displayName": "demo",
73+
"description": "for testing",
74+
"createdDateTime": "2024-09-20T06:25:05.058Z"
75+
}
76+
```
77+
The status property should progress from `NotStarted` status, to `Running`, and finally to `Succeeded` or `Failed`. You can call the [Get operation by operation ID](#get-operation-by-operation-id) API periodically until the returned status is `Succeeded` or `Failed`. This operation allows you to monitor the progress of your creating translation process.
78+
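The polling pattern can be sketched as a small helper. This is illustrative only: it assumes you supply a `get_status` callable that performs the status GET request and returns the parsed JSON response, and the retry interval is an arbitrary choice.

```python
import time

def wait_for_operation(get_status, operation_id, poll_seconds=10, max_polls=360):
    """Poll the operation until it reaches a terminal status."""
    for _ in range(max_polls):
        status = get_status(operation_id)["status"]
        if status in ("Succeeded", "Failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Operation {operation_id} didn't finish in time")

# Demo with a stubbed fetcher that succeeds on the third poll.
responses = iter([{"status": "NotStarted"}, {"status": "Running"}, {"status": "Succeeded"}])
final = wait_for_operation(lambda _id: next(responses), "createtranslation0920-1", poll_seconds=0)
```

In real use, the fetcher would issue the GET call shown under [Get operation by operation ID](#get-operation-by-operation-id) and return its JSON body.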

## Get operation by operation ID

Check the status of a specific operation using its operation ID. The operation ID is unique for each operation, so you can track each operation separately.

Replace `[YourResourceKey]` with your Speech resource key, `[YourSpeechRegion]` with your Speech resource region, and `[operationId]` with the operation ID you want to check.

```azurecli-interactive
curl -v -X GET -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/operations/[operationId]?api-version=2024-05-20-preview"
```

You should receive a response body in the following format:

```json
{
  "id": "createtranslation0920-1",
  "status": "Running"
}
```

## Create an iteration

To start translating your video or update an iteration for an existing translation, construct an HTTP PUT request path and body according to the following instructions:

- Specify `Operation-Id`: The `Operation-Id` must be unique for each operation, such as creating each iteration. Replace `[operationId]` with a unique ID for this operation.
- Specify `translationId`: If multiple iterations are performed under a single translation, the translation ID remains unchanged.
- Specify `iterationId`: The `iterationId` must be unique for each operation. Replace `[iterationId]` with an iteration ID.
- Set the required input: Include details like `speakerCount`, `subtitleMaxCharCountPerSegment`, `exportSubtitleInVideo`, or `webvttFile`. No subtitles are embedded in the output video by default.
- Replace `[YourResourceKey]` with your Speech resource key and replace `[YourSpeechRegion]` with your Speech resource region.

The following example is for a Windows shell. Make sure to escape `&` with `^&` if the URL contains `&`.

```azurecli-interactive
curl -v -X PUT -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" -H "Operation-Id: [operationId]" -H "Content-Type: application/json" -d "{\"input\": {\"speakerCount\": [OptionalVideoSpeakerCount],\"subtitleMaxCharCountPerSegment\": [OptionalYourPreferredSubtitleMaxCharCountPerSegment],\"exportSubtitleInVideo\": [Optional true/false],\"webvttFile\": {\"Kind\": \"[SourceLocaleSubtitle/TargetLocaleSubtitle/MetadataJson]\", \"url\": \"[AzureBlobUrlWithSas]\"}}}" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations/[translationId]/iterations/[iterationId]?api-version=2024-05-20-preview"
```
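Because every input here is optional for the first iteration, a body builder that only emits the fields you set keeps the inheritance behavior clear. This Python sketch is illustrative; the helper name and parameters are assumptions, but the JSON keys match the curl command above.

```python
import json

def build_iteration_body(speaker_count=None, subtitle_max_chars=None,
                         export_subtitle_in_video=None,
                         webvtt_kind=None, webvtt_url=None):
    """Build the JSON body for the create-iteration PUT call.
    Fields left as None are omitted, so the translation settings apply."""
    inputs = {}
    if speaker_count is not None:
        inputs["speakerCount"] = speaker_count
    if subtitle_max_chars is not None:
        inputs["subtitleMaxCharCountPerSegment"] = subtitle_max_chars
    if export_subtitle_in_video is not None:
        inputs["exportSubtitleInVideo"] = export_subtitle_in_video
    if webvtt_kind is not None:
        inputs["webvttFile"] = {"Kind": webvtt_kind, "url": webvtt_url}
    return json.dumps({"input": inputs})

# First iteration: rely entirely on the translation settings.
first = build_iteration_body()
# A later iteration: supply an edited target-locale subtitle file (hypothetical URL).
second = build_iteration_body(
    webvtt_kind="TargetLocaleSubtitle",
    webvtt_url="https://example.blob.core.windows.net/subs/edited.vtt?sv=...",
)
```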
> [!NOTE]
> When creating an iteration, if you already specified the optional parameters `speakerCount`, `subtitleMaxCharCountPerSegment`, and `exportSubtitleInVideo` during the creation of the translation, you don't need to specify them again. The values are inherited from the translation settings. Once these parameters are defined when creating an iteration, the new values override the original settings.
>
> The `webvttFile` parameter isn't required when creating the first iteration. However, starting from the second iteration, you must specify the `webvttFile` parameter in the iteration process. You need to download the webvtt file, make any necessary edits, and then upload it to your Azure Blob Storage. You need to specify the Blob URL in the curl command.
>
> Data created through the API won't appear in Speech Studio, and the data between the API and Speech Studio isn't synchronized.

The subtitle file can be in WebVTT or JSON format. If you're unsure about how to prepare a WebVTT file, refer to the following sample formats.

#### [WebVTT](#tab/webvtt)

```WebVTT

00:00:01.010 --> 00:00:06.030
Hello this is a sample subtitle.

00:00:07.030 --> 00:00:09.030
Hello this is a sample subtitle.

```

#### [WebVTT with JSON Metadata](#tab/webvttwithjsonmetadata)

```WebVTT with JSON Metadata

00:00:01.010 --> 00:00:02.030
{
  "globalMetadata": {
    "locale": "zh-CN",
    "changeTtsVoiceNameIfConflictWithGender": true,
    "speakers": {
      "PlatformVoice_zh-CN-XiaoxiaoNeural": {
        "defaultSsmlProperties": {
          "voiceKind": "PlatformVoice",
          "voiceName": "zh-CN-XiaoxiaoNeural"
        }
      },
      "PlatformVoice_zh-CN-YunxiNeural": {
        "defaultSsmlProperties": {
          "voiceKind": "PlatformVoice",
          "voiceName": "zh-CN-YunxiNeural"
        }
      },
      "PlatformVoice_ja-JP-NanamiNeural": {
        "defaultSsmlProperties": {
          "voiceKind": "PlatformVoice",
          "voiceName": "ja-JP-NanamiNeural"
        }
      }
    }
  },
  "speakerId": "PlatformVoice_zh-CN-XiaoxiaoNeural",
  "translatedText": "中文晓晓"
}

00:00:02.510 --> 00:00:04.030
{
  "gender": "Female",
  "speakerId": "PlatformVoice_ja-JP-NanamiNeural",
  "translatedText": "こんにちは、これはサンプルの字幕です",
  "comment": "Keep using the ja-JP voice."
}

00:00:05.010 --> 00:00:07.030
{
  "gender": "Male",
  "speakerId": "PlatformVoice_ja-JP-NanamiNeural",
  "translatedText": "こんにちは、これはサンプルの字幕です",
  "comment": "Incorrect gender; it will automatically be replaced by en-US auto voice selection."
}

```

---

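If you generate or edit these subtitle files programmatically, a quick sanity check of the cue timing lines can save a failed iteration. The following sketch is illustrative, not a full WebVTT parser; it only validates the `HH:MM:SS.mmm --> HH:MM:SS.mmm` timing format used in the samples above.

```python
import re

# Matches a WebVTT cue timing line like "00:00:01.010 --> 00:00:06.030".
CUE_TIMING = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})$"
)

def cue_span_ms(line):
    """Return (start_ms, end_ms) for a cue timing line, or None if malformed."""
    m = CUE_TIMING.match(line.strip())
    if not m:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in m.groups())
    start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
    end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
    return start, end

span = cue_span_ms("00:00:01.010 --> 00:00:06.030")
```

A cue whose end time isn't after its start time (or whose timing line doesn't parse at all) is worth flagging before you upload the file to Blob Storage.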
You should receive a response body in the following format:

```json
{
  "input": {
    "speakerCount": 1,
    "subtitleMaxCharCountPerSegment": 30,
    "exportSubtitleInVideo": true
  },
  "status": "NotStarted",
  "lastActionDateTime": "2024-09-20T06:31:53.760Z",
  "id": "firstiteration0920",
  "createdDateTime": "2024-09-20T06:31:53.760Z"
}
```

You can use the `operationId` you specified and call the [Get operation by operation ID](#get-operation-by-operation-id) API periodically until the returned status is `Succeeded` or `Failed`. This operation allows you to monitor the progress of creating the iteration.

## Get an iteration by iteration ID

To retrieve details of a specific iteration by its ID, use the HTTP GET request. Replace `[YourResourceKey]` with your Speech resource key, `[YourSpeechRegion]` with your Speech resource region, `[translationId]` with the translation ID you want to check, and `[iterationId]` with the iteration ID you want to check.

```azurecli-interactive
curl -v -X GET -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations/[translationId]/iterations/[iterationId]?api-version=2024-05-20-preview"
```

You should receive a response body in the following format:

```json
{
  "input": {
    "speakerCount": 1,
    "subtitleMaxCharCountPerSegment": 30,
    "exportSubtitleInVideo": true
  },
  "result": {
    "translatedVideoFileUrl": "https://xxx.blob.core.windows.net/container1/video.mp4?sv=2023-01-03&st=2024-05-20T08%3A27%3A15Z&se=2024-05-21T08%3A27%3A15Z&sr=b&sp=r&sig=xxx",
    "sourceLocaleSubtitleWebvttFileUrl": "https://xxx.blob.core.windows.net/container1/sourceLocale.vtt?sv=2023-01-03&st=2024-05-20T08%3A27%3A15Z&se=2024-05-21T08%3A27%3A15Z&sr=b&sp=r&sig=xxx",
    "targetLocaleSubtitleWebvttFileUrl": "https://xxx.blob.core.windows.net/container1/targetLocale.vtt?sv=2023-01-03&st=2024-05-20T08%3A27%3A15Z&se=2024-05-21T08%3A27%3A15Z&sr=b&sp=r&sig=xxx",
    "metadataJsonWebvttFileUrl": "https://xxx.blob.core.windows.net/container1/metadataJsonLocale.vtt?sv=2023-01-03&st=2024-05-20T08%3A27%3A15Z&se=2024-05-21T08%3A27%3A15Z&sr=b&sp=r&sig=xxx"
  },
  "status": "Succeeded",
  "lastActionDateTime": "2024-09-20T06:32:59.933Z",
  "id": "firstiteration0920",
  "createdDateTime": "2024-09-20T06:31:53.760Z"
}
```
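Once an iteration succeeds, the `result` object carries the download URLs. A small helper can collect them before you fetch the files; this sketch is illustrative and simply filters the keys shown in the sample response above.

```python
def iteration_result_urls(iteration):
    """Collect the downloadable result URLs from an iteration response dict.
    Returns an empty dict unless the iteration has succeeded."""
    if iteration.get("status") != "Succeeded":
        return {}
    return {key: url for key, url in iteration.get("result", {}).items()
            if key.endswith("Url")}

# Abbreviated sample response for illustration.
sample = {
    "status": "Succeeded",
    "result": {
        "translatedVideoFileUrl": "https://xxx.blob.core.windows.net/container1/video.mp4?sig=xxx",
        "targetLocaleSubtitleWebvttFileUrl": "https://xxx.blob.core.windows.net/container1/targetLocale.vtt?sig=xxx",
    },
}
urls = iteration_result_urls(sample)
```

Remember that the URLs include short-lived SAS tokens, so download the files promptly.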
## Delete a translation by translation ID

Remove a specific translation identified by `translationId`. This operation also removes all iterations associated with the translation. Replace `[YourResourceKey]` with your Speech resource key, `[YourSpeechRegion]` with your Speech resource region, and `[translationId]` with the translation ID you want to delete.

```azurecli-interactive
curl -v -X DELETE -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations/[translationId]?api-version=2024-05-20-preview"
```

The response headers include `HTTP/1.1 204 No Content` if the delete request was successful.

## Additional information

This section provides curl commands for other API calls that aren't described in detail above. You can explore each API using the following commands.

### List translations

To list all video translations that have been uploaded and processed in your resource account, make an HTTP GET request as shown in the following example. Replace `[YourResourceKey]` with your Speech resource key and replace `[YourSpeechRegion]` with your Speech resource region.

```azurecli-interactive
curl -v -X GET -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations?api-version=2024-05-20-preview"
```

### Get a translation by translation ID

This operation retrieves detailed information about a specific translation, identified by its unique `translationId`. Replace `[YourResourceKey]` with your Speech resource key, `[YourSpeechRegion]` with your Speech resource region, and `[translationId]` with the translation ID you want to check.

```azurecli-interactive
curl -v -X GET -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations/[translationId]?api-version=2024-05-20-preview"
```

### List iterations

List all iterations for a specific translation. This request lists all iterations without detailed information. Replace `[YourResourceKey]` with your Speech resource key, `[YourSpeechRegion]` with your Speech resource region, and `[translationId]` with the translation ID you want to check.

```azurecli-interactive
curl -v -X GET -H "Ocp-Apim-Subscription-Key: [YourResourceKey]" "https://[YourSpeechRegion].api.cognitive.microsoft.com/videotranslation/translations/[translationId]/iterations?api-version=2024-05-20-preview"
```

## HTTP status codes

This section details the HTTP response codes and messages from the video translation REST API.

### HTTP 200 OK

HTTP 200 OK indicates that the request was successful.

### HTTP 204 error

An HTTP 204 error indicates that the request was successful, but the resource doesn't exist. For example:

- You tried to get or delete a translation that doesn't exist.
- You successfully deleted a translation.

### HTTP 400 error

Here are examples that can result in the 400 error:

- The source or target locale you specified isn't among the [supported locales](../../../language-support.md?tabs=speech-translation#video-translation).
- You tried to use an _F0_ Speech resource, but the region only supports the _Standard_ Speech resource pricing tier.

### HTTP 500 error

HTTP 500 Internal Server Error indicates that the request failed. The response body contains the error message.

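When scripting against the API, it can help to classify the status code before parsing the body. The following sketch is illustrative; the classifications simply restate the descriptions above, and the label strings are arbitrary.

```python
def classify_response(status_code):
    """Classify a video translation REST API status code per the notes above."""
    if status_code == 200:
        return "success"
    if status_code == 204:
        return "success-no-content"  # deleted, or the resource doesn't exist
    if status_code == 400:
        return "bad-request"         # check locales and the pricing tier
    if status_code == 500:
        return "server-error"        # inspect the error message in the body
    return "unexpected"

outcome = classify_response(204)
```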
## Related content

- [Video translation overview](../../../video-translation-overview.md)
