Update batch-synthesis-avatar.md

sally-baolian · web-flow · commit 0be5fc210971 · 2024-03-12T21:53:48.000+08:00
diff --git a/articles/ai-services/speech-service/text-to-speech-avatar/batch-synthesis-avatar.md b/articles/ai-services/speech-service/text-to-speech-avatar/batch-synthesis-avatar.md
@@ -27,10 +27,10 @@ To perform batch synthesis, you can use the following REST API operations.
 
 | Operation            | Method  | REST API call                                      |
 |----------------------|---------|---------------------------------------------------|
-| [Create batch synthesis](#create-a-batch-synthesis-request) | POST    | texttospeech/3.1-preview1/batchsynthesis/talkingavatar |
-| [Get batch synthesis](#get-batch-synthesis)    | GET     | texttospeech/3.1-preview1/batchsynthesis/talkingavatar/{SynthesisId} |
-| [List batch synthesis](#list-batch-synthesis)   | GET     | texttospeech/3.1-preview1/batchsynthesis/talkingavatar |
-| [Delete batch synthesis](#delete-batch-synthesis) | DELETE  | texttospeech/3.1-preview1/batchsynthesis/talkingavatar/{SynthesisId} |
+| [Create batch synthesis](#create-a-batch-synthesis-request) | PUT    | avatar/batchsyntheses/{SynthesisId}?api-version=2024-04-01-preview |
+| [Get batch synthesis](#get-batch-synthesis)    | GET     | avatar/batchsyntheses/{SynthesisId}?api-version=2024-04-01-preview |
+| [List batch synthesis](#list-batch-synthesis)   | GET     | avatar/batchsyntheses/?api-version=2024-04-01-preview |
+| [Delete batch synthesis](#delete-batch-synthesis) | DELETE  | avatar/batchsyntheses/{SynthesisId}?api-version=2024-04-01-preview |
 
 You can refer to the code samples on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch-avatar).
 
@@ -40,9 +40,9 @@ Some properties in JSON format are required when you create a new batch synthesi
 
 To submit a batch synthesis request, construct the HTTP POST request body following these instructions:
 
-- Set the required `textType` property.
-- If the `textType` property is set to `PlainText`, you must also set the `voice` property in the `synthesisConfig`. In the example below, the `textType` is set to `SSML`, so the `speechSynthesis` isn't set.
-- Set the required `displayName` property. Choose a name for reference, and it doesn't have to be unique.
+- Set the required `inputKind` property.
+- If the `inputKind` property is set to `PlainText`, you must also set the `voice` property in the `synthesisConfig`. In the example below, the `inputKind` is set to `SSML`, so the `synthesisConfig` isn't set.
+- Set the required `SynthesisId` in the request URI. The `SynthesisId` must be unique within the same Speech resource. It can be a string of 3 to 64 characters, including letters, numbers, '-', or '_', and it must start and end with a letter or number.
 - Set the required `talkingAvatarCharacter` and `talkingAvatarStyle` properties. You can find supported avatar characters and styles [here](./avatar-gestures-with-ssml.md#supported-pre-built-avatar-characters-styles-and-gestures).
 - Optionally, you can set the `videoFormat`, `backgroundColor`, and other properties. For more information, see [batch synthesis properties](batch-synthesis-avatar-properties.md).
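
The `SynthesisId` naming rule in the list above can be checked locally before a request is sent. The following is a minimal sketch, not part of the service or its SDK: the function name `is_valid_synthesis_id` is hypothetical, and it assumes "letters" and "numbers" mean ASCII characters, which the article doesn't state explicitly.

```python
import re

# Assumption: "letters" and "numbers" are ASCII only.
# First and last characters must be alphanumeric; the middle may also
# contain '-' or '_'. One leading char + 1..62 middle chars + one trailing
# char gives a total length of 3 to 64.
_SYNTHESIS_ID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{1,62}[A-Za-z0-9]")


def is_valid_synthesis_id(synthesis_id: str) -> bool:
    """Return True if the ID satisfies the documented length and character rules."""
    return _SYNTHESIS_ID_PATTERN.fullmatch(synthesis_id) is not None
```

A check like this only catches malformed IDs. Uniqueness within the Speech resource can still only be confirmed by the service.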
 
@@ -53,47 +53,46 @@ To submit a batch synthesis request, construct the HTTP POST request body follow
 >
 > The maximum length for the output video is currently 20 minutes, with potential increases in the future.
 
-To make an HTTP POST request, use the URI format shown in the following example. Replace `YourSpeechKey` with your Speech resource key, `YourSpeechRegion` with your Speech resource region, and set the request body properties as described above.
+To make an HTTP PUT request, use the URI format shown in the following example. Replace `YourSpeechKey` with your Speech resource key, `YourSpeechRegion` with your Speech resource region, and set the request body properties as described above.
 
 ```azurecli-interactive
-curl -v -X POST -H "Ocp-Apim-Subscription-Key: YourSpeechKey" -H "Content-Type: application/json" -d '{
-    "displayName": "avatar batch synthesis sample",
-    "textType": "SSML",
+curl -v -X PUT -H "Ocp-Apim-Subscription-Key: YourSpeechKey" -H "Content-Type: application/json" -d '{
+    "inputKind": "SSML",
     "inputs": [
         {
-         "text": "<speak version='\''1.0'\'' xml:lang='\''en-US'\''>
-                <voice name='\''en-US-JennyNeural'\''>
-                    The rainbow has seven colors.
-                </voice>
-            </speak>"
+         "content": "<speak version='\''1.0'\'' xml:lang='\''en-US'\''><voice name='\''en-US-JennyNeural'\''>The rainbow has seven colors.</voice></speak>"
         }
     ],
-    "properties": {
+    "avatarConfig": {
         "talkingAvatarCharacter": "lisa",
         "talkingAvatarStyle": "graceful-sitting"
     }
-}'  "https://YourSpeechRegion.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/talkingavatar"
+}'  "https://YourSpeechRegion.api.cognitive.microsoft.com/avatar/batchsyntheses/my-job-01?api-version=2024-04-01-preview"
 ```
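
For callers who build the request in code rather than with curl, the URI and body from the example above might be assembled as follows. This is a sketch under stated assumptions: `build_create_request` is a hypothetical helper, it only constructs the URL and JSON payload, and it doesn't send anything. The endpoint shape, `api-version`, and property names are copied from the curl example.

```python
import json
from urllib.parse import quote

API_VERSION = "2024-04-01-preview"  # taken from the curl example


def build_create_request(region, synthesis_id, ssml, character, style):
    """Return (url, json_body) for the PUT request that creates a batch synthesis."""
    url = (
        f"https://{region}.api.cognitive.microsoft.com/avatar/batchsyntheses/"
        f"{quote(synthesis_id, safe='')}?api-version={API_VERSION}"
    )
    body = {
        "inputKind": "SSML",
        "inputs": [{"content": ssml}],
        "avatarConfig": {
            "talkingAvatarCharacter": character,
            "talkingAvatarStyle": style,
        },
    }
    # json.dumps escapes the SSML correctly, so no manual quote escaping
    # is needed as it is in the shell example.
    return url, json.dumps(body)
```

The returned body would be sent with the `Ocp-Apim-Subscription-Key` and `Content-Type: application/json` headers shown in the curl command.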
 
 You should receive a response body in the following format:
 
 ```json
 {
-    "textType": "SSML",
+    "id": "my-job-01",
+    "internalId": "5a25b929-1358-4e81-a036-33000e788c46",
+    "status": "NotStarted",
+    "createdDateTime": "2024-03-06T07:34:08.9487009Z",
+    "lastActionDateTime": "2024-03-06T07:34:08.9487012Z",
+    "inputKind": "SSML",
     "customVoices": {},
     "properties": {
-        "timeToLive": "P31D",
-        "outputFormat": "riff-24khz-16bit-mono-pcm",
+        "timeToLiveInHours": 744
+    },
+    "avatarConfig": {
         "talkingAvatarCharacter": "lisa",
         "talkingAvatarStyle": "graceful-sitting",
-        "kBitrate": 2000,
+        "videoFormat": "Mp4",
+        "videoCodec": "hevc",
+        "subtitleType": "soft_embedded",
+        "bitrateKbps": 2000,
         "customized": false
-    },
-    "lastActionDateTime": "2023-10-19T12:23:03.348Z",
-    "status": "NotStarted",
-    "id": "c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6",
-    "createdDateTime": "2023-10-19T12:23:03.348Z",
-    "displayName": "avatar batch synthesis sample"
+    }
 }
 ```
 
@@ -107,40 +106,45 @@ To retrieve the status of a batch synthesis job, make an HTTP GET request using
 Replace `YourSynthesisId` with your batch synthesis ID, `YourSpeechKey` with your Speech resource key, and `YourSpeechRegion` with your Speech resource region.
 
 ```azurecli-interactive
-curl -v -X GET "https://YourSpeechRegion.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/talkingavatar/YourSynthesisId" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
+curl -v -X GET "https://YourSpeechRegion.api.cognitive.microsoft.com/avatar/batchsyntheses/YourSynthesisId?api-version=2024-04-01-preview" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
 ```
 
 You should receive a response body in the following format:
 
 ```json
 {
-    "textType": "SSML",
+    "id": "my-job-01",
+    "internalId": "5a25b929-1358-4e81-a036-33000e788c46",
+    "status": "Succeeded",
+    "createdDateTime": "2024-03-06T07:34:08.9487009Z",
+    "lastActionDateTime": "2024-03-06T07:34:12.5698769",
+    "inputKind": "SSML",
     "customVoices": {},
     "properties": {
-        "audioSize": 336780,
-        "durationInTicks": 25200000,
-        "succeededAudioCount": 1,
-        "duration": "PT2.52S",
+        "timeToLiveInHours": 744,
+        "sizeInBytes": 344460,
+        "durationInMilliseconds": 2520,
+        "succeededCount": 1,
+        "failedCount": 0,
         "billingDetails": {
+            "neural": 29,
             "customNeural": 0,
-            "neural": 29
-        },
-        "timeToLive": "P31D",
-        "outputFormat": "riff-24khz-16bit-mono-pcm",
+            "talkingAvatarDurationInSeconds": 2
+        }
+    },
+    "avatarConfig": {
         "talkingAvatarCharacter": "lisa",
         "talkingAvatarStyle": "graceful-sitting",
-        "kBitrate": 2000,
+        "videoFormat": "Mp4",
+        "videoCodec": "hevc",
+        "subtitleType": "soft_embedded",
+        "bitrateKbps": 2000,
         "customized": false
     },
     "outputs": {
-        "result": "https://cvoiceprodwus2.blob.core.windows.net/batch-synthesis-output/c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6/0001.mp4?SAS_Token",
-        "summary": "https://cvoiceprodwus2.blob.core.windows.net/batch-synthesis-output/c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6/summary.json?SAS_Token"
-    },
-    "lastActionDateTime": "2023-10-19T12:23:06.320Z",
-    "status": "Succeeded",
-    "id": "c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6",
-    "createdDateTime": "2023-10-19T12:23:03.350Z",
-    "displayName": "avatar batch synthesis sample"
+        "result": "https://stttssvcprodusw2.blob.core.windows.net/batchsynthesis-output/244a87c294b94ddeb3dbaccee8ffa7eb/5a25b929-1358-4e81-a036-33000e788c46/0001.mp4?SAS_Token",
+        "summary": "https://stttssvcprodusw2.blob.core.windows.net/batchsynthesis-output/244a87c294b94ddeb3dbaccee8ffa7eb/5a25b929-1358-4e81-a036-33000e788c46/summary.json?SAS_Token"
+    }
 }
 ```
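
Because a job starts as `NotStarted` and later reaches `Succeeded` or `Failed`, a client typically repeats the GET request until the status is terminal. The loop below is a minimal sketch of that pattern. `wait_for_synthesis` and the `fetch_job` callable are hypothetical names: `fetch_job` stands in for whatever code performs the GET request and returns the parsed JSON body. The poll interval and attempt limit are arbitrary illustrative values, not service recommendations.

```python
import time

# The two terminal statuses named in this article.
TERMINAL_STATUSES = {"Succeeded", "Failed"}


def wait_for_synthesis(fetch_job, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch_job() until the job reaches a terminal status, then return it."""
    for attempt in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Don't sleep after the final attempt; just fall through to the error.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job did not finish after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays.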
 
@@ -151,86 +155,93 @@ From the `outputs.result` field, you can download a video file containing the av
 
 To list all batch synthesis jobs for your Speech resource, make an HTTP GET request using the URI as shown in the following example.
 
-Replace `YourSpeechKey` with your Speech resource key and `YourSpeechRegion` with your Speech resource region. Optionally, you can set the `skip` and `top` (page size) query parameters in the URL. The default value for `skip` is 0, and the default value for `top` is 100.
+Replace `YourSpeechKey` with your Speech resource key and `YourSpeechRegion` with your Speech resource region. Optionally, you can set the `skip` and `maxpagesize` (page size) query parameters in the URL. The default value for `skip` is 0, and the default value for `maxpagesize` is 100.
 
 ```azurecli-interactive
-curl -v -X GET "https://YourSpeechRegion.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/talkingavatar?skip=0&top=2" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
+curl -v -X GET "https://YourSpeechRegion.api.cognitive.microsoft.com/avatar/batchsyntheses?skip=0&maxpagesize=2&api-version=2024-04-01-preview" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
 ```
 
 You receive a response body in the following format:
 
 ```json
 {
-    "values": [
+    "value": [
         {
-            "textType": "PlainText",
-            "synthesisConfig": {
-                "voice": "en-US-JennyNeural"
-            },
+            "id": "my-job-02",
+            "internalId": "14c25fcf-3cb6-4f46-8810-ecad06d956df",
+            "status": "Succeeded",
+            "createdDateTime": "2024-03-06T07:52:23.9054709Z",
+            "lastActionDateTime": "2024-03-06T07:52:29.3416944",
+            "inputKind": "SSML",
             "customVoices": {},
             "properties": {
-                "audioSize": 339371,
-                "durationInTicks": 25200000,
-                "succeededAudioCount": 1,
-                "duration": "PT2.52S",
+                "timeToLiveInHours": 744,
+                "sizeInBytes": 502676,
+                "durationInMilliseconds": 2950,
+                "succeededCount": 1,
+                "failedCount": 0,
                 "billingDetails": {
+                    "neural": 32,
                     "customNeural": 0,
-                    "neural": 29
-                },
-                "timeToLive": "P31D",
-                "outputFormat": "riff-24khz-16bit-mono-pcm",
+                    "talkingAvatarDurationInSeconds": 2
+                }
+            },
+            "avatarConfig": {
                 "talkingAvatarCharacter": "lisa",
-                "talkingAvatarStyle": "graceful-sitting",
-                "kBitrate": 2000,
+                "talkingAvatarStyle": "casual-sitting",
+                "videoFormat": "Mp4",
+                "videoCodec": "h264",
+                "subtitleType": "soft_embedded",
+                "bitrateKbps": 2000,
                 "customized": false
             },
             "outputs": {
-                "result": "https://cvoiceprodwus2.blob.core.windows.net/batch-synthesis-output/8e3fea5f-4021-4734-8c24-77d3be594633/0001.mp4?SAS_Token",
-                "summary": "https://cvoiceprodwus2.blob.core.windows.net/batch-synthesis-output/8e3fea5f-4021-4734-8c24-77d3be594633/summary.json?SAS_Token"
-            },
-            "lastActionDateTime": "2023-10-19T12:57:45.557Z",
-            "status": "Succeeded",
-            "id": "8e3fea5f-4021-4734-8c24-77d3be594633",
-            "createdDateTime": "2023-10-19T12:57:42.343Z",
-            "displayName": "avatar batch synthesis sample"
+                "result": "https://stttssvcprodusw2.blob.core.windows.net/batchsynthesis-output/244a87c294b94ddeb3dbaccee8ffa7eb/14c25fcf-3cb6-4f46-8810-ecad06d956df/0001.mp4?SAS_Token",
+                "summary": "https://stttssvcprodusw2.blob.core.windows.net/batchsynthesis-output/244a87c294b94ddeb3dbaccee8ffa7eb/14c25fcf-3cb6-4f46-8810-ecad06d956df/summary.json?SAS_Token"
+            }
         },
         {
-            "textType": "SSML",
+            "id": "my-job-01",
+            "internalId": "5a25b929-1358-4e81-a036-33000e788c46",
+            "status": "Succeeded",
+            "createdDateTime": "2024-03-06T07:34:08.9487009Z",
+            "lastActionDateTime": "2024-03-06T07:34:12.5698769",
+            "inputKind": "SSML",
             "customVoices": {},
             "properties": {
-                "audioSize": 336780,
-                "durationInTicks": 25200000,
-                "succeededAudioCount": 1,
-                "duration": "PT2.52S",
+                "timeToLiveInHours": 744,
+                "sizeInBytes": 344460,
+                "durationInMilliseconds": 2520,
+                "succeededCount": 1,
+                "failedCount": 0,
                 "billingDetails": {
+                    "neural": 29,
                     "customNeural": 0,
-                    "neural": 29
-                },
-                "timeToLive": "P31D",
-                "outputFormat": "riff-24khz-16bit-mono-pcm",
+                    "talkingAvatarDurationInSeconds": 2
+                }
+            },
+            "avatarConfig": {
                 "talkingAvatarCharacter": "lisa",
                 "talkingAvatarStyle": "graceful-sitting",
-                "kBitrate": 2000,
+                "videoFormat": "Mp4",
+                "videoCodec": "hevc",
+                "subtitleType": "soft_embedded",
+                "bitrateKbps": 2000,
                 "customized": false
             },
             "outputs": {
-                "result": "https://cvoiceprodwus2.blob.core.windows.net/batch-synthesis-output/c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6/0001.mp4?SAS_Token",
-                "summary": "https://cvoiceprodwus2.blob.core.windows.net/batch-synthesis-output/c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6/summary.json?SAS_Token"
-            },
-            "lastActionDateTime": "2023-10-19T12:23:06.320Z",
-            "status": "Succeeded",
-            "id": "c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6",
-            "createdDateTime": "2023-10-19T12:23:03.350Z",
-            "displayName": "avatar batch synthesis sample"
+                "result": "https://stttssvcprodusw2.blob.core.windows.net/batchsynthesis-output/244a87c294b94ddeb3dbaccee8ffa7eb/5a25b929-1358-4e81-a036-33000e788c46/0001.mp4?SAS_Token",
+                "summary": "https://stttssvcprodusw2.blob.core.windows.net/batchsynthesis-output/244a87c294b94ddeb3dbaccee8ffa7eb/5a25b929-1358-4e81-a036-33000e788c46/summary.json?SAS_Token"
+            }
         }
     ],
-    "@nextLink": "https://{region}.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/talkingavatar?skip=2&top=2"
+    "nextLink": "https://YourSpeechRegion.api.cognitive.microsoft.com/avatar/batchsyntheses/?api-version=2024-04-01-preview&skip=2&maxpagesize=2"
 }
 ```
 
 From `outputs.result`, you can download a video file containing the avatar video. From `outputs.summary`, you can access the summary and debug details. For more information, see [batch synthesis results](#get-batch-synthesis-results-file).
 
-The `values` property in the JSON response lists your synthesis requests. The list is paginated, with a maximum page size of 100. The `@nextLink` property is provided as needed to get the next page of the paginated list.
+The `value` property in the JSON response lists your synthesis requests. The list is paginated, with a maximum page size of 100. The `nextLink` property is provided as needed to get the next page of the paginated list.
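
Following `nextLink` until it is absent yields every job across all pages. The generator below sketches that loop. `iter_syntheses` and `fetch_json` are hypothetical names: `fetch_json` is assumed to issue the authenticated GET request for a URL and return the parsed JSON body.

```python
from typing import Callable, Iterator, Optional


def iter_syntheses(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield each job from the paginated list, following nextLink between pages."""
    url: Optional[str] = first_url
    while url:
        page = fetch_json(url)
        # Each page carries its jobs in the "value" array.
        yield from page.get("value", [])
        # nextLink is only present when another page exists.
        url = page.get("nextLink")
```

For example, `[job["id"] for job in iter_syntheses(list_url, fetch_json)]` would collect the IDs from every page, where `list_url` is the list URI shown in the curl command.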
 
 ## Get batch synthesis results file
 
@@ -252,34 +263,34 @@ The summary file contains the synthesis results for each text input. Here's an e
 
 ```json
 {
-  "jobID":  "c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6",
-  "status":  "Succeeded",
-  "results":  [
+  "jobID": "5a25b929-1358-4e81-a036-33000e788c46",
+  "status": "Succeeded",
+  "results": [
     {
-      "texts":  [
-        "<speak version='1.0' xml:lang='en-US'>\n\t\t\t\t<voice name='en-US-JennyNeural'>\n\t\t\t\t\tThe rainbow has seven colors.\n\t\t\t\t</voice>\n\t\t\t</speak>"
+      "texts": [
+        "<speak version='1.0' xml:lang='en-US'><voice name='en-US-JennyNeural'>The rainbow has seven colors.</voice></speak>"
       ],
-      "status":  "Succeeded",
-      "billingDetails":  {
-        "Neural":  "29",
-        "TalkingAvatarDuration":  "2"
+      "status": "Succeeded",
+      "billingDetails": {
+        "Neural": "29",
+        "TalkingAvatarDuration": "2"
       },
-      "videoFileName":  "c48b4cf5-957f-4a0f-96af-a4e3e71bd6b6/0001.mp4",
-      "TalkingAvatarCharacter":  "lisa",
-      "TalkingAvatarStyle":  "graceful-sitting"
+      "videoFileName": "244a87c294b94ddeb3dbaccee8ffa7eb/5a25b929-1358-4e81-a036-33000e788c46/0001.mp4",
+      "TalkingAvatarCharacter": "lisa",
+      "TalkingAvatarStyle": "graceful-sitting"
     }
   ]
 }
 ```
 
 ## Delete batch synthesis
 
-After you have retrieved the audio output results and no longer need the batch synthesis job history, you can delete it. The Speech service retains each synthesis history for up to 31 days or the duration specified by the request's `timeToLive` property, whichever comes sooner. The date and time of automatic deletion, for synthesis jobs with a status of "Succeeded" or "Failed" is calculated as the sum of the `lastActionDateTime` and `timeToLive` properties.
+After you have retrieved the video output results and no longer need the batch synthesis job history, you can delete it. The Speech service retains each synthesis history for up to 31 days or the duration specified by the request's `timeToLiveInHours` property, whichever comes sooner. For synthesis jobs with a status of "Succeeded" or "Failed", the date and time of automatic deletion is calculated as the sum of the `lastActionDateTime` and `timeToLiveInHours` properties.
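
The automatic deletion time described above can be estimated from a job's JSON. The sketch below is illustrative only. The helper names are hypothetical, and it assumes timestamps are UTC even when the trailing `Z` is missing, as in some of the sample responses. Fractional seconds are truncated to microseconds because the samples use seven digits.

```python
from datetime import datetime, timedelta, timezone


def parse_service_timestamp(value: str) -> datetime:
    """Parse timestamps such as 2024-03-06T07:34:12.5698769 or ...9487009Z as UTC."""
    text = value.rstrip("Z")
    if "." in text:
        whole, frac = text.split(".", 1)
        frac = frac[:6].ljust(6, "0")  # keep at most microsecond precision
    else:
        whole, frac = text, "000000"
    parsed = datetime.strptime(f"{whole}.{frac}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)  # assumption: service times are UTC


def expected_deletion_time(job: dict) -> datetime:
    """Return lastActionDateTime plus timeToLiveInHours for a finished job."""
    last_action = parse_service_timestamp(job["lastActionDateTime"])
    ttl_hours = job["properties"]["timeToLiveInHours"]
    return last_action + timedelta(hours=ttl_hours)
```

With the sample job above (`timeToLiveInHours` of 744, which is 31 days), the result falls 31 days after the last action time.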
 
 To delete a batch synthesis job, make an HTTP DELETE request using the following URI format. Replace `YourSynthesisId` with your batch synthesis ID, `YourSpeechKey` with your Speech resource key, and `YourSpeechRegion` with your Speech resource region.
 
 ```azurecli-interactive
-curl -v -X DELETE "https://YourSpeechRegion.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/talkingavatar/YourSynthesisId" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
+curl -v -X DELETE "https://YourSpeechRegion.api.cognitive.microsoft.com/avatar/batchsyntheses/YourSynthesisId?api-version=2024-04-01-preview" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
 ```
 
 The response headers include `HTTP/1.1 204 No Content` if the delete request was successful.