Commit a3d32cf

Merge pull request #218695 from eric-urban/eur/ssml-silence
batch synthesis preview
2 parents 97776d8 + c00c6cc commit a3d32cf

22 files changed, +574 -513 lines changed

articles/cognitive-services/.openpublishing.redirection.cognitive-services.json

Lines changed: 5 additions & 0 deletions
@@ -4990,6 +4990,11 @@
 "redirect_url": "/azure/cognitive-services/speech-service/migrate-v3-0-to-v3-1",
 "redirect_document_id": true
 },
+{
+"source_path_from_root": "/articles/cognitive-services/speech-service/long-audio-api.md",
+"redirect_url": "/azure/cognitive-services/speech-service/batch-synthesis",
+"redirect_document_id": false
+},
 {
 "source_path_from_root": "/articles/cognitive-services/text-analytics/concepts/data-limits.md",
 "redirect_url": "/azure/cognitive-services/language-service/overview",

articles/cognitive-services/Speech-Service/batch-synthesis.md

Lines changed: 443 additions & 0 deletions
Large diffs are not rendered by default.

articles/cognitive-services/Speech-Service/batch-transcription-create.md

Lines changed: 1 addition & 2 deletions
@@ -100,7 +100,6 @@ curl -v -X POST -H "Ocp-Apim-Subscription-Key: YourSubscriptionKey" -H "Content-
 }' "https://YourServiceRegion.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions"
 ```
 
-
 You should receive a response body in the following format:
 
 ```json
@@ -166,7 +165,7 @@ Here are some property options that you can use to configure a transcription whe
 |`model`|You can set the `model` property to use a specific base model or [Custom Speech](how-to-custom-speech-train-model.md) model. If you don't specify the `model`, the default base model for the locale is used. For more information, see [Using custom models](#using-custom-models).|
 |`profanityFilterMode`|Specifies how to handle profanity in recognition results. Accepted values are `None` to disable profanity filtering, `Masked` to replace profanity with asterisks, `Removed` to remove all profanity from the result, or `Tags` to add profanity tags. The default value is `Masked`. |
 |`punctuationMode`|Specifies how to handle punctuation in recognition results. Accepted values are `None` to disable punctuation, `Dictated` to imply explicit (spoken) punctuation, `Automatic` to let the decoder deal with punctuation, or `DictatedAndAutomatic` to use dictated and automatic punctuation. The default value is `DictatedAndAutomatic`.|
-|`timeToLive`|A duration after the transcription job is created, when the transcription results will be automatically deleted. For example, specify `PT12H` for 12 hours. As an alternative, you can call [DeleteTranscription](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-0/operations/DeleteTranscription) regularly after you retrieve the transcription results.|
+|`timeToLive`|A duration after the transcription job is created, when the transcription results will be automatically deleted. The value is an ISO 8601 encoded duration. For example, specify `PT12H` for 12 hours. As an alternative, you can call [DeleteTranscription](https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-0/operations/DeleteTranscription) regularly after you retrieve the transcription results.|
 |`wordLevelTimestampsEnabled`|Specifies if word level timestamps should be included in the output. The default value is `false`.|
 

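The updated `timeToLive` row says the value is an ISO 8601 encoded duration such as `PT12H`. As a rough illustration, the following Python sketch builds such a string from a `timedelta`. The helper name `to_iso8601_duration` is hypothetical and not part of any Speech SDK, it only emits days, hours, minutes, and whole seconds, and whether the service accepts every form it can produce (for example a day component) is an assumption to check against the API reference.

```python
from datetime import timedelta


def to_iso8601_duration(delta: timedelta) -> str:
    """Format a non-negative timedelta as an ISO 8601 duration (hypothetical helper).

    Only days, hours, minutes, and whole seconds are emitted; fractional
    seconds are truncated.
    """
    total_seconds = int(delta.total_seconds())
    if total_seconds < 0:
        raise ValueError("negative durations are not supported")

    days, remainder = divmod(total_seconds, 86_400)
    hours, remainder = divmod(remainder, 3_600)
    minutes, seconds = divmod(remainder, 60)

    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )

    if not date_part and not time_part:
        return "PT0S"  # zero-length duration
    return "P" + date_part + ("T" + time_part if time_part else "")


# The 12-hour example from the table row above.
print(to_iso8601_duration(timedelta(hours=12)))  # PT12H
```
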
articles/cognitive-services/Speech-Service/batch-transcription-get.md

Lines changed: 3 additions & 3 deletions
@@ -345,20 +345,20 @@ Depending in part on the request parameters set when you created the transcripti
 |`confidence`|The confidence value for the recognition.|
 |`display`|The display form of the recognized text. Added punctuation and capitalization are included.|
 |`displayPhraseElements`|A list of results with display text for each word of the phrase. The `displayFormWordLevelTimestampsEnabled` request property must be set to `true`, otherwise this property is not present.<br/><br/>**Note**: This property is only available with speech-to-text REST API version 3.1.|
-|`duration`|The audio duration, ISO 8601 encoded duration.|
+|`duration`|The audio duration. The value is an ISO 8601 encoded duration.|
 |`durationInTicks`|The audio duration in ticks (1 tick is 100 nanoseconds).|
 |`itn`|The inverse text normalized (ITN) form of the recognized text. Abbreviations such as "Doctor Smith" to "Dr Smith", phone numbers, and other transformations are applied.|
 |`lexical`|The actual words recognized.|
 |`locale`|The locale identified from the input the audio. The `languageIdentification` request property must be set to `true`, otherwise this property is not present.<br/><br/>**Note**: This property is only available with speech-to-text REST API version 3.1.|
 |`maskedITN`|The ITN form with profanity masking applied.|
 |`nBest`|A list of possible transcriptions for the current phrase with confidences.|
-|`offset`|The offset in audio of this phrase, ISO 8601 encoded duration.|
+|`offset`|The offset in audio of this phrase. The value is an ISO 8601 encoded duration.|
 |`offsetInTicks`|The offset in audio of this phrase in ticks (1 tick is 100 nanoseconds).|
 |`recognitionStatus`|The recognition state. For example: "Success" or "Failure".|
 |`recognizedPhrases`|The list of results for each phrase.|
 |`source`|The URL that was provided as the input audio source. The source corresponds to the `contentUrls` or `contentContainerUrl` request property. The `source` property is the only way to confirm the audio input for a transcription.|
 |`speaker`|The identified speaker. The `diarization` and `diarizationEnabled` request properties must be set, otherwise this property is not present.|
-|`timestamp`|The creation time of the transcription, ISO 8601 encoded timestamp, combined date and time.|
+|`timestamp`|The creation date and time of the transcription. The value is an ISO 8601 encoded timestamp.|
 |`words`|A list of results with lexical text for each word of the phrase. The `wordLevelTimestampsEnabled` request property must be set to `true`, otherwise this property is not present.|
 

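The table in this hunk describes the same quantities in two encodings: `duration` and `offset` as ISO 8601 durations, and `durationInTicks` and `offsetInTicks` as ticks of 100 nanoseconds. A minimal Python sketch of how the two might be compared is shown here. The function names are hypothetical, the parser handles only a simple day/hour/minute/second subset of ISO 8601, and the sample values are made up for illustration rather than taken from a real transcription result.

```python
import re
from datetime import timedelta

# Simple subset of ISO 8601 durations: PnDTnHnMnS, with optional fractional seconds.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def ticks_to_timedelta(ticks: int) -> timedelta:
    """Convert 100-nanosecond ticks to a timedelta (hypothetical helper).

    timedelta has microsecond resolution, so the last decimal digit of the
    tick count is dropped (10 ticks = 1 microsecond).
    """
    return timedelta(microseconds=ticks // 10)


def parse_duration(value: str) -> timedelta:
    """Parse a simple ISO 8601 duration such as 'PT12H' or 'PT1.5S' (hypothetical helper)."""
    match = _DURATION_RE.match(value)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported ISO 8601 duration: {value!r}")
    parts = {name: float(text) for name, text in match.groupdict().items() if text is not None}
    return timedelta(**parts)


# Illustrative values only: 15,000,000 ticks is 1.5 seconds.
assert ticks_to_timedelta(15_000_000) == parse_duration("PT1.5S")
```
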
articles/cognitive-services/Speech-Service/how-to-audio-content-creation.md

Lines changed: 2 additions & 2 deletions
@@ -215,5 +215,5 @@ If you want to allow a user to grant access to other users, you need to assign t
 
 ## Next steps
 
-* [Long Audio API](./long-audio-api.md)
-
+- [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md)
+- [Batch synthesis](batch-synthesis.md)

articles/cognitive-services/Speech-Service/how-to-custom-voice-create-voice.md

Lines changed: 0 additions & 1 deletion
@@ -227,4 +227,3 @@ Navigate to the project where you copied the model to [deploy the model copy](ho
 - [Deploy and use your voice model](how-to-deploy-and-use-endpoint.md)
 - [How to record voice samples](record-custom-voice-samples.md)
 - [Text-to-Speech API reference](rest-text-to-speech.md)
-- [Long Audio API](long-audio-api.md)

articles/cognitive-services/Speech-Service/how-to-deploy-and-use-endpoint.md

Lines changed: 1 addition & 1 deletion
@@ -318,4 +318,4 @@ The HTTP status code for each response indicates success or common errors.
 
 - [How to record voice samples](record-custom-voice-samples.md)
 - [Text-to-Speech API reference](rest-text-to-speech.md)
-- [Long Audio API](long-audio-api.md)
+- [Batch synthesis](batch-synthesis.md)

articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/cpp.md

Lines changed: 1 addition & 1 deletion
@@ -139,7 +139,7 @@ Now that you've completed the quickstart, here are some additional consideration
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
 - For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
-- For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md).
+- For information about synthesizing long-form text to speech, see [batch synthesis](~/articles/cognitive-services/speech-service/batch-synthesis.md).
 
 ## Clean up resources

articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/csharp.md

Lines changed: 1 addition & 1 deletion
@@ -129,7 +129,7 @@ Now that you've completed the quickstart, here are some additional consideration
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
 - For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
-- For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md).
+- For information about synthesizing long-form text to speech, see [batch synthesis](~/articles/cognitive-services/speech-service/batch-synthesis.md).
 
 ## Clean up resources

articles/cognitive-services/Speech-Service/includes/quickstarts/text-to-speech-basics/java.md

Lines changed: 1 addition & 1 deletion
@@ -155,7 +155,7 @@ Now that you've completed the quickstart, here are some additional consideration
 
 This quickstart uses the `SpeakTextAsync` operation to synthesize a short block of text that you enter. You can also get text from files as described in these guides:
 - For information about speech synthesis from a file and finer control over voice styles, prosody, and other settings, see [How to synthesize speech](~/articles/cognitive-services/speech-service/how-to-speech-synthesis.md) and [Improve synthesis with Speech Synthesis Markup Language (SSML)](~/articles/cognitive-services/speech-service/speech-synthesis-markup.md).
-- For information about batch synthesis, see [Synthesize long-form text to speech](~/articles/cognitive-services/speech-service/long-audio-api.md).
+- For information about synthesizing long-form text to speech, see [batch synthesis](~/articles/cognitive-services/speech-service/batch-synthesis.md).
 
 ## Clean up resources
