Commit f02357a

Merge pull request #281049 from MicrosoftDocs/main
7/17 11:00 AM IST Publish
2 parents 45e626a + a14ffa9 commit f02357a

26 files changed: +339 -221 lines changed

articles/ai-services/speech-service/fast-transcription-create.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ ms.date: 7/12/2024
2121
Fast transcription API is used to transcribe audio files with returning results synchronously and much faster than real-time audio. Use fast transcription in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as:
2222

2323
- Quick audio or video transcription, subtitles, and edit.
24-
- Video dubbing
24+
- Video translation
2525

2626
> [!TIP]
2727
> Try out fast transcription in [Azure AI Studio](https://aka.ms/fasttranscription/studio).
@@ -261,5 +261,6 @@ The response will include `duration`, `channel`, and more. The `combinedPhrases`
261261

262262
## Related content
263263

264-
- [Speech to text quickstart](./get-started-speech-to-text.md)
265-
- [Batch transcription API](./batch-transcription.md)
264+
- [Fast transcription REST API reference](/rest/api/speechtotext/transcriptions/transcribe)
265+
- [Speech to text supported languages](./language-support.md?tabs=stt)
266+
- [Batch transcription](./batch-transcription.md)

articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -797,7 +797,7 @@ Visit the [Audio Content Creation tool](https://speech.microsoft.com/audioconten
797797

798798
#### New features
799799
- Jenny supports a new `newscast` style. See [how to use the speaking styles in SSML](../../speech-synthesis-markup-voice.md#use-speaking-styles-and-roles).
800-
- **Neural voices upgraded to HiFiNet vocoder, with higher audio fidelity and faster synthesis speed**. This benefits customers whose scenario relies on hi-fi audio or long interactions, including video dubbing, audio books, or online education materials. [Read more about the story and hear the voice samples on our tech community blog](https://techcommunity.microsoft.com/t5/azure-ai/azure-neural-tts-upgraded-with-hifinet-achieving-higher-audio/ba-p/1847860)
800+
- **Neural voices upgraded to HiFiNet vocoder, with higher audio fidelity and faster synthesis speed**. This benefits customers whose scenario relies on hi-fi audio or long interactions, including video translation, audio books, or online education materials. [Read more about the story and hear the voice samples on our tech community blog](https://techcommunity.microsoft.com/t5/azure-ai/azure-neural-tts-upgraded-with-hifinet-achieving-higher-audio/ba-p/1847860)
801801
- **[Custom voice](https://speech.microsoft.com/customvoice) & [Audio Content Creation Studio](https://speech.microsoft.com/audiocontentcreation) localized to 17 locales**. Users can easily switch the UI to a local language for a more friendly experience.
802802
- **Audio Content Creation**: Added style degree control for XiaoxiaoNeural; Refined the customized break feature to include incremental breaks of 50ms.
803803

articles/ai-services/speech-service/language-support.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,10 @@ To improve Speech to text recognition accuracy, customization is available for s
4747

4848
These are the locales that support the [display text format feature](./how-to-custom-speech-display-text-format.md): da-DK, de-DE, en-AU, en-CA, en-GB, en-HK, en-IE, en-IN, en-NG, en-NZ, en-PH, en-SG, en-US, es-ES, es-MX, fi-FI, fr-CA, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, sv-SE, tr-TR, zh-CN, zh-HK.
4949

50+
### Fast transcription
51+
52+
The supported locales for the [fast transcription API](fast-transcription-create.md) are: en-US, es-ES, es-MX, fr-FR, hi-IN, it-IT, ja-JP, ko-KR, pt-BR, and zh-CN. You can only specify one locale per transcription request.
53+
5054
# [Text to speech](#tab/tts)
5155

5256
The table in this section summarizes the locales and voices supported for Text to speech. See the table footnotes for more details.
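As an illustration of the fast transcription locale constraint added in this hunk (a fixed list of locales, one locale per request), here is a minimal Python sketch of a client-side check. The helper name `validate_fast_transcription_locale` is hypothetical and not part of any Azure SDK; the locale list is copied from the added paragraph and may change in later revisions of the docs.

```python
# Locales listed for fast transcription in the paragraph added by this commit.
FAST_TRANSCRIPTION_LOCALES = frozenset({
    "en-US", "es-ES", "es-MX", "fr-FR", "hi-IN",
    "it-IT", "ja-JP", "ko-KR", "pt-BR", "zh-CN",
})


def validate_fast_transcription_locale(locale: str) -> str:
    """Hypothetical helper: return the locale if it appears in the documented list.

    Accepting a single string (rather than a list) mirrors the documented
    rule that only one locale can be specified per transcription request.
    """
    if locale not in FAST_TRANSCRIPTION_LOCALES:
        supported = ", ".join(sorted(FAST_TRANSCRIPTION_LOCALES))
        raise ValueError(
            f"Locale {locale!r} is not in the documented list: {supported}"
        )
    return locale
```

A check like this only reflects the documentation at the time of this commit; the service itself remains the authority on which locales it accepts.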

articles/ai-services/speech-service/overview.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -63,7 +63,7 @@ With [real-time speech to text](get-started-speech-to-text.md), the audio is tra
6363
Fast transcription API is used to transcribe audio files with returning results synchronously and much faster than real-time audio. Use fast transcription in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as:
6464

6565
- Quick audio or video transcription, subtitles, and edit.
66-
- Video dubbing
66+
- Video translation
6767

6868
> [!NOTE]
6969
> Fast transcription API is only available via the speech to text REST API version 3.3.

articles/ai-services/speech-service/rest-speech-to-text.md

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -21,17 +21,18 @@ Speech to text REST API is used for [batch transcription](batch-transcription.md
2121
> Speech to text REST API v3.0 will be retired on April 1st, 2026. For more information about upgrading, see the Speech to text REST API [v3.0 to v3.1](migrate-v3-0-to-v3-1.md) and [v3.1 to v3.2](migrate-v3-1-to-v3-2.md) migration guides.
2222
2323
> [!div class="nextstepaction"]
24-
> [See the Speech to text REST API v3.2 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.2&preserve-view=true)
24+
> [See the Speech to text REST API 2024-05-15 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-2024-05-15-preview&preserve-view=true)
2525
2626
> [!div class="nextstepaction"]
27-
> [See the Speech to text REST API v3.1 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.1&preserve-view=true)
27+
> [See the Speech to text REST API v3.2 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.2&preserve-view=true)
2828
2929
> [!div class="nextstepaction"]
30-
> [See the Speech to text REST API v3.0 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.0&preserve-view=true)
30+
> [See the Speech to text REST API v3.1 reference documentation](/rest/api/speechtotext/operation-groups?view=rest-speechtotext-v3.1&preserve-view=true)
3131
3232
Use Speech to text REST API to:
3333

34-
- [Custom speech](custom-speech-overview.md): With custom speech, you can upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint. Copy models to other subscriptions if you want colleagues to have access to a model that you built, or if you want to deploy a model to more than one region.
34+
- [Fast transcription](fast-transcription-create.md): Transcribe audio files with returning results synchronously and much faster than real-time audio. Use the fast transcription API ([/speechtotext/transcriptions:transcribe](/rest/api/speechtotext/transcriptions/transcribe)) in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as quick audio or video transcription or video translation.
35+
- [Custom speech](custom-speech-overview.md): Upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint. Copy models to other subscriptions if you want colleagues to have access to a model that you built, or if you want to deploy a model to more than one region.
3536
- [Batch transcription](batch-transcription.md): Transcribe audio files as a batch from multiple URLs or an Azure container.
3637

3738
Speech to text REST API includes such features as:
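The fast transcription entry added in this hunk names the operation path `/speechtotext/transcriptions:transcribe`. The following Python sketch shows one way a client might assemble the request URL. The host pattern (`{region}.api.cognitive.microsoft.com`) and the `api-version` query parameter name are assumptions not stated in this diff; the default version string is taken from the `2024-05-15-preview` reference elsewhere in this commit. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Operation path as written in the added documentation line.
TRANSCRIBE_PATH = "/speechtotext/transcriptions:transcribe"


def fast_transcription_url(region: str,
                           api_version: str = "2024-05-15-preview") -> str:
    """Hypothetical helper: build the fast transcription request URL.

    The host pattern and the 'api-version' parameter name are assumptions;
    confirm both against the REST API reference before relying on them.
    """
    host = f"https://{region}.api.cognitive.microsoft.com"  # assumed host pattern
    query = urlencode({"api-version": api_version})
    return f"{host}{TRANSCRIBE_PATH}?{query}"
```

The sketch only builds the URL. Authentication headers and the request body are out of scope here and are described in the fast transcription article and the REST reference.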

articles/ai-services/speech-service/speech-synthesis-markup-voice.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -118,7 +118,7 @@ The following table describes each supported `style` attribute:
118118
|`style="customerservice"`|Expresses a friendly and helpful tone for customer support.|
119119
|`style="depressed"`|Expresses a melancholic and despondent tone with lower pitch and energy.|
120120
|`style="disgruntled"`|Expresses a disdainful and complaining tone. Speech of this emotion displays displeasure and contempt.|
121-
|`style="documentary-narration"`|Narrates documentaries in a relaxed, interested, and informative style suitable for dubbing documentaries, expert commentary, and similar content.|
121+
|`style="documentary-narration"`|Narrates documentaries in a relaxed, interested, and informative style suitable for documentaries, expert commentary, and similar content.|
122122
|`style="embarrassed"`|Expresses an uncertain and hesitant tone when the speaker is feeling uncomfortable.|
123123
|`style="empathetic"`|Expresses a sense of caring and understanding.|
124124
|`style="envious"`|Expresses a tone of admiration when you desire something that someone else has.|

articles/ai-services/speech-service/speech-to-text.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,7 @@ Real-time speech to text is available via the [Speech SDK](speech-sdk.md) and th
3636
Fast transcription API is used to transcribe audio files with returning results synchronously and much faster than real-time audio. Use fast transcription in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as:
3737

3838
- Quick audio or video transcription, subtitles, and edit.
39-
- Video dubbing
39+
- Video translation
4040

4141
> [!NOTE]
4242
> Fast transcription API is only available via the speech to text REST API version 2024-05-15-preview and later.

articles/ai-services/speech-service/video-translation-overview.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ ms.custom: references_regions
1818

1919
Video translation is a feature in Azure AI Speech that enables you to seamlessly translate and generate videos in multiple languages automatically. This feature is designed to help you localize your video content to cater to diverse audiences around the globe. You can efficiently create immersive, localized videos across various use cases such as vlogs, education, news, enterprise training, advertising, film, TV shows, and more.
2020

21-
The process of replacing the original language of a video with audio recorded in a different language is often relied upon to cater to diverse audiences. Traditionally achieved through human recording and manual post-production, dubbing is essential for ensuring that viewers can enjoy video content in their native language. However, this process comes with key pain points, including its high cost, lengthy duration, and inability to replicate the original speaker's voice accurately. Video translation in Azure AI Speech addresses these challenges by providing an automated, efficient, and cost-effective solution for creating localized videos.
21+
The process of replacing the original language of a video with audio recorded in a different language is often relied upon to cater to diverse audiences. Traditionally achieved through human recording and manual post-production, translation is essential for ensuring that viewers can enjoy video content in their native language. However, this process comes with key pain points, including its high cost, lengthy duration, and inability to replicate the original speaker's voice accurately. Video translation in Azure AI Speech addresses these challenges by providing an automated, efficient, and cost-effective solution for creating localized videos.
2222

2323
## Use case
2424

@@ -50,9 +50,9 @@ We support video translation between various languages, enabling you to tailor y
5050
- **Translation from language A to B and large language model (LLM) reformulation.**
5151

5252
Translates the transcribed content from the original language (Language A) to the target language (Language B) using advanced language processing techniques. Enhances translation quality and refines gender-aware translated text through LLM reformulation.
53-
- **Automatic dubbing – voice generation in other language.**
53+
- **Automatic translation – voice generation in other language.**
5454

55-
Utilizes AI-powered text-to-speech technology to automatically generate human-like voices in the target language. These voices are precisely synchronized with the video, ensuring a flawless dubbing experience. This includes utilizing prebuilt neural voices for high-quality output and offering options for personal voice.
55+
Utilizes AI-powered text-to-speech technology to automatically generate human-like voices in the target language. These voices are precisely synchronized with the video, ensuring a flawless translation experience. This includes utilizing prebuilt neural voices for high-quality output and offering options for personal voice.
5656
- **Human in the loop for content editing.**
5757

5858
Allows for human intervention to review and edit the translated content, ensuring accuracy and cultural appropriateness before finalizing the dubbed video.

articles/ai-studio/toc.yml

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -22,8 +22,6 @@ items:
2222
href: quickstarts/get-started-code.md
2323
- name: Get started with Assistants and code interpreter in the playground
2424
href: ../ai-services/openai/assistants-quickstart.md?context=/azure/ai-studio/context/context
25-
- name: Hear and speak with chat in the playground
26-
href: quickstarts/hear-speak-playground.md
2725
- name: Moderate text and images with Content Safety
2826
href: quickstarts/content-safety.md
2927
- name: Use Azure AI Studio with a screen reader

articles/aks/TOC.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -943,6 +943,8 @@
943943
href: eks-edw-prepare.md
944944
- name: Deploy to Azure
945945
href: eks-edw-deploy.md
946+
- name: FAQ
947+
href: faq.md
946948
- name: Reference
947949
items:
948950
- name: Azure CLI
@@ -982,8 +984,6 @@
982984
href: https://stackoverflow.com/questions/tagged/azure-container-service
983985
- name: Videos
984986
href: https://azure.microsoft.com/resources/videos/index/?services=container-service&sort=newest
985-
- name: FAQ
986-
href: faq.md
987987
- name: Support and troubleshooting
988988
items:
989989
- name: Support options for AKS
