Skip to content

Commit bc5af01

Browse files
authored
Merge pull request #208650 from eric-urban/eur/combine-locales
consolidate locale tables
2 parents ba542a3 + 99dc948 commit bc5af01

File tree

75 files changed

+640
-1353
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

75 files changed

+640
-1353
lines changed

articles/cognitive-services/Speech-Service/call-center-transcription.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -113,7 +113,7 @@ Internally, Microsoft uses these technologies to support Microsoft customer call
113113

114114
Some businesses are required to transcribe conversations in real time. You can use real-time transcription to identify keywords and trigger searches for content and resources that are relevant to the conversation, to monitor sentiment, to improve accessibility, or to provide translations for customers and agents who aren't native speakers.
115115

116-
For scenarios that require real-time transcription, we recommend using the [Speech SDK](speech-sdk.md). Currently, speech-to-text is available in [more than 20 languages](language-support.md), and the SDK is available in C++, C#, Java, Python, JavaScript, Objective-C, and Go. Samples are available in each language on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk). For the latest news and updates, see [Release notes](releasenotes.md).
116+
For scenarios that require real-time transcription, we recommend using the [Speech SDK](speech-sdk.md). The SDK is available in C++, C#, Java, Python, JavaScript, Objective-C, and Go. Samples are available in each language on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk). For the latest news and updates, see [Release notes](releasenotes.md).
117117

118118
Internally, Microsoft uses the previously mentioned technologies to analyze Microsoft customer calls in real time, as shown in the following diagram:
119119

articles/cognitive-services/Speech-Service/captioning-concepts.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -225,7 +225,7 @@ Profanity filter is applied to the result `Text` and `MaskedNormalizedForm` prop
225225

226226
## Language identification
227227

228-
If the language in the audio could change, use continuous [language identification](language-identification.md). Language identification is used to identify languages spoken in audio when compared against a list of [supported languages](language-support.md#language-identification). You provide up to 10 candidate languages, at least one of which is expected be in the audio. The Speech service returns the most likely language in the audio.
228+
If the language in the audio could change, use continuous [language identification](language-identification.md). Language identification is used to identify languages spoken in audio when compared against a list of [supported languages](language-support.md?tabs=language-identification). You provide up to 10 candidate languages, at least one of which is expected be in the audio. The Speech service returns the most likely language in the audio.
229229

230230
## Customizations to improve accuracy
231231

articles/cognitive-services/Speech-Service/conversation-transcription.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -82,7 +82,7 @@ Audio data is processed live to return the speaker identifier and transcript, an
8282

8383
## Language support
8484

85-
Currently, conversation transcription supports [all speech-to-text languages](language-support.md#speech-to-text) in the following regions: `centralus`, `eastasia`, `eastus`, `westeurope`.
85+
Currently, conversation transcription supports [all speech-to-text languages](language-support.md?tabs=stt-tts) in the following regions: `centralus`, `eastasia`, `eastus`, `westeurope`.
8686

8787
## Next steps
8888

articles/cognitive-services/Speech-Service/custom-neural-voice.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ ms.author: eur
1616

1717
Custom Neural Voice is a text-to-speech feature that lets you create a one-of-a-kind, customized, synthetic voice for your applications. With Custom Neural Voice, you can build a highly natural-sounding voice by providing your audio samples as training data. If you're looking for ready-to-use options, check out our [text-to-speech](text-to-speech.md) service.
1818

19-
Based on the neural text-to-speech technology and the multilingual, multi-speaker, universal model, Custom Neural Voice lets you create synthetic voices that are rich in speaking styles, or adaptable cross languages. The realistic and natural sounding voice of Custom Neural Voice can represent brands, personify machines, and allow users to interact with applications conversationally. See the [supported languages](language-support.md#custom-neural-voice) for Custom Neural Voice.
19+
Based on the neural text-to-speech technology and the multilingual, multi-speaker, universal model, Custom Neural Voice lets you create synthetic voices that are rich in speaking styles, or adaptable cross languages. The realistic and natural sounding voice of Custom Neural Voice can represent brands, personify machines, and allow users to interact with applications conversationally. See the [supported languages](language-support.md?tabs=stt-tts) for Custom Neural Voice.
2020

2121
> [!IMPORTANT]
2222
> Custom Neural Voice access is limited based on eligibility and usage criteria. Request access on the [intake form](https://aka.ms/customneural).

articles/cognitive-services/Speech-Service/custom-speech-overview.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ ms.custom: contperf-fy21q2, references_regions
1717

1818
With Custom Speech, you can evaluate and improve the Microsoft speech-to-text accuracy for your applications and products.
1919

20-
Out of the box, speech to text utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pre-trained with dialects and phonetics representing a variety of common domains. When you make a speech recognition request, the most recent base model for each [supported language](language-support.md) is used by default. The base model works very well in most speech recognition scenarios.
20+
Out of the box, speech to text utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pre-trained with dialects and phonetics representing a variety of common domains. When you make a speech recognition request, the most recent base model for each [supported language](language-support.md?tabs=stt-tts) is used by default. The base model works very well in most speech recognition scenarios.
2121

2222
A custom model can be used to augment the base model to improve recognition of domain-specific vocabulary specific to the application by providing text data to train the model. It can also be used to improve recognition based for the specific audio conditions of the application by providing audio data with reference transcriptions.
2323

articles/cognitive-services/Speech-Service/direct-line-speech.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,8 @@ Direct Line Speech is a robust, end-to-end solution for creating a flexible, ext
2121

2222
Direct Line Speech offers the highest levels of customization and sophistication for voice assistants. It's designed for conversational scenarios that are open-ended, natural, or hybrids of the two with task completion or command-and-control use. This high degree of flexibility comes with a greater complexity, and scenarios that are scoped to well-defined tasks using natural language input may want to consider [Custom Commands](custom-commands.md) for a streamlined solution experience.
2323

24+
Direct Line Speech supports these locales: `ar-eg`, `ar-sa`, `ca-es`, `da-dk`, `de-de`, `en-au`, `en-ca`, `en-gb`, `en-in`, `en-nz`, `en-us`, `es-es`, `es-mx`, `fi-fi`, `fr-ca`, `fr-fr`, `gu-in`, `hi-in`, `hu-hu`, `it-it`, `ja-jp`, `ko-kr`, `mr-in`, `nb-no`, `nl-nl`, `pl-pl`, `pt-br`, `pt-pt`, `ru-ru`, `sv-se`, `ta-in`, `te-in`, `th-th`, `tr-tr`, `zh-cn`, `zh-hk`, and `zh-tw`.
25+
2426
## Getting started with Direct Line Speech
2527

2628
To create a voice assistant using Direct Line Speech, create a Speech resource and Azure Bot resource in the [Azure portal](https://portal.azure.com). Then [connect the bot](/azure/bot-service/bot-service-channel-connect-directlinespeech) to the Direct Line Speech channel.
@@ -49,7 +51,7 @@ Sample code for creating a voice assistant is available on GitHub. These samples
4951
Voice assistants built using Speech service can use the full range of customization options available for [speech-to-text](speech-to-text.md), [text-to-speech](text-to-speech.md), and [custom keyword selection](./custom-keyword-basics.md).
5052

5153
> [!NOTE]
52-
> Customization options vary by language/locale (see [Supported languages](./language-support.md)).
54+
> Customization options vary by language/locale (see [Supported languages](./language-support.md?tabs=stt-tts)).
5355
5456
Direct Line Speech and its associated functionality for voice assistants are an ideal supplement to the [Virtual Assistant Solution and Enterprise Template](/azure/bot-service/bot-builder-enterprise-template-overview). Though Direct Line Speech can work with any compatible bot, these resources provide a reusable baseline for high-quality conversational experiences as well as common supporting skills and models to get started quickly.
5557

articles/cognitive-services/Speech-Service/how-to-audio-content-creation.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ ms.author: eur
1818

1919
The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md). It allows you to adjust text-to-speech output attributes in real time or batch synthesis, such as voice characters, voice styles, speaking speed, pronunciation, and prosody.
2020

21-
You have easy access to a broad portfolio of [languages and voices](language-support.md#text-to-speech). These voices include state-of-the-art prebuilt neural voices and your custom neural voice, if you've built one.
21+
You have easy access to a broad portfolio of [languages and voices](language-support.md?tabs=stt-tts). These voices include state-of-the-art prebuilt neural voices and your custom neural voice, if you've built one.
2222

2323
To learn more, view the [Audio Content Creation tutorial video](https://youtu.be/ygApYuOOG6w).
2424

@@ -70,7 +70,7 @@ Each step in the preceding diagram is described here:
7070
1. Choose the Speech resource you want to work with.
7171

7272
1. [Create an audio tuning file](#create-an-audio-tuning-file) by using plain text or SSML scripts. Enter or upload your content into Audio Content Creation.
73-
1. Choose the voice and the language for your script content. Audio Content Creation includes all of the [Microsoft text-to-speech voices](language-support.md#text-to-speech). You can use prebuilt neural voices or a custom neural voice.
73+
1. Choose the voice and the language for your script content. Audio Content Creation includes all of the [prebuilt text-to-speech voices](language-support.md?tabs=stt-tts). You can use prebuilt neural voices or a custom neural voice.
7474

7575
> [!NOTE]
7676
> Gated access is available for Custom Neural Voice, which allows you to create high-definition voices that are similar to natural-sounding speech. For more information, see [Gating process](./text-to-speech.md).

articles/cognitive-services/Speech-Service/how-to-custom-speech-create-project.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ zone_pivot_groups: speech-studio-cli-rest
1515

1616
# Create a Custom Speech project
1717

18-
Custom Speech projects contain models, training and testing datasets, and deployment endpoints. Each project is specific to a [locale](language-support.md). For example, you might create a project for English in the United States.
18+
Custom Speech projects contain models, training and testing datasets, and deployment endpoints. Each project is specific to a [locale](language-support.md?tabs=stt-tts). For example, you might create a project for English in the United States.
1919

2020
## Create a project
2121

articles/cognitive-services/Speech-Service/how-to-custom-speech-model-and-endpoint-lifecycle.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -42,7 +42,7 @@ When a custom model or base model expires, it is no longer available for transcr
4242

4343
|Transcription route |Expired model result |Recommendation |
4444
|---------|---------|---------|
45-
|Custom endpoint|Speech recognition requests will fall back to the most recent base model for the same [locale](language-support.md). You will get results, but recognition might not accurately transcribe your domain data. |Update the endpoint's model as described in the [Deploy a Custom Speech model](how-to-custom-speech-deploy-model.md) guide. |
45+
|Custom endpoint|Speech recognition requests will fall back to the most recent base model for the same [locale](language-support.md?tabs=stt-tts). You will get results, but recognition might not accurately transcribe your domain data. |Update the endpoint's model as described in the [Deploy a Custom Speech model](how-to-custom-speech-deploy-model.md) guide. |
4646
|Batch transcription |[Batch transcription](batch-transcription.md) requests for expired models will fail with a 4xx error. |In each [CreateTranscription](https://westus2.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-0/operations/CreateTranscription) REST API request body, set the `model` property to a base model or custom model that hasn't yet expired. Otherwise don't include the `model` property to always use the latest base model. |
4747

4848

articles/cognitive-services/Speech-Service/how-to-custom-speech-test-and-train.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -73,7 +73,7 @@ You can use audio + human-labeled transcript data for both [training](how-to-cus
7373
- To improve the acoustic aspects like slight accents, speaking styles, and background noises.
7474
- To measure the accuracy of Microsoft's speech-to-text accuracy when it's processing your audio files.
7575

76-
For a list of base models that support training with audio data, see [Language support](language-support.md#speech-to-text). Even if a base model does support training with audio data, the service might use only part of the audio. And it will still use all the transcripts.
76+
For a list of base models that support training with audio data, see [Language support](language-support.md?tabs=stt-tts). Even if a base model does support training with audio data, the service might use only part of the audio. And it will still use all the transcripts.
7777

7878
> [!IMPORTANT]
7979
> If a base model doesn't support customization with audio data, only the transcription text will be used for training. If you switch to a base model that supports customization with audio data, the training time may increase from several hours to several days. The change in training time would be most noticeable when you switch to a base model in a [region](regions.md#speech-service) without dedicated hardware for training. If the audio data is not required, you should remove it to decrease the training time.
@@ -143,7 +143,7 @@ Expected utterances often follow a certain pattern. One common pattern is that u
143143
* "I have a question about `product`," where `product` is a list of possible products.
144144
* "Make that `object` `color`," where `object` is a list of geometric shapes and `color` is a list of colors.
145145

146-
For a list of supported base models and locales for training with structured text, see [Language support](language-support.md#speech-to-text). You must use the latest base model for these locales. For locales that don't support training with structured text, the service will take any training sentences that don't reference any classes as part of training with plain-text data.
146+
For a list of supported base models and locales for training with structured text, see [Language support](language-support.md?tabs=stt-tts). You must use the latest base model for these locales. For locales that don't support training with structured text, the service will take any training sentences that don't reference any classes as part of training with plain-text data.
147147

148148
The structured-text file should have an .md extension. The maximum file size is 200 MB, and the text encoding must be UTF-8 BOM. The syntax of the Markdown is the same as that from the Language Understanding models, in particular list entities and example utterances. For more information about the complete Markdown syntax, see the <a href="/azure/bot-service/file-format/bot-builder-lu-file-format" target="_blank"> Language Understanding Markdown</a>.
149149

@@ -202,7 +202,7 @@ Here's an example structured text file:
202202

203203
Specialized or made up words might have unique pronunciations. These words can be recognized if they can be broken down into smaller words to pronounce them. For example, to recognize "Xbox", pronounce it as "X box". This approach won't increase overall accuracy, but can improve recognition of this and other keywords.
204204

205-
You can provide a custom pronunciation file to improve recognition. Don't use custom pronunciation files to alter the pronunciation of common words. For a list of languages that support custom pronunciation, see [language support](language-support.md#speech-to-text).
205+
You can provide a custom pronunciation file to improve recognition. Don't use custom pronunciation files to alter the pronunciation of common words. For a list of languages that support custom pronunciation, see [language support](language-support.md?tabs=stt-tts).
206206

207207
> [!NOTE]
208208
> You can either use a pronunciation data file on its own, or you can add pronunciation within a structured text data file. The Speech service doesn't support training a model where you select both of those datasets as input.
@@ -259,7 +259,7 @@ Use <a href="http://sox.sourceforge.net" target="_blank" rel="noopener">SoX</a>
259259

260260
### Audio data for training
261261

262-
Not all base models support [training with audio data](language-support.md#speech-to-text). For a list of base models that support training with audio data, see [Language support](language-support.md#speech-to-text).
262+
Not all base models support [training with audio data](language-support.md?tabs=stt-tts). For a list of base models that support training with audio data, see [Language support](language-support.md?tabs=stt-tts).
263263

264264
Even if a base model supports training with audio data, the service might use only part of the audio. In [regions](regions.md#speech-service) with dedicated hardware available for training audio data, the Speech service will use up to 20 hours of your audio training data. In other regions, the Speech service uses up to 8 hours of your audio data.
265265

0 commit comments

Comments
 (0)