Language identification is used to identify languages spoken in audio when compared against a list of [supported languages](language-support.md#language-identification).

Language identification (LID) use cases include:

* [Standalone language identification](#standalone-language-identification) when you only need to identify the language in an audio source.
* [Speech-to-text recognition](#speech-to-text) when you need to identify the language in an audio source and then transcribe it to text.
* [Speech translation](#speech-translation) when you need to identify the language in an audio source and then translate it to another language.

Note that for speech recognition, the initial latency is higher with language identification. You should only include this optional feature as needed.

## Configuration options

Whether you use language identification [on its own](#standalone-language-identification), with [speech-to-text](#speech-to-text), or with [speech translation](#speech-translation), there are some common concepts and configuration options.

- Define a list of [candidate languages](#candidate-languages) that you expect in the audio.
- Decide whether to use [at-start or continuous](#at-start-and-continuous-language-identification) language identification.
- Prioritize [low latency or high accuracy](#accuracy-and-latency-prioritization) of results.

Then you make a [recognize once or continuous recognition](#recognize-once-or-continuous) request to the Speech service.

Code snippets are included with the concepts described next. Complete samples for each use case are provided further below.

### Candidate languages

You provide candidate languages, at least one of which is expected to be in the audio. You can include up to 4 languages for [at-start LID](#at-start-and-continuous-language-identification) or up to 10 languages for [continuous LID](#at-start-and-continuous-language-identification).

You must provide the full 4-letter locale, but language identification only uses one locale per base language. Do not include multiple locales (e.g., "en-US" and "en-GB") for the same language. For more information, see [supported languages](language-support.md#language-identification).

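As a sketch with the C# Speech SDK (the locale list here is illustrative), the candidate languages are passed through an `AutoDetectSourceLanguageConfig`:

```csharp
using Microsoft.CognitiveServices.Speech;

// One locale per base language: for example "en-US" with "de-DE",
// never "en-US" together with "en-GB".
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "ja-JP" });
```
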

### At-start and Continuous language identification

Speech supports both at-start and continuous language identification (LID).

> [!NOTE]
> Continuous language identification is only supported with Speech SDKs in C#, C++, and Python.

- At-start LID identifies the language once within the first few seconds of audio. Use at-start LID if the language in the audio won't change.
- Continuous LID can identify multiple languages for the duration of the audio. Use continuous LID if the language in the audio could change. Continuous LID does not support changing languages within the same sentence. For example, if you are primarily speaking Spanish and insert some English words, it will not detect the language change per word.

You implement at-start LID or continuous LID by calling methods for [recognize once or continuous](#recognize-once-or-continuous). Results also depend upon your [Accuracy and Latency prioritization](#accuracy-and-latency-prioritization).

### Accuracy and Latency prioritization

You can choose to prioritize accuracy or latency with language identification.

> [!NOTE]
> Latency is prioritized by default with the Speech SDK. You can choose to prioritize accuracy or latency with the Speech SDKs for C#, C++, and Python.

Prioritize `Latency` if you need a low-latency result such as during live streaming. Set the priority to `Accuracy` if the audio quality may be poor, and more latency is acceptable. For example, a voicemail could have background noise, or some silence at the beginning. Allowing the engine more time will improve language identification results.

* **At-start:** With at-start LID in `Latency` mode, the result is returned in less than 5 seconds. With at-start LID in `Accuracy` mode, the result is returned within 30 seconds. You set the priority for at-start LID with the `SpeechServiceConnection_SingleLanguageIdPriority` property.
* **Continuous:** With continuous LID in `Latency` mode, the results are returned every 2 seconds for the duration of the audio. With continuous LID in `Accuracy` mode, the results are returned within no set time frame for the duration of the audio. You set the priority for continuous LID with the `SpeechServiceConnection_ContinuousLanguageIdPriority` property.

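For example, a minimal C# sketch of at-start LID with `Accuracy` prioritization (the subscription key and region are placeholders; the priority value is the string form of the mode):

```csharp
using Microsoft.CognitiveServices.Speech;

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

// Allow the engine up to 30 seconds for a more accurate at-start identification.
speechConfig.SetProperty(PropertyId.SpeechServiceConnection_SingleLanguageIdPriority, "Accuracy");
```
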
> [!IMPORTANT]
> With [speech-to-text](#speech-to-text) and [speech translation](#speech-translation) continuous recognition, do not set `Accuracy` with the `SpeechServiceConnection_ContinuousLanguageIdPriority` property. The setting will be ignored without error, and the default priority of `Latency` will remain in effect. Only [standalone language identification](#standalone-language-identification) supports continuous LID with `Accuracy` prioritization.

Speech uses at-start LID with `Latency` prioritization by default. You need to set a priority property for any other LID configuration.

::: zone pivot="programming-language-csharp"

Here is an example of using continuous LID while still prioritizing latency.

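A minimal sketch (the subscription key and region are placeholders):

```csharp
using Microsoft.CognitiveServices.Speech;

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

// Opt in to continuous LID while keeping the default low-latency behavior.
speechConfig.SetProperty(PropertyId.SpeechServiceConnection_ContinuousLanguageIdPriority, "Latency");
```
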
When prioritizing `Latency`, the Speech service returns one of the candidate languages provided even if those languages were not in the audio. For example, if `fr-FR` (French) and `en-US` (English) are provided as candidates, but German is spoken, either `fr-FR` or `en-US` would be returned. When prioritizing `Accuracy`, the Speech service will return the string `Unknown` as the detected language if none of the candidate languages are detected or if the language identification confidence is low.

> [!NOTE]
> You may see cases where an empty string is returned instead of `Unknown`, due to Speech service inconsistency. Applications should check for both the `Unknown` and empty string cases and treat them identically.

### Recognize once or continuous

Language identification is completed with recognition objects and operations. You will make a request to the Speech service for recognition of audio.

> [!NOTE]
> Don't confuse recognition with identification. Recognition can be used with or without language identification.

Let's map these concepts to the code. You will either call the recognize once method, or the start and stop continuous recognition methods. You choose from:

- Recognize once with at-start LID
- Continuous recognition with at-start LID
- Continuous recognition with continuous LID

The `SpeechServiceConnection_ContinuousLanguageIdPriority` property is always required for continuous LID. Without it, the Speech service defaults to at-start LID.

See more examples of standalone language identification on GitHub for [C++](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/cpp/windows/console/samples/standalone_language_detection_samples.cpp) and [Python](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/python/console/speech_language_detection_sample.py).

::: zone-end

## Speech-to-text

You use Speech-to-text recognition when you need to identify the language in an audio source and then transcribe it to text. For more information, see [Speech-to-text overview](speech-to-text.md).

> [!NOTE]
> Speech-to-text recognition with at-start language identification is supported with Speech SDKs in C#, C++, Python, Java, JavaScript, and Objective-C. Speech-to-text recognition with continuous language identification is only supported with Speech SDKs in C#, C++, and Python.
>
> Currently for speech-to-text recognition with continuous language identification, you must create a `SpeechConfig` from the `wss://{region}.stt.speech.microsoft.com/speech/universal/v2` endpoint string, as shown in code examples. In a future SDK release you won't need to set it.

::: zone pivot="programming-language-csharp"

### [Recognize once](#tab/once)

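A sketch of recognize-once speech-to-text with at-start LID (the candidate languages, audio input, and credentials are illustrative placeholders):

```csharp
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE" });
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new SpeechRecognizer(speechConfig, autoDetectSourceLanguageConfig, audioConfig);

var result = await recognizer.RecognizeOnceAsync();
var lidResult = AutoDetectSourceLanguageResult.FromResult(result);
Console.WriteLine($"Detected language: {lidResult.Language}; Text: {result.Text}");
```
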
::: zone-end

::: zone pivot="programming-language-cpp"

### [Recognize once](#tab/once)

```cpp
using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Audio;

// Sketch: the config and audio setup around the surviving
// detected-language lines are illustrative.
auto speechConfig = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
auto autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE" });
auto audioConfig = AudioConfig::FromDefaultMicrophoneInput();
auto recognizer = SpeechRecognizer::FromConfig(
    speechConfig, autoDetectSourceLanguageConfig, audioConfig);

auto result = recognizer->RecognizeOnceAsync().get();
auto autoDetectSourceLanguageResult =
    AutoDetectSourceLanguageResult::FromResult(result);
auto detectedLanguage = autoDetectSourceLanguageResult->Language;
```

See more examples of speech-to-text recognition with language identification on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/cpp/windows/console/samples/speech_recognition_samples.cpp).

::: zone-end

::: zone pivot="programming-language-java"

::: zone-end

## Speech translation

You use Speech translation when you need to identify the language in an audio source and then translate it to another language. For more information, see [Speech translation overview](speech-translation.md).

> [!NOTE]
> Speech translation with language identification is only supported with Speech SDKs in C#, C++, and Python.
>
> Currently for speech translation with language identification, you must create a `SpeechConfig` from the `wss://{region}.stt.speech.microsoft.com/speech/universal/v2` endpoint string, as shown in code examples. In a future SDK release you won't need to set it.


::: zone pivot="programming-language-csharp"

### [Recognize once](#tab/once)

```csharp
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;

// Sketch: the candidate languages, target language, and credentials
// are illustrative placeholders.
var region = "YourServiceRegion";
var endpointUrl = new Uri($"wss://{region}.stt.speech.microsoft.com/speech/universal/v2");
var config = SpeechTranslationConfig.FromEndpoint(endpointUrl, "YourSubscriptionKey");
config.AddTargetLanguage("de");

var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "fr-FR" });
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new TranslationRecognizer(config, autoDetectSourceLanguageConfig, audioConfig);

var result = await recognizer.RecognizeOnceAsync();
```