Commit dd77d4f: update docs
Parent: 915a3cd

3 files changed: 12 additions, 10 deletions


articles/cognitive-services/Speech-Service/conversation-transcription.md

Lines changed: 8 additions & 2 deletions
@@ -50,10 +50,16 @@ This is a high-level overview of how Conversation Transcription works.
 ## Expected inputs

 - **Multi-channel audio stream** – For specification and design details, see [Microsoft Speech Device SDK Microphone](./speech-devices-sdk-microphone.md). To learn more or purchase a development kit, see [Get Microsoft Speech Device SDK](./get-speech-devices-sdk.md).
-- **User voice samples** – Conversation Transcription needs user profiles in advance of the conversation. You will need to collect audio recordings from each user, then send the recordings to the [Signature Generation Service](https://aka.ms/cts/signaturegenservice) to validate the audio and generate user profiles.
+- **User voice samples** – Conversation Transcription needs user profiles in advance of the conversation for speaker identification. You will need to collect audio recordings from each user, then send the recordings to the [Signature Generation Service](https://aka.ms/cts/signaturegenservice) to validate the audio and generate user profiles.

 > [!NOTE]
-> User voice samples are optional. Without voice input but have `DifferentiateGuestSpeakers` enabled, the transcription will still show different speakers, but shown as "Speaker1", "Speaker2", etc. instead of recognizing as pre-enrolled specific speaker names. For more information about setting `DifferentiateGuestSpeakers`, please refer to sample codes in [Real-Time Conversation Transcription Quickstart](.\includes\how-to\conversation-transcription\real-time-csharp.md).
+> User voice samples are required for speaker identification. Speakers without voice samples are recognized as "Unidentified". Unidentified speakers can still be differentiated when the `DifferentiateGuestSpeakers` property is enabled (see example below); the transcription output then shows speakers as "Guest_0", "Guest_1", and so on, instead of as pre-enrolled speaker names.
+> ```csharp
+> config.SetProperty("DifferentiateGuestSpeakers", "true");
+> ```
+> ```javascript
+> speechTranslationConfig.setProperty("DifferentiateGuestSpeakers", "true");
+> ```


 ## Real-time vs. asynchronous
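The note in this file describes how speaker labels fall out of the `DifferentiateGuestSpeakers` setting. As a conceptual sketch only (a hypothetical helper, not part of the Speech SDK), the labeling rule could be expressed as:

```javascript
// Conceptual sketch of the labeling behavior described in the note above.
// Enrolled speakers keep their profile names; unenrolled speakers get
// "Guest_N" only when guest differentiation is enabled, otherwise "Unidentified".
function labelSpeaker(speakerId, enrolledNames, differentiateGuests, guestIndex) {
  if (enrolledNames.has(speakerId)) {
    return enrolledNames.get(speakerId);
  }
  return differentiateGuests ? `Guest_${guestIndex}` : "Unidentified";
}

const enrolled = new Map([["sig-katie", "Katie"]]);
console.log(labelSpeaker("sig-katie", enrolled, true, 0));   // "Katie"
console.log(labelSpeaker("sig-unknown", enrolled, true, 0)); // "Guest_0"
console.log(labelSpeaker("sig-unknown", enrolled, false, 0)); // "Unidentified"
```

The speaker IDs and the `labelSpeaker` helper here are illustrative assumptions; the real service performs this mapping internally based on the voice signatures you enroll.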

articles/cognitive-services/Speech-Service/includes/how-to/conversation-transcription/real-time-csharp.md

Lines changed: 2 additions & 5 deletions
@@ -105,9 +105,9 @@ This sample code does the following:
 * Creates an `AudioConfig` from the sample `.wav` file to transcribe.
 * Creates a `Conversation` using `CreateConversationAsync()`.
 * Creates a `ConversationTranscriber` using the constructor, and subscribes to the necessary events.
-* Enables `DifferentiateGuestSpeakers` feature to show the different speakers.
 * Adds participants to the conversation. The strings `voiceSignatureStringUser1` and `voiceSignatureStringUser2` should come as output from the steps above from the function `GetVoiceSignatureString()`.
 * Joins the conversation and begins transcription.
+* To differentiate speakers when no voice samples are provided, enable the `DifferentiateGuestSpeakers` feature as shown in the [Conversation Transcription overview](../../../conversation-transcription.md).

 > [!NOTE]
 > `AudioStreamReader` is a helper class you can get on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/quickstart/csharp/dotnet/conversation-transcription/helloworld/AudioStreamReader.cs).
@@ -135,10 +135,7 @@ public static async Task TranscribeConversationsAsync(string voiceSignatureStrin
 var config = SpeechConfig.FromSubscription(subscriptionKey, region);
 config.SetProperty("ConversationTranscriptionInRoomAndOnline", "true");

-// This will enable "differentiate speakers" feature. You could comment it if you want to disable the feature.
-config.SetProperty("DifferentiateGuestSpeakers", "true");
-
-// en-us by default. This code specifies Chinese.
+// The default language is en-US. Add this line to specify another language, such as zh-CN.
 // config.SpeechRecognitionLanguage = "zh-cn";
 var stopRecognition = new TaskCompletionSource<int>();

articles/cognitive-services/Speech-Service/includes/how-to/conversation-transcription/real-time-javascript.md

Lines changed: 2 additions & 3 deletions
@@ -72,6 +72,7 @@ This sample code does the following:
 * Creates a `ConversationTranscriber` using the constructor.
 * Adds participants to the conversation. The strings `voiceSignatureStringUser1` and `voiceSignatureStringUser2` should come as output from the steps above.
 * Registers to events and begins transcription.
+* To differentiate speakers when no voice samples are provided, enable the `DifferentiateGuestSpeakers` feature as shown in the [Conversation Transcription overview](../../../conversation-transcription.md).

 ```javascript
 (function() {
@@ -95,9 +96,7 @@ This sample code does the following:
 var audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);
 speechTranslationConfig.setProperty("ConversationTranscriptionInRoomAndOnline", "true");

-// This will enable "differentiate speakers" feature. You could comment it if you want to disable the feature.
-speechTranslationConfig.setProperty("DifferentiateGuestSpeakers", "true");
-
+// The default language is en-US. Change this line to specify another language, such as zh-CN.
 speechTranslationConfig.speechRecognitionLanguage = "en-US";

 // create conversation and transcriber
