Commit dd77d4f: update docs
Parent: 915a3cd

3 files changed: 12 additions, 10 deletions


articles/cognitive-services/Speech-Service/conversation-transcription.md

Lines changed: 8 additions & 2 deletions
@@ -50,10 +50,16 @@ This is a high-level overview of how Conversation Transcription works.
 ## Expected inputs

 - **Multi-channel audio stream** – For specification and design details, see [Microsoft Speech Device SDK Microphone](./speech-devices-sdk-microphone.md). To learn more or purchase a development kit, see [Get Microsoft Speech Device SDK](./get-speech-devices-sdk.md).
-- **User voice samples** – Conversation Transcription needs user profiles in advance of the conversation. You will need to collect audio recordings from each user, then send the recordings to the [Signature Generation Service](https://aka.ms/cts/signaturegenservice) to validate the audio and generate user profiles.
+- **User voice samples** – Conversation Transcription needs user profiles in advance of the conversation for speaker identification. You will need to collect audio recordings from each user, then send the recordings to the [Signature Generation Service](https://aka.ms/cts/signaturegenservice) to validate the audio and generate user profiles.

 > [!NOTE]
-> User voice samples are optional. Without voice input but have `DifferentiateGuestSpeakers` enabled, the transcription will still show different speakers, but shown as "Speaker1", "Speaker2", etc. instead of recognizing as pre-enrolled specific speaker names. For more information about setting `DifferentiateGuestSpeakers`, please refer to sample codes in [Real-Time Conversation Transcription Quickstart](.\includes\how-to\conversation-transcription\real-time-csharp.md).
+> User voice samples are required for speaker identification. Speakers without voice samples are recognized as "Unidentified". Unidentified speakers can still be differentiated when the `DifferentiateGuestSpeakers` property is enabled (see example below); the transcription output then shows speakers as "Guest_0", "Guest_1", and so on, instead of as pre-enrolled speaker names.
+> ```csharp
+> config.SetProperty("DifferentiateGuestSpeakers", "true");
+> ```
+> ```javascript
+> speechTranslationConfig.setProperty("DifferentiateGuestSpeakers", "true");
+> ```


 ## Real-time vs. asynchronous
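The note in this file describes how speaker labels fall out of the `DifferentiateGuestSpeakers` setting. As a conceptual sketch only (a hypothetical helper, not part of the Speech SDK), the labeling rule could be expressed as:

```javascript
// Conceptual sketch of the labeling behavior described in the note above.
// Enrolled speakers keep their profile names; unenrolled speakers get
// "Guest_N" only when guest differentiation is enabled, otherwise "Unidentified".
function labelSpeaker(speakerId, enrolledNames, differentiateGuests, guestIndex) {
  if (enrolledNames.has(speakerId)) {
    return enrolledNames.get(speakerId);
  }
  return differentiateGuests ? `Guest_${guestIndex}` : "Unidentified";
}

const enrolled = new Map([["sig-katie", "Katie"]]);
console.log(labelSpeaker("sig-katie", enrolled, true, 0));   // "Katie"
console.log(labelSpeaker("sig-unknown", enrolled, true, 0)); // "Guest_0"
console.log(labelSpeaker("sig-unknown", enrolled, false, 0)); // "Unidentified"
```

The speaker IDs and the `labelSpeaker` helper here are illustrative assumptions; the real service performs this mapping internally based on the voice signatures you enroll.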

articles/cognitive-services/Speech-Service/includes/how-to/conversation-transcription/real-time-csharp.md

Lines changed: 2 additions & 5 deletions
@@ -105,9 +105,9 @@ This sample code does the following:
 * Creates an `AudioConfig` from the sample `.wav` file to transcribe.
 * Creates a `Conversation` using `CreateConversationAsync()`.
 * Creates a `ConversationTranscriber` using the constructor, and subscribes to the necessary events.
-* Enables `DifferentiateGuestSpeakers` feature to show the different speakers.
 * Adds participants to the conversation. The strings `voiceSignatureStringUser1` and `voiceSignatureStringUser2` should come as output from the steps above from the function `GetVoiceSignatureString()`.
 * Joins the conversation and begins transcription.
+* To differentiate speakers when no voice samples are provided, enable the `DifferentiateGuestSpeakers` feature as shown in the [Conversation Transcription overview](../../../conversation-transcription.md).

 > [!NOTE]
 > `AudioStreamReader` is a helper class you can get on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/quickstart/csharp/dotnet/conversation-transcription/helloworld/AudioStreamReader.cs).
@@ -135,10 +135,7 @@ public static async Task TranscribeConversationsAsync(string voiceSignatureStrin
 var config = SpeechConfig.FromSubscription(subscriptionKey, region);
 config.SetProperty("ConversationTranscriptionInRoomAndOnline", "true");

-// This will enable "differentiate speakers" feature. You could comment it if you want to disable the feature.
-config.SetProperty("DifferentiateGuestSpeakers", "true");
-
-// en-us by default. This code specifies Chinese.
+// The default language is en-US. Add this line to specify another language, such as zh-CN.
 // config.SpeechRecognitionLanguage = "zh-cn";
 var stopRecognition = new TaskCompletionSource<int>();

articles/cognitive-services/Speech-Service/includes/how-to/conversation-transcription/real-time-javascript.md

Lines changed: 2 additions & 3 deletions
@@ -72,6 +72,7 @@ This sample code does the following:
 * Creates a `ConversationTranscriber` using the constructor.
 * Adds participants to the conversation. The strings `voiceSignatureStringUser1` and `voiceSignatureStringUser2` should come as output from the steps above.
 * Registers to events and begins transcription.
+* To differentiate speakers when no voice samples are provided, enable the `DifferentiateGuestSpeakers` feature as shown in the [Conversation Transcription overview](../../../conversation-transcription.md).

 ```javascript
 (function() {
@@ -95,9 +96,7 @@ This sample code does the following:
 var audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);
 speechTranslationConfig.setProperty("ConversationTranscriptionInRoomAndOnline", "true");

-// This will enable "differentiate speakers" feature. You could comment it if you want to disable the feature.
-speechTranslationConfig.setProperty("DifferentiateGuestSpeakers", "true");
-
+// The default language is en-US. Change this line to specify another language, such as zh-CN.
 speechTranslationConfig.speechRecognitionLanguage = "en-US";

 // create conversation and transcriber
