Commit 6e625ba

Merge pull request #186532 from paulth1/get-started-speaker-recognition
edit pass: Get started speaker recognition
2 parents 8c3a145 + b372581 commit 6e625ba

6 files changed

+137
-141
lines changed

articles/cognitive-services/Speech-Service/get-started-speaker-recognition.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
22
title: "Speaker Recognition quickstart - Speech service"
33
titleSuffix: Azure Cognitive Services
4-
description: Learn how to use Speaker Recognition from the Speech SDK to answer the question, "who is speaking". In this quickstart, you learn about common design patterns for working with both speaker verification and identification, which both use voice biometry to identify unique voices.
4+
description: Learn how to use speaker recognition from the Speech SDK to answer the question, "Who is speaking?". In this quickstart, you learn about common design patterns for working with speaker verification and identification, which both use voice biometry to identify unique voices.
55
services: cognitive-services
66
author: eric-urban
77
manager: nitinme
@@ -16,7 +16,7 @@ zone_pivot_groups: programming-languages-set-twenty-five
1616
keywords: speaker recognition, voice biometry
1717
---
1818

19-
# Get started with Speaker Recognition
19+
# Get started with speaker recognition
2020

2121
::: zone pivot="programming-language-csharp"
2222
[!INCLUDE [C# Basics include](includes/how-to/speaker-recognition-basics/speaker-recognition-basics-csharp.md)]

articles/cognitive-services/Speech-Service/includes/how-to/speaker-recognition-basics/speaker-recognition-basics-cpp.md

Lines changed: 35 additions & 36 deletions
@@ -7,149 +7,148 @@ ms.author: v-jawe
77
ms.custom: references_regions, ignite-fall-2021
88
---
99

10-
In this quickstart, you learn basic design patterns for Speaker Recognition using the Speech SDK, including:
10+
In this quickstart, you learn basic design patterns for speaker recognition by using the Speech SDK, including:
1111

12-
* Text-dependent and text-independent verification
13-
* Speaker identification to identify a voice sample among a group of voices
14-
* Deleting voice profiles
15-
16-
For a high-level look at Speaker Recognition concepts, see the [overview](../../../speaker-recognition-overview.md) article. See the Reference node on left nav for a list of the supported platforms.
12+
* Text-dependent and text-independent verification.
13+
* Speaker identification to identify a voice sample among a group of voices.
14+
* Deleting voice profiles.
1715

16+
For a high-level look at speaker recognition concepts, see the [Overview](../../../speaker-recognition-overview.md) article. See the **Reference node** in the left pane for a list of the supported platforms.
1817

1918
## Prerequisites
2019

2120
This article assumes that you have an Azure account and Speech service subscription. If you don't have an account and subscription, [try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
2221

2322
> [!IMPORTANT]
24-
> Microsoft limits access to Speaker Recognition. Apply to use it through the [Azure Cognitive Services Speaker Recognition Limited Access Review](https://aka.ms/azure-speaker-recognition). After approval, you can access the Speaker Recognition APIs.
23+
> Microsoft limits access to speaker recognition. Apply to use it through the [Azure Cognitive Services Speaker Recognition Limited Access Review](https://aka.ms/azure-speaker-recognition) form. After approval, you can access the Speaker Recognition APIs.
2524
26-
## Install the Speech SDK
25+
### Install the Speech SDK
2726

28-
Before you can do anything, you'll need to install the Speech SDK. Depending on your platform, use the following instructions:
27+
Before you start, you must install the Speech SDK. Depending on your platform, use the following instructions:
2928

3029
* <a href="/azure/cognitive-services/speech-service/quickstarts/setup-platform?pivots=programming-language-cpp&tabs=linux" target="_blank">Linux </a>
3130
* <a href="/azure/cognitive-services/speech-service/quickstarts/setup-platform?pivots=programming-language-cpp&tabs=macos" target="_blank">macOS </a>
3231
* <a href="/azure/cognitive-services/speech-service/quickstarts/setup-platform?pivots=programming-language-cpp&tabs=windows" target="_blank">Windows </a>
3332

34-
## Import dependencies
33+
### Import dependencies
3534

36-
To run the examples in this article, add the following statements at the top of your .cpp file.
35+
To run the examples in this article, add the following statements at the top of your .cpp file:
3736

3837
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="dependencies":::
3938

4039
## Create a speech configuration
4140

42-
To call the Speech service using the Speech SDK, you need to create a [`SpeechConfig`](/cpp/cognitive-services/speech/speechconfig). This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
41+
To call the Speech service by using the Speech SDK, create a [`SpeechConfig`](/cpp/cognitive-services/speech/speechconfig) object. This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
4342

4443
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="get_speech_config":::
4544

4645
## Text-dependent verification
4746

48-
Speaker Verification is the act of confirming that a speaker matches a known, or **enrolled** voice. The first step is to **enroll** a voice profile, so that the service has something to compare future voice samples against. In this example, you enroll the profile using a **text-dependent** strategy, which requires a specific passphrase to use for both enrollment and verification. See the [reference docs](/rest/api/speakerrecognition/) for a list of supported passphrases.
47+
Speaker verification is the act of confirming that a speaker matches a known, or *enrolled*, voice. The first step is to enroll a voice profile so that the service has something to compare future voice samples against. In this example, you enroll the profile by using a *text-dependent* strategy, which requires a specific passphrase to use for enrollment and verification. See the [reference docs](/rest/api/speakerrecognition/) for a list of supported passphrases.
4948

5049
### TextDependentVerification function
5150

52-
Start by creating the `TextDependentVerification` function.
51+
Start by creating the `TextDependentVerification` function:
5352

5453
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="text_dependent_verification":::
5554

56-
This function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method. Note there are three [types](/cpp/cognitive-services/speech/microsoft-cognitiveservices-speech-namespace#enum-voiceprofiletype) of `VoiceProfile`:
55+
This function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method. There are three [types](/cpp/cognitive-services/speech/microsoft-cognitiveservices-speech-namespace#enum-voiceprofiletype) of `VoiceProfile`:
5756

5857
- TextIndependentIdentification
5958
- TextDependentVerification
6059
- TextIndependentVerification
6160

62-
In this case you pass `VoiceProfileType::TextDependentVerification` to `CreateProfileAsync`.
61+
In this case, you pass `VoiceProfileType::TextDependentVerification` to `CreateProfileAsync`.
6362

6463
You then call two helper functions that you'll define next, `AddEnrollmentsToTextDependentProfile` and `SpeakerVerify`. Finally, call [DeleteProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#deleteprofileasync) to clean up the profile.
6564

6665
### AddEnrollmentsToTextDependentProfile function
6766

68-
Define the following function to enroll a voice profile.
67+
Define the following function to enroll a voice profile:
6968

7069
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="add_enrollments_dependent":::
7170

72-
In this function, you enroll audio samples in a `while` loop that tracks the number of samples remaining, and required, for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak the passphrase into your microphone, and adds the sample to the voice profile.
71+
In this function, you enroll audio samples in a `while` loop that tracks the number of samples that are still required for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak the passphrase into your microphone, and it adds the sample to the voice profile.
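
The shape of this enrollment loop can be sketched without the Speech SDK. The following is a minimal, self-contained illustration, not the quickstart's code: `FakeEnrollmentResult`, `EnrollUntilComplete`, and `MakeFakeEnroller` are hypothetical stand-ins, and the fake enroller only simulates what a call to `EnrollProfileAsync` would report about the samples still required.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the result of one enrollment call.
// The real SDK result type is not modeled here.
struct FakeEnrollmentResult {
    int remaining_enrollments;  // samples still required after this call
};

// Calls `enroll_once` until no more samples are required.
// Returns the number of samples that were enrolled.
int EnrollUntilComplete(const std::function<FakeEnrollmentResult()>& enroll_once) {
    int enrolled = 0;
    int remaining = 1;  // unknown before the first result; assume at least one
    while (remaining > 0) {
        FakeEnrollmentResult result = enroll_once();
        assert(result.remaining_enrollments >= 0);
        ++enrolled;
        remaining = result.remaining_enrollments;
    }
    return enrolled;
}

// Builds a fake enroller that needs `required` samples in total.
// In a real program this would be a call into the service instead.
std::function<FakeEnrollmentResult()> MakeFakeEnroller(int required) {
    assert(required > 0);
    int left = required;
    return [left]() mutable {
        --left;  // one more sample accepted
        return FakeEnrollmentResult{left};
    };
}
```

For example, `EnrollUntilComplete(MakeFakeEnroller(3))` runs the loop three times. The loop reads the remaining count from each result rather than hard-coding a number, so the service stays the source of truth for how many samples are needed.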
7372

7473
### SpeakerVerify function
7574

76-
Define `SpeakerVerify` as follows.
75+
Define `SpeakerVerify` as follows:
7776

7877
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="speaker_verify":::
7978

8079
In this function, you create a [SpeakerVerificationModel](/cpp/cognitive-services/speech/speaker-speakerverificationmodel) object with the [SpeakerVerificationModel::FromProfile](/cpp/cognitive-services/speech/speaker-speakerverificationmodel#fromprofile) method, passing in the [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object you created earlier.
8180

82-
Next, [SpeechRecognizer::RecognizeOnceAsync](/cpp/cognitive-services/speech/speechrecognizer#recognizeonceasync) prompts you to speak the passphrase again, but this time it will validate it against your voice profile and return a similarity score ranging from 0.0-1.0. The [SpeakerRecognitionResult](/cpp/cognitive-services/speech/speaker-speakerrecognitionresult) object also returns `Accept` or `Reject`, based on whether or not the passphrase matches.
81+
Next, [SpeechRecognizer::RecognizeOnceAsync](/cpp/cognitive-services/speech/speechrecognizer#recognizeonceasync) prompts you to speak the passphrase again. This time it validates it against your voice profile and returns a similarity score that ranges from 0.0 to 1.0. The [SpeakerRecognitionResult](/cpp/cognitive-services/speech/speaker-speakerrecognitionresult) object also returns `Accept` or `Reject` based on whether the passphrase matches.
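
As an illustration of how an application might act on these two values, here is a minimal sketch that does not use the Speech SDK. `FakeReason`, `FakeVerificationResult`, `Describe`, and the `min_score` threshold are all hypothetical names introduced for this example; the extra application-level threshold is an assumption, not something the service requires.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the Accept/Reject outcome described above.
enum class FakeReason { Accept, Reject };

// Hypothetical stand-in for a verification result.
struct FakeVerificationResult {
    FakeReason reason;
    double score;  // similarity score in the range 0.0 to 1.0
};

// Combines the service decision with an optional, application-chosen
// minimum score (an assumption made for this sketch).
std::string Describe(const FakeVerificationResult& result, double min_score) {
    assert(result.score >= 0.0 && result.score <= 1.0);
    assert(min_score >= 0.0 && min_score <= 1.0);
    if (result.reason == FakeReason::Reject) {
        return "rejected by service";
    }
    if (result.score < min_score) {
        return "accepted by service, below app threshold";
    }
    return "verified";
}
```

With `min_score` set to `0.0`, the function follows the service decision alone.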
8382

8483
## Text-independent verification
8584

86-
In contrast to **text-dependent** verification, **text-independent** verification does not require three audio samples, but *does* require 20 seconds of total audio.
85+
In contrast to *text-dependent* verification, *text-independent* verification doesn't require three audio samples but *does* require 20 seconds of total audio.
8786

8887
### TextIndependentVerification function
8988

90-
Start by creating the `TextIndependentVerification` function.
89+
Start by creating the `TextIndependentVerification` function:
9190

9291
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="text_independent_verification":::
9392

9493
Like the `TextDependentVerification` function, this function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method.
9594

96-
In this case you pass `VoiceProfileType::TextIndependentVerification` to `CreateProfileAsync`.
95+
In this case, you pass `VoiceProfileType::TextIndependentVerification` to `CreateProfileAsync`.
9796

9897
You then call two helper functions: `AddEnrollmentsToTextIndependentProfile`, which you'll define next, and `SpeakerVerify`, which you defined already. Finally, call [DeleteProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#deleteprofileasync) to clean up the profile.
9998

10099
### AddEnrollmentsToTextIndependentProfile
101100

102-
Define the following function to enroll a voice profile.
101+
Define the following function to enroll a voice profile:
103102

104103
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="add_enrollments_independent":::
105104

106-
In this function, you enroll audio samples in a `while` loop that tracks the number of seconds of audio remaining, and required, for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak into your microphone, and adds the sample to the voice profile.
105+
In this function, you enroll audio samples in a `while` loop that tracks the number of seconds of audio that are still required for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak into your microphone, and it adds the sample to the voice profile.
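
The same loop shape, tracking seconds instead of a sample count, can be sketched without the Speech SDK. This is a minimal illustration under stated assumptions: `kRequiredSeconds` encodes the 20 seconds of total audio mentioned earlier, and `SamplesNeeded` is a hypothetical helper that takes pre-measured sample lengths rather than recording from a microphone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Total speech required for text-independent enrollment, per the text above.
constexpr double kRequiredSeconds = 20.0;

// Consumes samples in order until enough audio has been collected.
// Returns how many samples were used, or -1 if the samples run out first.
int SamplesNeeded(const std::vector<double>& sample_seconds) {
    double remaining = kRequiredSeconds;
    std::size_t next = 0;
    while (remaining > 0.0) {
        if (next == sample_seconds.size()) {
            return -1;  // not enough audio to finish enrollment
        }
        assert(sample_seconds[next] >= 0.0);
        remaining -= sample_seconds[next];  // this sample counts toward the total
        ++next;
    }
    return static_cast<int>(next);
}
```

Samples of 8, 7, and 6 seconds complete enrollment after the third sample, while two 5-second samples are not enough.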
107106

108107
## Speaker identification
109108

110-
Speaker Identification is used to determine **who** is speaking from a given group of enrolled voices. The process is very similar to **text-independent verification**, with the main difference being able to verify against multiple voice profiles at once, rather than verifying against a single profile.
109+
Speaker identification is used to determine *who* is speaking from a given group of enrolled voices. The process is similar to *text-independent verification*. The main difference is the capability to verify against multiple voice profiles at once rather than verifying against a single profile.
111110

112111
### TextIndependentIdentification function
113112

114-
Start by creating the `TextIndependentIdentification` function.
113+
Start by creating the `TextIndependentIdentification` function:
115114

116115
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="text_independent_indentification":::
117116

118117
Like the `TextDependentVerification` and `TextIndependentVerification` functions, this function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method.
119118

120-
In this case you pass `VoiceProfileType::TextIndependentIdentification` to `CreateProfileAsync`.
119+
In this case, you pass `VoiceProfileType::TextIndependentIdentification` to `CreateProfileAsync`.
121120

122121
You then call two helper functions: `AddEnrollmentsToTextIndependentProfile`, which you defined already, and `SpeakerIdentify`, which you'll define next. Finally, call [DeleteProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#deleteprofileasync) to clean up the profile.
123122

124123
### SpeakerIdentify function
125124

126-
Define the `SpeakerIdentify` function as follows.
125+
Define the `SpeakerIdentify` function as follows:
127126

128127
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="speaker_identify":::
129128

130-
In this function, you create a [SpeakerIdentificationModel](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel) object with the [SpeakerIdentificationModel::FromProfiles](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel#fromprofiles) method. `SpeakerIdentificationModel::FromProfiles` accepts a list of [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) objects. In this case, you'll just pass in the `VoiceProfile` object you created earlier. However, if you want, you can pass in multiple `VoiceProfile` objects, each enrolled with audio samples from a different voice.
129+
In this function, you create a [SpeakerIdentificationModel](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel) object with the [SpeakerIdentificationModel::FromProfiles](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel#fromprofiles) method. `SpeakerIdentificationModel::FromProfiles` accepts a list of [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) objects. In this case, you pass in the `VoiceProfile` object you created earlier. If you want, you can pass in multiple `VoiceProfile` objects, each enrolled with audio samples from a different voice.
131130

132131
Next, [SpeechRecognizer::RecognizeOnceAsync](/cpp/cognitive-services/speech/speechrecognizer#recognizeonceasync) prompts you to speak again. This time it compares your voice to the enrolled voice profiles and returns the most similar voice profile.
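
Conceptually, returning the most similar profile is a maximum search over per-profile scores. The sketch below shows that idea only; it does not use the Speech SDK and is not how the service is implemented. `FakeProfileScore` and `MostSimilarProfile` are hypothetical names, and the scores are made-up inputs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pairing of an enrolled profile with its similarity score.
struct FakeProfileScore {
    std::string profile_id;
    double score;
};

// Returns the id of the highest-scoring profile.
// `candidates` must not be empty; on ties, the earliest candidate wins.
std::string MostSimilarProfile(const std::vector<FakeProfileScore>& candidates) {
    assert(!candidates.empty());
    const FakeProfileScore* best = &candidates.front();
    for (const FakeProfileScore& candidate : candidates) {
        if (candidate.score > best->score) {
            best = &candidate;
        }
    }
    return best->profile_id;
}
```

Passing a single candidate reduces this to the one-profile case that the quickstart uses.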
133132

134133
## Main function
135134

136-
Finally, define the `main` function as follows.
135+
Finally, define the `main` function as follows:
137136

138137
:::code language="cpp" source="~/cognitive-services-quickstart-code/cpp/speech/speaker-recognition.cpp" id="main":::
139138

140-
This function simply calls the functions you defined previously. First, though, it creates a [VoiceProfileClient](/cpp/cognitive-services/speech/speaker-voiceprofileclient) object and a [SpeakerRecognizer](/cpp/cognitive-services/speech/speaker-speakerrecognizer) object.
139+
This function calls the functions you defined previously. First, it creates a [VoiceProfileClient](/cpp/cognitive-services/speech/speaker-voiceprofileclient) object and a [SpeakerRecognizer](/cpp/cognitive-services/speech/speaker-speakerrecognizer) object.
141140

142141
```
143142
auto speech_config = GetSpeechConfig();
144143
auto client = VoiceProfileClient::FromConfig(speech_config);
145144
auto recognizer = SpeakerRecognizer::FromConfig(speech_config, audio_config);
146145
```
147146

148-
The `VoiceProfileClient` is used to create, enroll and delete voice profiles. The `SpeakerRecognizer` is used to validate speech samples against one or more enrolled voice profiles.
147+
The `VoiceProfileClient` object is used to create, enroll, and delete voice profiles. The `SpeakerRecognizer` object is used to validate speech samples against one or more enrolled voice profiles.
149148

150-
## Changing audio input type
149+
## Change audio input type
151150

152-
The examples in this article use the default device microphone as input for audio samples. However, in scenarios where you need to use audio files instead of microphone input, simply change the following line:
151+
The examples in this article use the default device microphone as input for audio samples. In scenarios where you need to use audio files instead of microphone input, change the following line:
153152

154153
```
155154
auto audio_config = Audio::AudioConfig::FromDefaultMicrophoneInput();
@@ -161,4 +160,4 @@ to:
161160
auto audio_config = Audio::AudioConfig::FromWavFileInput("path/to/your/file.wav");
162161
```
163162

164-
Or replace any use of `audio_config` with [Audio::AudioConfig::FromWavFileInput](/cpp/cognitive-services/speech/audio-audioconfig#fromwavfileinput). You can also have mixed inputs, using a microphone for enrollment and files for verification, for example.
163+
Or replace any use of `audio_config` with [Audio::AudioConfig::FromWavFileInput](/cpp/cognitive-services/speech/audio-audioconfig#fromwavfileinput). You can also have mixed inputs by using a microphone for enrollment and files for verification, for example.
