description: Learn how to use Speaker Recognition from the Speech SDK to answer the question, "who is speaking". In this quickstart, you learn about common design patterns for working with both speaker verification and identification, which both use voice biometry to identify unique voices.
4
+
description: Learn how to use speaker recognition from the Speech SDK to answer the question, "Who is speaking?" In this quickstart, you learn about common design patterns for working with speaker verification and identification, which both use voice biometry to identify unique voices.
articles/cognitive-services/Speech-Service/includes/how-to/speaker-recognition-basics/speaker-recognition-basics-cpp.md
Lines changed: 35 additions & 36 deletions
Original file line number
Diff line number
Diff line change
@@ -7,149 +7,148 @@ ms.author: v-jawe
7
7
ms.custom: references_regions, ignite-fall-2021
8
8
---
9
9
10
-
In this quickstart, you learn basic design patterns for Speaker Recognition using the Speech SDK, including:
10
+
In this quickstart, you learn basic design patterns for speaker recognition by using the Speech SDK, including:
11
11
12
-
* Text-dependent and text-independent verification
13
-
* Speaker identification to identify a voice sample among a group of voices
14
-
* Deleting voice profiles
15
-
16
-
For a high-level look at Speaker Recognition concepts, see the [overview](../../../speaker-recognition-overview.md) article. See the Reference node on left nav for a list of the supported platforms.
12
+
* Text-dependent and text-independent verification.
13
+
* Speaker identification to identify a voice sample among a group of voices.
14
+
* Deleting voice profiles.
17
15
16
+
For a high-level look at speaker recognition concepts, see the [Overview](../../../speaker-recognition-overview.md) article. See the **Reference node** in the left pane for a list of the supported platforms.
18
17
19
18
## Prerequisites
20
19
21
20
This article assumes that you have an Azure account and Speech service subscription. If you don't have an account and subscription, [try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
22
21
23
22
> [!IMPORTANT]
24
-
> Microsoft limits access to Speaker Recognition. Apply to use it through the [Azure Cognitive Services Speaker Recognition Limited Access Review](https://aka.ms/azure-speaker-recognition). After approval, you can access the Speaker Recognition APIs.
23
+
> Microsoft limits access to speaker recognition. Apply to use it through the [Azure Cognitive Services Speaker Recognition Limited Access Review](https://aka.ms/azure-speaker-recognition) form. After approval, you can access the Speaker Recognition APIs.
25
24
26
-
## Install the Speech SDK
25
+
### Install the Speech SDK
27
26
28
-
Before you can do anything, you'll need to install the Speech SDK. Depending on your platform, use the following instructions:
27
+
Before you start, you must install the Speech SDK. Depending on your platform, use the following instructions:
To call the Speech service using the Speech SDK, you need to create a [`SpeechConfig`](/cpp/cognitive-services/speech/speechconfig). This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
41
+
To call the Speech service by using the Speech SDK, create a [`SpeechConfig`](/cpp/cognitive-services/speech/speechconfig) object. This class holds information about your subscription, like your key and associated region, endpoint, host, or authorization token.
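One way to avoid hard-coding the key and region is to read them from the environment before building the configuration. The following is a minimal sketch that uses only the C++ standard library. The `SpeechSettings` struct, the helper names, and the `SPEECH_KEY` and `SPEECH_REGION` variable names are assumptions made for illustration and are not part of the Speech SDK.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical container for the two values the SDK configuration needs.
struct SpeechSettings {
    std::string key;
    std::string region;
};

// Reads an environment variable, returning an empty string when it is unset.
std::string ReadEnv(const char* name) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string();
}

// SPEECH_KEY and SPEECH_REGION are assumed names, not names the SDK requires.
SpeechSettings LoadSpeechSettings() {
    return SpeechSettings{ReadEnv("SPEECH_KEY"), ReadEnv("SPEECH_REGION")};
}

// True only when both values are present.
bool IsComplete(const SpeechSettings& settings) {
    return !settings.key.empty() && !settings.region.empty();
}
```

If `IsComplete` returns `false`, the application can stop early with a clear message instead of failing later on the first service call.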
Speaker Verification is the act of confirming that a speaker matches a known, or **enrolled** voice. The first step is to **enroll** a voice profile, so that the service has something to compare future voice samples against. In this example, you enroll the profile using a **text-dependent** strategy, which requires a specific passphrase to use for both enrollment and verification. See the [reference docs](/rest/api/speakerrecognition/) for a list of supported passphrases.
47
+
Speaker verification is the act of confirming that a speaker matches a known, or *enrolled*, voice. The first step is to enroll a voice profile so that the service has something to compare future voice samples against. In this example, you enroll the profile by using a *text-dependent* strategy, which requires a specific passphrase to use for enrollment and verification. See the [reference docs](/rest/api/speakerrecognition/) for a list of supported passphrases.
49
48
50
49
### TextDependentVerification function
51
50
52
-
Start by creating the `TextDependentVerification` function.
51
+
Start by creating the `TextDependentVerification` function:
This function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method. Note there are three [types](/cpp/cognitive-services/speech/microsoft-cognitiveservices-speech-namespace#enum-voiceprofiletype) of `VoiceProfile`:
55
+
This function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method. There are three [types](/cpp/cognitive-services/speech/microsoft-cognitiveservices-speech-namespace#enum-voiceprofiletype) of `VoiceProfile`:
57
56
58
57
- TextIndependentIdentification
59
58
- TextDependentVerification
60
59
- TextIndependentVerification
61
60
62
-
In this case you pass `VoiceProfileType::TextDependentVerification` to `CreateProfileAsync`.
61
+
In this case, you pass `VoiceProfileType::TextDependentVerification` to `CreateProfileAsync`.
63
62
64
63
You then call two helper functions that you'll define next, `AddEnrollmentsToTextDependentProfile` and `SpeakerVerify`. Finally, call [DeleteProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#deleteprofileasync) to clean up the profile.
65
64
66
65
### AddEnrollmentsToTextDependentProfile function
67
66
68
-
Define the following function to enroll a voice profile.
67
+
Define the following function to enroll a voice profile:
In this function, you enroll audio samples in a `while` loop that tracks the number of samples remaining, and required, for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak the passphrase into your microphone, and adds the sample to the voice profile.
71
+
In this function, you enroll audio samples in a `while` loop that tracks the number of samples still required for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak the passphrase into your microphone and adds the sample to the voice profile.
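The control flow of this loop can be sketched without the SDK. The following is an illustration only, not the quickstart's code: `EnrollmentStatus` and `EnrollUntilComplete` are hypothetical names, and the callback stands in for a call such as `EnrollProfileAsync` that reports how many samples are still needed.

```cpp
#include <functional>

// Hypothetical stand-in for the status reported after one enrollment attempt.
struct EnrollmentStatus {
    int remaining_samples;  // samples still required after this attempt
};

// Calls enroll_once until no samples remain or max_attempts is reached.
// Returns the number of attempts made. max_attempts is an assumed safety cap.
int EnrollUntilComplete(const std::function<EnrollmentStatus()>& enroll_once,
                        int max_attempts = 10) {
    int attempts = 0;
    int remaining = 1;  // unknown before the first result, so assume at least one
    while (remaining > 0 && attempts < max_attempts) {
        remaining = enroll_once().remaining_samples;
        ++attempts;
    }
    return attempts;
}
```

The attempt cap is a design choice for the sketch: it keeps the loop from running indefinitely if samples are repeatedly not counted toward enrollment.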
In this function, you create a [SpeakerVerificationModel](/cpp/cognitive-services/speech/speaker-speakerverificationmodel) object with the [SpeakerVerificationModel::FromProfile](/cpp/cognitive-services/speech/speaker-speakerverificationmodel#fromprofile) method, passing in the [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object you created earlier.
81
80
82
-
Next, [SpeechRecognizer::RecognizeOnceAsync](/cpp/cognitive-services/speech/speechrecognizer#recognizeonceasync) prompts you to speak the passphrase again, but this time it will validate it against your voice profile and return a similarity score ranging from 0.0-1.0. The [SpeakerRecognitionResult](/cpp/cognitive-services/speech/speaker-speakerrecognitionresult) object also returns `Accept` or `Reject`, based on whether or not the passphrase matches.
81
+
Next, [SpeakerRecognizer::RecognizeOnceAsync](/cpp/cognitive-services/speech/speaker-speakerrecognizer#recognizeonceasync) prompts you to speak the passphrase again. This time, the service validates the sample against your voice profile and returns a similarity score that ranges from 0.0 to 1.0. The [SpeakerRecognitionResult](/cpp/cognitive-services/speech/speaker-speakerrecognitionresult) object also returns `Accept` or `Reject` based on whether the passphrase matches.
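An application might combine the two values described here, the `Accept` or `Reject` verdict and the similarity score, into a single decision. The sketch below assumes hypothetical types (`Decision`, `VerificationOutcome`) that mirror those values. They are not SDK types, and the `min_score` floor is an application-level assumption, not a threshold the service documents.

```cpp
// Hypothetical mirror of the service's verdict.
enum class Decision { Accept, Reject };

// Hypothetical pairing of the verdict with the similarity score.
struct VerificationOutcome {
    Decision decision;  // Accept or Reject, as reported by the service
    double score;       // similarity score in the range 0.0 to 1.0
};

// Treats the speaker as verified only when the service accepted the sample
// and the score also clears an application-chosen floor (assumed value).
bool IsVerified(const VerificationOutcome& outcome, double min_score = 0.5) {
    return outcome.decision == Decision::Accept && outcome.score >= min_score;
}
```

Requiring both conditions lets an application be stricter than the service's verdict but never more lenient.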
83
82
84
83
## Text-independent verification
85
84
86
-
In contrast to **text-dependent** verification, **text-independent** verification does not require three audio samples, but *does* require 20 seconds of total audio.
85
+
In contrast to *text-dependent* verification, *text-independent* verification doesn't require three audio samples but *does* require 20 seconds of total audio.
87
86
88
87
### TextIndependentVerification function
89
88
90
-
Start by creating the `TextIndependentVerification` function.
89
+
Start by creating the `TextIndependentVerification` function:
Like the `TextDependentVerification` function, this function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method.
95
94
96
-
In this case you pass `VoiceProfileType::TextIndependentVerification` to `CreateProfileAsync`.
95
+
In this case, you pass `VoiceProfileType::TextIndependentVerification` to `CreateProfileAsync`.
97
96
98
97
You then call two helper functions: `AddEnrollmentsToTextIndependentProfile`, which you'll define next, and `SpeakerVerify`, which you defined already. Finally, call [DeleteProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#deleteprofileasync) to clean up the profile.
99
98
100
99
### AddEnrollmentsToTextIndependentProfile
101
100
102
-
Define the following function to enroll a voice profile.
101
+
Define the following function to enroll a voice profile:
In this function, you enroll audio samples in a `while` loop that tracks the number of seconds of audio remaining, and required, for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak into your microphone, and adds the sample to the voice profile.
105
+
In this function, you enroll audio samples in a `while` loop that tracks the number of seconds of audio still required for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak into your microphone and adds the sample to the voice profile.
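For text-independent enrollment, the quantity being tracked is time rather than a sample count. The following sketch shows that bookkeeping in isolation. `ClipsNeeded` is a hypothetical helper, and clip durations stand in for the remaining time that the service itself would report after each enrollment call.

```cpp
#include <vector>

// Returns how many clips are consumed before the required amount of audio is
// reached, or -1 if the clips run out first. The default of 20 seconds matches
// the total stated for text-independent verification in this article.
int ClipsNeeded(const std::vector<double>& clip_seconds,
                double required_seconds = 20.0) {
    double remaining = required_seconds;
    int used = 0;
    for (double seconds : clip_seconds) {
        if (remaining <= 0.0) break;  // enough audio collected already
        remaining -= seconds;
        ++used;
    }
    return remaining <= 0.0 ? used : -1;
}
```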
107
106
108
107
## Speaker identification
109
108
110
-
Speaker Identification is used to determine **who** is speaking from a given group of enrolled voices. The process is very similar to **text-independent verification**, with the main difference being able to verify against multiple voice profiles at once, rather than verifying against a single profile.
109
+
Speaker identification is used to determine *who* is speaking from a given group of enrolled voices. The process is similar to *text-independent verification*. The main difference is the capability to verify against multiple voice profiles at once rather than verifying against a single profile.
111
110
112
111
### TextIndependentIdentification function
113
112
114
-
Start by creating the `TextIndependentIdentification` function.
113
+
Start by creating the `TextIndependentIdentification` function:
Like the `TextDependentVerification` and `TextIndependentVerification` functions, this function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method.
119
118
120
-
In this case you pass `VoiceProfileType::TextIndependentIdentification` to `CreateProfileAsync`.
119
+
In this case, you pass `VoiceProfileType::TextIndependentIdentification` to `CreateProfileAsync`.
121
120
122
121
You then call two helper functions: `AddEnrollmentsToTextIndependentProfile`, which you defined already, and `SpeakerIdentify`, which you'll define next. Finally, call [DeleteProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#deleteprofileasync) to clean up the profile.
In this function, you create a [SpeakerIdentificationModel](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel) object with the [SpeakerIdentificationModel::FromProfiles](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel#fromprofiles) method. `SpeakerIdentificationModel::FromProfiles` accepts a list of [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) objects. In this case, you'll just pass in the `VoiceProfile` object you created earlier. However, if you want, you can pass in multiple `VoiceProfile` objects, each enrolled with audio samples from a different voice.
129
+
In this function, you create a [SpeakerIdentificationModel](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel) object with the [SpeakerIdentificationModel::FromProfiles](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel#fromprofiles) method. `SpeakerIdentificationModel::FromProfiles` accepts a list of [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) objects. In this case, you pass in the `VoiceProfile` object you created earlier. If you want, you can pass in multiple `VoiceProfile` objects, each enrolled with audio samples from a different voice.
131
130
132
131
Next, [SpeakerRecognizer::RecognizeOnceAsync](/cpp/cognitive-services/speech/speaker-speakerrecognizer#recognizeonceasync) prompts you to speak again. This time it compares your voice to the enrolled voice profiles and returns the most similar voice profile.
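The idea of returning the most similar profile can be illustrated as a maximum search over per-profile scores. The service performs this comparison itself, so the sketch below only models the concept. `ProfileScore` and `MostSimilar` are hypothetical names, not SDK types.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical pairing of a profile ID with its similarity score.
struct ProfileScore {
    std::string profile_id;
    double score;  // similarity in the range 0.0 to 1.0
};

// Returns a pointer to the highest-scoring candidate, or nullptr if the list
// is empty. The pointer is valid only while the vector is alive and unchanged.
const ProfileScore* MostSimilar(const std::vector<ProfileScore>& candidates) {
    if (candidates.empty()) return nullptr;
    return &*std::max_element(
        candidates.begin(), candidates.end(),
        [](const ProfileScore& a, const ProfileScore& b) {
            return a.score < b.score;
        });
}
```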
This function simply calls the functions you defined previously. First, though, it creates a [VoiceProfileClient](/cpp/cognitive-services/speech/speaker-voiceprofileclient) object and a [SpeakerRecognizer](/cpp/cognitive-services/speech/speaker-speakerrecognizer) object.
139
+
This function calls the functions you defined previously. First, it creates a [VoiceProfileClient](/cpp/cognitive-services/speech/speaker-voiceprofileclient) object and a [SpeakerRecognizer](/cpp/cognitive-services/speech/speaker-speakerrecognizer) object.
141
140
142
141
```
143
142
auto speech_config = GetSpeechConfig();
144
143
auto client = VoiceProfileClient::FromConfig(speech_config);
145
144
auto recognizer = SpeakerRecognizer::FromConfig(speech_config, audio_config);
146
145
```
147
146
148
-
The `VoiceProfileClient` is used to create, enroll and delete voice profiles. The `SpeakerRecognizer` is used to validate speech samples against one or more enrolled voice profiles.
147
+
The `VoiceProfileClient` object is used to create, enroll, and delete voice profiles. The `SpeakerRecognizer` object is used to validate speech samples against one or more enrolled voice profiles.
149
148
150
-
## Changing audio input type
149
+
## Change audio input type
151
150
152
-
The examples in this article use the default device microphone as input for audio samples. However, in scenarios where you need to use audio files instead of microphone input, simply change the following line:
151
+
The examples in this article use the default device microphone as input for audio samples. In scenarios where you need to use audio files instead of microphone input, change the following line:
153
152
154
153
```
155
154
auto audio_config = Audio::AudioConfig::FromDefaultMicrophoneInput();
@@ -161,4 +160,4 @@ to:
161
160
auto audio_config = Audio::AudioConfig::FromWavFileInput("path/to/your/file.wav");
162
161
```
163
162
164
-
Or replace any use of `audio_config` with [Audio::AudioConfig::FromWavFileInput](/cpp/cognitive-services/speech/audio-audioconfig#fromwavfileinput). You can also have mixed inputs, using a microphone for enrollment and files for verification, for example.
163
+
Or replace any use of `audio_config` with [Audio::AudioConfig::FromWavFileInput](/cpp/cognitive-services/speech/audio-audioconfig#fromwavfileinput). You can also have mixed inputs by using a microphone for enrollment and files for verification, for example.