articles/cognitive-services/Speech-Service/includes/common/rest.md
[Speech-to-text REST API v3.0 reference](https://westus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-0) | [Speech-to-text REST API for short audio reference](../../rest-speech-to-text-short.md) | [Additional Samples on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk)
```cpp
// ... (earlier lines of this setup block are elided)
auto audio_config = Audio::AudioConfig::FromDefaultMicrophoneInput();
// 10,000,000 ticks per second: the service reports audio lengths in 100-nanosecond ticks.
auto ticks_per_second = 10000000;
```
## Create a speech configuration
To call the Speech service by using the Speech SDK, create an instance of the [`SpeechConfig`](/cpp/cognitive-services/speech/speechconfig) class. This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
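A minimal sketch of such a configuration, written here as the `GetSpeechConfig` helper that the `main` function later in this article uses; the key and region strings are placeholders for your own values:

```cpp
#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;

std::shared_ptr<SpeechConfig> GetSpeechConfig()
{
    // Placeholders: substitute your own subscription key and service region (for example, "westus").
    return SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
}
```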
## Text-dependent verification

The `TextDependentVerification` function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method. There are three [types](/cpp/cognitive-services/speech/microsoft-cognitiveservices-speech-namespace#enum-voiceprofiletype) of `VoiceProfile`:

- Text-independent identification
- Text-dependent verification
- Text-independent verification

In this case, you pass `VoiceProfileType::TextDependentVerification` to `CreateProfileAsync`.
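For example, the profile-creation call might look like the following sketch; `client` is the `VoiceProfileClient` created in the `main` function later in this article, and the `"en-us"` locale is an assumed placeholder:

```cpp
// A sketch: create a voice profile for text-dependent verification.
auto profile = client->CreateProfileAsync(VoiceProfileType::TextDependentVerification, "en-us").get();
std::cout << "Created profile ID: " << profile->GetId() << "\n";
```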
You then call two helper functions that you'll define next, `AddEnrollmentsToTextDependentProfile` and `SpeakerVerify`.
Define the following function to enroll a voice profile:
std::cout << "No passphrases received, enrollment not attempted.\n\n";
112
+
}
113
+
}
114
+
std::cout << "Enrollment completed.\n\n";
115
+
}
116
+
```
63
117
64
118
In this function, you enroll audio samples in a `while` loop that tracks the number of remaining samples required for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak the passphrase into your microphone and adds the sample to the voice profile.
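A minimal sketch of how that loop condition might be written, assuming the SDK's `EnrollmentInfoType::RemainingEnrollmentsCount` counter; the `enroll_result` name is illustrative:

```cpp
// Keep enrolling samples until the service reports that none are required.
shared_ptr<VoiceProfileEnrollmentResult> enroll_result = nullptr;
while (enroll_result == nullptr ||
    enroll_result->GetEnrollmentInfo(EnrollmentInfoType::RemainingEnrollmentsCount) > 0)
{
    std::cout << "Speak the passphrase into your microphone.\n";
    enroll_result = client->EnrollProfileAsync(profile, audio_config).get();
}
```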
Define the following function to verify a voice profile:

```cpp
void SpeakerVerify(shared_ptr<VoiceProfile> profile, shared_ptr<SpeakerRecognizer> recognizer)
{
    shared_ptr<SpeakerVerificationModel> model = SpeakerVerificationModel::FromProfile(profile);
    std::cout << "Speak the passphrase to verify: \"My voice is my passport, verify me.\"\n";
    shared_ptr<SpeakerRecognitionResult> result = recognizer->RecognizeOnceAsync(model).get();
    std::cout << "Verified voice profile for speaker: " << result->ProfileId << ". Score is: " << result->GetScore() << ".\n\n";
}
```
In this function, you create a [SpeakerVerificationModel](/cpp/cognitive-services/speech/speaker-speakerverificationmodel) object with the [SpeakerVerificationModel::FromProfile](/cpp/cognitive-services/speech/speaker-speakerverificationmodel#fromprofile) method, passing in the [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object you created earlier.
## Text-independent verification

In contrast to *text-dependent* verification, *text-independent* verification doesn't require three audio samples, but it *does* require 20 seconds of total audio.
Start by creating the `TextIndependentVerification` function:
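A sketch of what this function plausibly contains, mirroring the text-dependent flow described above; the locale string is a placeholder, and the cleanup call is an assumption:

```cpp
void TextIndependentVerification(shared_ptr<VoiceProfileClient> client, shared_ptr<SpeakerRecognizer> recognizer)
{
    std::cout << "Text Independent Verification:\n\n";
    // Create the profile (the locale is a placeholder).
    auto profile = client->CreateProfileAsync(VoiceProfileType::TextIndependentVerification, "en-us").get();
    AddEnrollmentsToTextIndependentProfile(client, profile);
    SpeakerVerify(profile, recognizer);
    // Assumed cleanup: remove the profile when you're done with it.
    client->DeleteProfileAsync(profile).get();
}
```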
Like the `TextDependentVerification` function, this function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method.
You then call two helper functions: `AddEnrollmentsToTextIndependentProfile`, which you'll define next, and `SpeakerVerify`, which you defined earlier.
Define the following function to enroll a voice profile:
std::cout << "No activation phrases received, enrollment not attempted.\n\n";
187
+
}
188
+
}
189
+
std::cout << "Enrollment completed.\n\n";
190
+
}
191
+
```
97
192
98
193
In this function, you enroll audio samples in a `while` loop that tracks the number of remaining seconds of audio required for enrollment. In each iteration, [EnrollProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#enrollprofileasync) prompts you to speak into your microphone and adds the sample to the voice profile.
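A sketch of what that loop condition might look like, assuming the SDK's `EnrollmentInfoType::RemainingEnrollmentsSpeechLength` value, which reports the remaining audio in 100-nanosecond ticks (hence the `ticks_per_second` constant defined earlier):

```cpp
// Enroll until no more audio is required; the remaining length is reported in ticks.
shared_ptr<VoiceProfileEnrollmentResult> enroll_result = nullptr;
while (enroll_result == nullptr ||
    enroll_result->GetEnrollmentInfo(EnrollmentInfoType::RemainingEnrollmentsSpeechLength) > 0)
{
    std::cout << "Continue speaking to add to the profile enrollment sample.\n";
    enroll_result = client->EnrollProfileAsync(profile, audio_config).get();
    std::cout << "Remaining audio needed: "
        << enroll_result->GetEnrollmentInfo(EnrollmentInfoType::RemainingEnrollmentsSpeechLength) / ticks_per_second
        << " seconds.\n";
}
```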
## Speaker identification

Speaker identification is used to determine *who* is speaking from a given group of enrolled voices.
Start by creating the `TextIndependentIdentification` function:
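A sketch of what this function plausibly contains, following the same pattern as the verification functions; the locale string is a placeholder, and the cleanup call is an assumption:

```cpp
void TextIndependentIdentification(shared_ptr<VoiceProfileClient> client, shared_ptr<SpeakerRecognizer> recognizer)
{
    std::cout << "Speaker Identification:\n\n";
    // Create the profile (the locale is a placeholder).
    auto profile = client->CreateProfileAsync(VoiceProfileType::TextIndependentIdentification, "en-us").get();
    AddEnrollmentsToTextIndependentProfile(client, profile);
    SpeakerIdentify(profile, recognizer);
    // Assumed cleanup: remove the profile when you're done with it.
    client->DeleteProfileAsync(profile).get();
}
```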
Like the `TextDependentVerification` and `TextIndependentVerification` functions, this function creates a [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) object with the [CreateProfileAsync](/cpp/cognitive-services/speech/speaker-voiceprofileclient#createprofileasync) method.
You then call two helper functions: `AddEnrollmentsToTextIndependentProfile`, which you defined earlier, and `SpeakerIdentify`, which you'll define next.
Define the following function to identify a speaker:

```cpp
void SpeakerIdentify(shared_ptr<VoiceProfile> profile, shared_ptr<SpeakerRecognizer> recognizer)
{
    shared_ptr<SpeakerIdentificationModel> model = SpeakerIdentificationModel::FromProfiles({ profile });
    // Note: We need at least four seconds of audio after pauses are subtracted.
    std::cout << "Please speak for at least ten seconds to identify who it is from your list of enrolled speakers.\n";
    shared_ptr<SpeakerRecognitionResult> result = recognizer->RecognizeOnceAsync(model).get();
    std::cout << "The most similar voice profile is: " << result->ProfileId << " with similarity score: " << result->GetScore() << ".\n\n";
}
```
In this function, you create a [SpeakerIdentificationModel](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel) object with the [SpeakerIdentificationModel::FromProfiles](/cpp/cognitive-services/speech/speaker-speakeridentificationmodel#fromprofiles) method. `SpeakerIdentificationModel::FromProfiles` accepts a list of [VoiceProfile](/cpp/cognitive-services/speech/speaker-voiceprofile) objects. In this case, you pass in the `VoiceProfile` object you created earlier. If you want, you can pass in multiple `VoiceProfile` objects, each enrolled with audio samples from a different voice.
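For example, a model built from two hypothetical profiles; `profile_alice` and `profile_bob` are illustrative names for profiles enrolled separately from different voices:

```cpp
// A sketch: identify the speaker among multiple enrolled profiles.
auto model = SpeakerIdentificationModel::FromProfiles({ profile_alice, profile_bob });
```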
The `main` function calls the functions you defined previously. First, it creates a [VoiceProfileClient](/cpp/cognitive-services/speech/speaker-voiceprofileclient) object and a [SpeakerRecognizer](/cpp/cognitive-services/speech/speaker-speakerrecognizer) object.
```cpp
auto speech_config = GetSpeechConfig();
auto client = VoiceProfileClient::FromConfig(speech_config);
auto recognizer = SpeakerRecognizer::FromConfig(speech_config, audio_config);
// ... (the rest of this function is elided)
```
The `VoiceProfileClient` object is used to create, enroll, and delete voice profiles. The `SpeakerRecognizer` object is used to validate speech samples against one or more enrolled voice profiles.
The examples in this article use the default device microphone as input for audio samples. In scenarios where you need to use audio files instead of microphone input, change the following line:
```cpp
auto audio_config = Audio::AudioConfig::FromDefaultMicrophoneInput();
```
to:
```cpp
auto audio_config = Audio::AudioConfig::FromWavFileInput("path/to/your/file.wav");
```
Or replace any use of `audio_config` with [Audio::AudioConfig::FromWavFileInput](/cpp/cognitive-services/speech/audio-audioconfig#fromwavfileinput). You can also mix inputs: for example, use a microphone for enrollment and files for verification.
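A sketch of that mixed setup; the .wav file name is a placeholder:

```cpp
// Enroll from the default microphone, then verify from a recorded file.
auto enrollment_audio = Audio::AudioConfig::FromDefaultMicrophoneInput();
auto verification_audio = Audio::AudioConfig::FromWavFileInput("verification_sample.wav");
```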