Commit 0a19304

Merge pull request #229144 from eric-urban/eur/lid-updates
speech translation LID and code style
2 parents aea4f18 + 054a7f8 commit 0a19304

15 files changed: +355 -509 lines

articles/cognitive-services/Speech-Service/includes/how-to/recognize-speech/cpp.md

Lines changed: 22 additions & 16 deletions
````diff
@@ -20,7 +20,7 @@ Create a `SpeechConfig` instance by using your key and region. Create a Speech r
 using namespace std;
 using namespace Microsoft::CognitiveServices::Speech;
 
-auto config = SpeechConfig::FromSubscription("YourSpeechKey", "YourSpeechRegion");
+auto speechConfig = SpeechConfig::FromSubscription("YourSpeechKey", "YourSpeechRegion");
 ```
 
 You can initialize `SpeechConfig` in a few other ways:
@@ -40,10 +40,10 @@ To recognize speech by using your device microphone, create an [`AudioConfig`](/
 using namespace Microsoft::CognitiveServices::Speech::Audio;
 
 auto audioConfig = AudioConfig::FromDefaultMicrophoneInput();
-auto recognizer = SpeechRecognizer::FromConfig(config, audioConfig);
+auto speechRecognizer = SpeechRecognizer::FromConfig(config, audioConfig);
 
 cout << "Speak into your microphone." << std::endl;
-auto result = recognizer->RecognizeOnceAsync().get();
+auto result = speechRecognizer->RecognizeOnceAsync().get();
 cout << "RECOGNIZED: Text=" << result->Text << std::endl;
 ```
 
@@ -56,10 +56,10 @@ If you want to recognize speech from an audio file instead of using a microphone
 ```cpp
 using namespace Microsoft::CognitiveServices::Speech::Audio;
 
-auto audioInput = AudioConfig::FromWavFileInput("YourAudioFile.wav");
-auto recognizer = SpeechRecognizer::FromConfig(config, audioInput);
+auto audioConfig = AudioConfig::FromWavFileInput("YourAudioFile.wav");
+auto speechRecognizer = SpeechRecognizer::FromConfig(config, audioConfig);
 
-auto result = recognizer->RecognizeOnceAsync().get();
+auto result = speechRecognizer->RecognizeOnceAsync().get();
 cout << "RECOGNIZED: Text=" << result->Text << std::endl;
 ```
 
@@ -72,7 +72,7 @@ The [Recognizer class](/cpp/cognitive-services/speech/speechrecognizer) for the
 Single-shot recognition asynchronously recognizes a single utterance. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed. Here's an example of asynchronous single-shot recognition via [`RecognizeOnceAsync`](/cpp/cognitive-services/speech/speechrecognizer#recognizeonceasync):
 
 ```cpp
-auto result = recognizer->RecognizeOnceAsync().get();
+auto result = speechRecognizer->RecognizeOnceAsync().get();
 ```
 
 You need to write some code to handle the result. This sample evaluates [`result->Reason`](/cpp/cognitive-services/speech/recognitionresult#reason) and:
@@ -114,8 +114,8 @@ Continuous recognition is a bit more involved than single-shot recognition. It r
 Start by defining the input and initializing [`SpeechRecognizer`](/cpp/cognitive-services/speech/speechrecognizer):
 
 ```cpp
-auto audioInput = AudioConfig::FromWavFileInput("YourAudioFile.wav");
-auto recognizer = SpeechRecognizer::FromConfig(config, audioInput);
+auto audioConfig = AudioConfig::FromWavFileInput("YourAudioFile.wav");
+auto speechRecognizer = SpeechRecognizer::FromConfig(config, audioConfig);
 ```
 
 Next, create a variable to manage the state of speech recognition. Declare `promise<void>` because at the start of recognition, you can safely assume that it's not finished:
@@ -132,12 +132,12 @@ Next, subscribe to the events that [`SpeechRecognizer`](/cpp/cognitive-services/
 * [`Canceled`](/cpp/cognitive-services/speech/asyncrecognizer#canceled): Signal for events that contain canceled recognition results. These results indicate a recognition attempt that was canceled as a result or a direct cancellation request. Alternatively, they indicate a transport or protocol failure.
 
 ```cpp
-recognizer->Recognizing.Connect([](const SpeechRecognitionEventArgs& e)
+speechRecognizer->Recognizing.Connect([](const SpeechRecognitionEventArgs& e)
 {
     cout << "Recognizing:" << e.Result->Text << std::endl;
 });
 
-recognizer->Recognized.Connect([](const SpeechRecognitionEventArgs& e)
+speechRecognizer->Recognized.Connect([](const SpeechRecognitionEventArgs& e)
 {
     if (e.Result->Reason == ResultReason::RecognizedSpeech)
     {
@@ -150,7 +150,7 @@ recognizer->Recognized.Connect([](const SpeechRecognitionEventArgs& e)
     }
 });
 
-recognizer->Canceled.Connect([&recognitionEnd](const SpeechRecognitionCanceledEventArgs& e)
+speechRecognizer->Canceled.Connect([&recognitionEnd](const SpeechRecognitionCanceledEventArgs& e)
 {
     cout << "CANCELED: Reason=" << (int)e.Reason << std::endl;
     if (e.Reason == CancellationReason::Error)
@@ -163,7 +163,7 @@ recognizer->Canceled.Connect([&recognitionEnd](const SpeechRecognitionCanceledEv
     }
 });
 
-recognizer->SessionStopped.Connect([&recognitionEnd](const SessionEventArgs& e)
+speechRecognizer->SessionStopped.Connect([&recognitionEnd](const SessionEventArgs& e)
 {
     cout << "Session stopped.";
     recognitionEnd.set_value(); // Notify to stop recognition.
@@ -174,25 +174,31 @@ With everything set up, call [`StopContinuousRecognitionAsync`](/cpp/cognitive-s
 
 ```cpp
 // Starts continuous recognition. Uses StopContinuousRecognitionAsync() to stop recognition.
-recognizer->StartContinuousRecognitionAsync().get();
+speechRecognizer->StartContinuousRecognitionAsync().get();
 
 // Waits for recognition end.
 recognitionEnd.get_future().get();
 
 // Stops recognition.
-recognizer->StopContinuousRecognitionAsync().get();
+speechRecognizer->StopContinuousRecognitionAsync().get();
 ```
 
 ## Change the source language
 
 A common task for speech recognition is specifying the input (or source) language. The following example shows how you would change the input language to German. In your code, find your [`SpeechConfig`](/cpp/cognitive-services/speech/speechconfig) instance and add this line directly below it:
 
 ```cpp
-config->SetSpeechRecognitionLanguage("de-DE");
+speechConfig->SetSpeechRecognitionLanguage("de-DE");
 ```
 
 [`SetSpeechRecognitionLanguage`](/cpp/cognitive-services/speech/speechconfig#setspeechrecognitionlanguage) is a parameter that takes a string as an argument. Refer to the [list of supported speech-to-text locales](../../../language-support.md?tabs=stt).
 
+## Language identification
+
+You can use [language identification](../../../language-identification.md?pivots=programming-language-cpp#speech-to-text) with Speech-to-text recognition when you need to identify the language in an audio source and then transcribe it to text.
+
+For a complete code sample, see [language identification](../../../language-identification.md?pivots=programming-language-cpp#speech-to-text).
+
 ## Use a custom endpoint
 
 With [Custom Speech](../../../custom-speech-overview.md), you can upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint. The following example shows how to set a custom endpoint.
````

articles/cognitive-services/Speech-Service/includes/how-to/recognize-speech/csharp.md

Lines changed: 26 additions & 21 deletions
````diff
@@ -58,10 +58,10 @@ class Program
     async static Task FromMic(SpeechConfig speechConfig)
     {
         using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
-        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
+        using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
 
         Console.WriteLine("Speak into your microphone.");
-        var result = await recognizer.RecognizeOnceAsync();
+        var result = await speechRecognizer.RecognizeOnceAsync();
         Console.WriteLine($"RECOGNIZED: Text={result.Text}");
     }
 
@@ -91,9 +91,9 @@ class Program
     async static Task FromFile(SpeechConfig speechConfig)
     {
         using var audioConfig = AudioConfig.FromWavFileInput("PathToFile.wav");
-        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
+        using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
 
-        var result = await recognizer.RecognizeOnceAsync();
+        var result = await speechRecognizer.RecognizeOnceAsync();
         Console.WriteLine($"RECOGNIZED: Text={result.Text}");
     }
 
@@ -125,18 +125,18 @@ class Program
     async static Task FromStream(SpeechConfig speechConfig)
    {
         var reader = new BinaryReader(File.OpenRead("PathToFile.wav"));
-        using var audioInputStream = AudioInputStream.CreatePushStream();
-        using var audioConfig = AudioConfig.FromStreamInput(audioInputStream);
-        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
+        using var audioConfigStream = AudioInputStream.CreatePushStream();
+        using var audioConfig = AudioConfig.FromStreamInput(audioConfigStream);
+        using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
 
         byte[] readBytes;
         do
         {
             readBytes = reader.ReadBytes(1024);
-            audioInputStream.Write(readBytes, readBytes.Length);
+            audioConfigStream.Write(readBytes, readBytes.Length);
         } while (readBytes.Length > 0);
 
-        var result = await recognizer.RecognizeOnceAsync();
+        var result = await speechRecognizer.RecognizeOnceAsync();
         Console.WriteLine($"RECOGNIZED: Text={result.Text}");
     }
 
@@ -185,13 +185,13 @@ switch (result.Reason)
 
 The previous examples use single-shot recognition, which recognizes a single utterance. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed.
 
-In contrast, you use continuous recognition when you want to control when to stop recognizing. It requires you to subscribe to the `Recognizing`, `Recognized`, and `Canceled` events to get the recognition results. To stop recognition, you must call [`StopContinuousRecognitionAsync`](/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer.stopcontinuousrecognitionasync). Here's an example of how continuous recognition is performed on an audio input file.
+In contrast, you use continuous recognition when you want to control when to stop recognizing. It requires you to subscribe to the `Recognizing`, `Recognized`, and `Canceled` events to get the recognition results. To stop recognition, you must call [`StopContinuousRecognitionAsync`](/dotnet/api/microsoft.cognitiveservices.speech.speechspeechRecognizer.stopcontinuousrecognitionasync). Here's an example of how continuous recognition is performed on an audio input file.
 
 Start by defining the input and initializing [`SpeechRecognizer`](/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer):
 
 ```csharp
 using var audioConfig = AudioConfig.FromWavFileInput("YourAudioFile.wav");
-using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
+using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
 ```
 
 Then create a `TaskCompletionSource<int>` instance to manage the state of speech recognition:
@@ -202,18 +202,18 @@ var stopRecognition = new TaskCompletionSource<int>();
 
 Next, subscribe to the events that `SpeechRecognizer` sends:
 
-* [`Recognizing`](/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer.recognizing): Signal for events that contain intermediate recognition results.
-* [`Recognized`](/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer.recognized): Signal for events that contain final recognition results, which indicate a successful recognition attempt.
-* [`SessionStopped`](/dotnet/api/microsoft.cognitiveservices.speech.recognizer.sessionstopped): Signal for events that indicate the end of a recognition session (operation).
-* [`Canceled`](/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer.canceled): Signal for events that contain canceled recognition results. These results indicate a recognition attempt that was canceled as a result or a direct cancellation request. Alternatively, they indicate a transport or protocol failure.
+* [`Recognizing`](/dotnet/api/microsoft.cognitiveservices.speech.speechspeechRecognizer.recognizing): Signal for events that contain intermediate recognition results.
+* [`Recognized`](/dotnet/api/microsoft.cognitiveservices.speech.speechspeechRecognizer.recognized): Signal for events that contain final recognition results, which indicate a successful recognition attempt.
+* [`SessionStopped`](/dotnet/api/microsoft.cognitiveservices.speech.speechRecognizer.sessionstopped): Signal for events that indicate the end of a recognition session (operation).
+* [`Canceled`](/dotnet/api/microsoft.cognitiveservices.speech.speechspeechRecognizer.canceled): Signal for events that contain canceled recognition results. These results indicate a recognition attempt that was canceled as a result or a direct cancellation request. Alternatively, they indicate a transport or protocol failure.
 
 ```csharp
-recognizer.Recognizing += (s, e) =>
+speechRecognizer.Recognizing += (s, e) =>
 {
     Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
 };
 
-recognizer.Recognized += (s, e) =>
+speechRecognizer.Recognized += (s, e) =>
 {
     if (e.Result.Reason == ResultReason.RecognizedSpeech)
     {
@@ -225,7 +225,7 @@ recognizer.Recognized += (s, e) =>
     }
 };
 
-recognizer.Canceled += (s, e) =>
+speechRecognizer.Canceled += (s, e) =>
 {
     Console.WriteLine($"CANCELED: Reason={e.Reason}");
 
@@ -239,7 +239,7 @@ recognizer.Canceled += (s, e) =>
     stopRecognition.TrySetResult(0);
 };
 
-recognizer.SessionStopped += (s, e) =>
+speechRecognizer.SessionStopped += (s, e) =>
 {
     Console.WriteLine("\n Session stopped event.");
     stopRecognition.TrySetResult(0);
@@ -249,13 +249,13 @@ recognizer.SessionStopped += (s, e) =>
 With everything set up, call `StartContinuousRecognitionAsync` to start recognizing:
 
 ```csharp
-await recognizer.StartContinuousRecognitionAsync();
+await speechRecognizer.StartContinuousRecognitionAsync();
 
 // Waits for completion. Use Task.WaitAny to keep the task rooted.
 Task.WaitAny(new[] { stopRecognition.Task });
 
 // Make the following call at some point to stop recognition:
-// await recognizer.StopContinuousRecognitionAsync();
+// await speechRecognizer.StopContinuousRecognitionAsync();
 ```
 
 ## Change the source language
@@ -268,6 +268,11 @@ speechConfig.SpeechRecognitionLanguage = "it-IT";
 
 The [`SpeechRecognitionLanguage`](/dotnet/api/microsoft.cognitiveservices.speech.speechconfig.speechrecognitionlanguage) property expects a language-locale format string. Refer to the [list of supported speech-to-text locales](../../../language-support.md?tabs=stt).
 
+## Language identification
+
+You can use [language identification](../../../language-identification.md?pivots=programming-language-csharp#speech-to-text) with Speech-to-text recognition when you need to identify the language in an audio source and then transcribe it to text.
+
+For a complete code sample, see [language identification](../../../language-identification.md?pivots=programming-language-csharp#speech-to-text).
 
 ## Use a custom endpoint
 
````