Commit 88ba289

committed: speech translation LID and code style
1 parent 017a597 commit 88ba289

File tree

5 files changed: +183 −372 lines changed

articles/cognitive-services/Speech-Service/includes/how-to/translate-speech/cpp.md

Lines changed: 56 additions & 144 deletions
@@ -23,7 +23,7 @@ For more information on environment variables, see [Environment variables and ap
 
 ## Create a speech translation configuration
 
-To call the Speech service by using the Speech SDK, you need to create a [`SpeechTranslationConfig`][config] instance. This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
+To call the Speech service by using the Speech SDK, you need to create a [`SpeechTranslationConfig`][speechtranslationconfig] instance. This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
 
 > [!TIP]
 > Regardless of whether you're performing speech recognition, speech synthesis, translation, or intent recognition, you'll always create a configuration.
@@ -42,7 +42,7 @@ auto SPEECH__SUBSCRIPTION__KEY = getenv("SPEECH__SUBSCRIPTION__KEY");
 auto SPEECH__SERVICE__REGION = getenv("SPEECH__SERVICE__REGION");
 
 void translateSpeech() {
-    auto config =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 }
 
@@ -59,11 +59,11 @@ One common task of speech translation is specifying the input (or source) langua
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     // Source (input) language
-    translationConfig->SetSpeechRecognitionLanguage("it-IT");
+    speechTranslationConfig->SetSpeechRecognitionLanguage("it-IT");
 }
 ```
 
@@ -75,38 +75,37 @@ Another common task of speech translation is to specify target translation langu
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
-    translationConfig->SetSpeechRecognitionLanguage("it-IT");
+    speechTranslationConfig->SetSpeechRecognitionLanguage("it-IT");
 
-    // Translate to languages. See https://aka.ms/speech/sttt-languages
-    translationConfig->AddTargetLanguage("fr");
-    translationConfig->AddTargetLanguage("de");
+    speechTranslationConfig->AddTargetLanguage("fr");
+    speechTranslationConfig->AddTargetLanguage("de");
 }
 ```
 
 With every call to [`AddTargetLanguage`][addlang], a new target translation language is specified. In other words, when speech is recognized from the source language, each target translation is available as part of the resulting translation operation.
 
 ## Initialize a translation recognizer
 
-After you've created a [`SpeechTranslationConfig`][config] instance, the next step is to initialize [`TranslationRecognizer`][recognizer]. When you initialize `TranslationRecognizer`, you need to pass it your `translationConfig` instance. The configuration object provides the credentials that the Speech service requires to validate your request.
+After you've created a [`SpeechTranslationConfig`][speechtranslationconfig] instance, the next step is to initialize [`TranslationRecognizer`][translationrecognizer]. When you initialize `TranslationRecognizer`, you need to pass it your `speechTranslationConfig` instance. The configuration object provides the credentials that the Speech service requires to validate your request.
 
 If you're recognizing speech by using your device's default microphone, here's what `TranslationRecognizer` should look like:
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     auto fromLanguage = "en-US";
     auto toLanguages = { "it", "fr", "de" };
-    translationConfig->SetSpeechRecognitionLanguage(fromLanguage);
+    speechTranslationConfig->SetSpeechRecognitionLanguage(fromLanguage);
     for (auto language : toLanguages) {
-        translationConfig->AddTargetLanguage(language);
+        speechTranslationConfig->AddTargetLanguage(language);
     }
 
-    auto recognizer = TranslationRecognizer::FromConfig(translationConfig);
+    auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig);
 }
 ```
 
@@ -119,37 +118,37 @@ First, reference the `AudioConfig` object as follows:
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     auto fromLanguage = "en-US";
     auto toLanguages = { "it", "fr", "de" };
-    translationConfig->SetSpeechRecognitionLanguage(fromLanguage);
+    speechTranslationConfig->SetSpeechRecognitionLanguage(fromLanguage);
     for (auto language : toLanguages) {
-        translationConfig->AddTargetLanguage(language);
+        speechTranslationConfig->AddTargetLanguage(language);
     }
 
     auto audioConfig = AudioConfig::FromDefaultMicrophoneInput();
-    auto recognizer = TranslationRecognizer::FromConfig(translationConfig, audioConfig);
+    auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig, audioConfig);
 }
 ```
 
 If you want to provide an audio file instead of using a microphone, you still need to provide an `audioConfig` parameter. However, when you create an `AudioConfig` class instance, instead of calling `FromDefaultMicrophoneInput`, you call `FromWavFileInput` and pass the `filename` parameter:
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     auto fromLanguage = "en-US";
     auto toLanguages = { "it", "fr", "de" };
-    translationConfig->SetSpeechRecognitionLanguage(fromLanguage);
+    speechTranslationConfig->SetSpeechRecognitionLanguage(fromLanguage);
     for (auto language : toLanguages) {
-        translationConfig->AddTargetLanguage(language);
+        speechTranslationConfig->AddTargetLanguage(language);
     }
 
     auto audioConfig = AudioConfig::FromWavFileInput("YourAudioFile.wav");
-    auto recognizer = TranslationRecognizer::FromConfig(translationConfig, audioConfig);
+    auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig, audioConfig);
 }
 ```
 
@@ -159,20 +158,20 @@ To translate speech, the Speech SDK relies on a microphone or an audio file inpu
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     string fromLanguage = "en-US";
     string toLanguages[3] = { "it", "fr", "de" };
-    translationConfig->SetSpeechRecognitionLanguage(fromLanguage);
+    speechTranslationConfig->SetSpeechRecognitionLanguage(fromLanguage);
     for (auto language : toLanguages) {
-        translationConfig->AddTargetLanguage(language);
+        speechTranslationConfig->AddTargetLanguage(language);
     }
 
-    auto recognizer = TranslationRecognizer::FromConfig(translationConfig);
+    auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig);
     cout << "Say something in '" << fromLanguage << "' and we'll translate...\n";
 
-    auto result = recognizer->RecognizeOnceAsync().get();
+    auto result = translationRecognizer->RecognizeOnceAsync().get();
     if (result->Reason == ResultReason::TranslatedSpeech)
     {
         cout << "Recognized: \"" << result->Text << "\"" << std::endl;
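The successful-result branch above prints `result->Text`; the translations themselves arrive in the result's `Translations` collection, keyed by target-language code. As a stand-in sketch that uses a plain `std::map` instead of SDK types (the helper `formatTranslations` is illustrative, not part of the Speech SDK), the iteration pattern looks like this:

```cpp
#include <map>
#include <sstream>
#include <string>

// Stand-in for iterating result->Translations (target-language code ->
// translated text); formats each entry the way the article's samples print them.
std::string formatTranslations(const std::map<std::string, std::string>& translations) {
    std::ostringstream out;
    for (const auto& pair : translations) {
        out << "Translated into '" << pair.first << "': " << pair.second << "\n";
    }
    return out.str();
}
```

Because `std::map` iterates in key order, output is deterministic per language code, which mirrors how the later manual-synthesis sample walks `result->Translations` pair by pair.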
@@ -196,26 +195,25 @@ After a successful speech recognition and translation, the result contains all t
 
 The `TranslationRecognizer` object exposes a `Synthesizing` event. The event fires several times and provides a mechanism to retrieve the synthesized audio from the translation recognition result. If you're translating to multiple languages, see [Manual synthesis](#manual-synthesis).
 
-Specify the synthesis voice by assigning a [`SetVoiceName`][voicename] instance, and provide an event handler for the `Synthesizing` event to get the audio. The following example saves the translated audio as a .wav file.
+Specify the synthesis voice by assigning a [`SetVoiceName`][setvoicename] instance, and provide an event handler for the `Synthesizing` event to get the audio. The following example saves the translated audio as a .wav file.
 
 > [!IMPORTANT]
-> The event-based synthesis works only with a single translation. *Do not* add multiple target translation languages. Additionally, the [`SetVoiceName`][voicename] value should be the same language as the target translation language. For example, `"de"` could map to `"de-DE-Hedda"`.
+> The event-based synthesis works only with a single translation. *Do not* add multiple target translation languages. Additionally, the [`SetVoiceName`][setvoicename] value should be the same language as the target translation language. For example, `"de"` could map to `"de-DE-Hedda"`.
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     auto fromLanguage = "en-US";
     auto toLanguage = "de";
-    translationConfig->SetSpeechRecognitionLanguage(fromLanguage);
-    translationConfig->AddTargetLanguage(toLanguage);
+    speechTranslationConfig->SetSpeechRecognitionLanguage(fromLanguage);
+    speechTranslationConfig->AddTargetLanguage(toLanguage);
 
-    // See: https://aka.ms/speech/sdkregion#standard-and-neural-voices
-    translationConfig->SetVoiceName("de-DE-Hedda");
+    speechTranslationConfig->SetVoiceName("de-DE-Hedda");
 
-    auto recognizer = TranslationRecognizer::FromConfig(translationConfig);
-    recognizer->Synthesizing.Connect([](const TranslationSynthesisEventArgs& e)
+    auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig);
+    translationRecognizer->Synthesizing.Connect([](const TranslationSynthesisEventArgs& e)
     {
         auto audio = e.Result->Audio;
         auto size = audio.size();
@@ -231,7 +229,7 @@ void translateSpeech() {
 
     cout << "Say something in '" << fromLanguage << "' and we'll translate...\n";
 
-    auto result = recognizer->RecognizeOnceAsync().get();
+    auto result = translationRecognizer->RecognizeOnceAsync().get();
     if (result->Reason == ResultReason::TranslatedSpeech)
     {
         cout << "Recognized: \"" << result->Text << "\"" << std::endl;
@@ -253,24 +251,23 @@ The following example translates to five languages. Each translation is then syn
 
 ```cpp
 void translateSpeech() {
-    auto translationConfig =
+    auto speechTranslationConfig =
         SpeechTranslationConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
 
     auto fromLanguage = "en-US";
     auto toLanguages = { "de", "en", "it", "pt", "zh-Hans" };
-    translationConfig->SetSpeechRecognitionLanguage(fromLanguage);
+    speechTranslationConfig->SetSpeechRecognitionLanguage(fromLanguage);
     for (auto language : toLanguages) {
-        translationConfig->AddTargetLanguage(language);
+        speechTranslationConfig->AddTargetLanguage(language);
     }
 
-    auto recognizer = TranslationRecognizer::FromConfig(translationConfig);
+    auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig);
 
     cout << "Say something in '" << fromLanguage << "' and we'll translate...\n";
 
-    auto result = recognizer->RecognizeOnceAsync().get();
+    auto result = translationRecognizer->RecognizeOnceAsync().get();
     if (result->Reason == ResultReason::TranslatedSpeech)
     {
-        // See: https://aka.ms/speech/sdkregion#standard-and-neural-voices
         map<string, string> languageToVoiceMap;
         languageToVoiceMap["de"] = "de-DE-KatjaNeural";
         languageToVoiceMap["en"] = "en-US-AriaNeural";
@@ -285,14 +282,14 @@ void translateSpeech() {
         auto translation = pair.second;
         cout << "Translated into '" << language << "': " << translation << std::endl;
 
-        auto speech_config =
+        auto speechConfig =
             SpeechConfig::FromSubscription(SPEECH__SUBSCRIPTION__KEY, SPEECH__SERVICE__REGION);
-        speech_config->SetSpeechSynthesisVoiceName(languageToVoiceMap[language]);
+        speechConfig->SetSpeechSynthesisVoiceName(languageToVoiceMap[language]);
 
-        auto audio_config = AudioConfig::FromWavFileOutput(language + "-translation.wav");
-        auto synthesizer = SpeechSynthesizer::FromConfig(speech_config, audio_config);
+        auto audioConfig = AudioConfig::FromWavFileOutput(language + "-translation.wav");
+        auto speechSynthesizer = SpeechSynthesizer::FromConfig(speechConfig, audioConfig);
 
-        synthesizer->SpeakTextAsync(translation).get();
+        speechSynthesizer->SpeakTextAsync(translation).get();
         }
     }
 }
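The `languageToVoiceMap` above pairs each target language with a neural voice name, and the loop looks voices up with `operator[]`, which silently yields an empty voice name for any language missing from the map. A small lookup helper with an explicit fallback (the helper name and fallback choice are illustrative, not Speech SDK API) avoids that edge case:

```cpp
#include <map>
#include <string>

// Pick a synthesis voice for a target language, falling back to a default
// voice when the language isn't in the map. Illustrative helper, not SDK API.
std::string voiceForLanguage(const std::map<std::string, std::string>& voices,
                             const std::string& language,
                             const std::string& fallback) {
    auto it = voices.find(language);
    return it != voices.end() ? it->second : fallback;
}
```

Inside the loop, the lookup would then read, for example, `speechConfig->SetSpeechSynthesisVoiceName(voiceForLanguage(languageToVoiceMap, language, "en-US-AriaNeural"));`.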
@@ -302,109 +299,24 @@ For more information about speech synthesis, see [the basics of speech synthesis
 
 ## Multilingual translation with language identification
 
-In many scenarios, you might not know which input languages to specify. Using language identification allows you to specify up to 10 possible input languages and automatically translate to your target languages.
+In many scenarios, you might not know which input languages to specify. Using language identification, you can detect up to 10 possible input languages and automatically translate to your target languages.
 
-The following example uses continuous translation from an audio file. It automatically detects the input language, even if the language being spoken is changing. When you run the sample, `en-US` and `zh-CN` will be automatically detected because they're defined in `AutoDetectSourceLanguageConfig`. Then, the speech will be translated to `de` and `fr` as specified in the calls to `AddTargetLanguage()`.
-
-> [!IMPORTANT]
-> This feature is currently in **preview**.
+The following example uses continuous translation from an audio file. When you run the sample, `en-US` and `zh-CN` will be automatically detected because they're defined in `AutoDetectSourceLanguageConfig`. Then, the speech will be translated to `de` and `fr` as specified in the calls to `AddTargetLanguage()`.
 
 ```cpp
-using namespace std;
-using namespace Microsoft::CognitiveServices::Speech;
-using namespace Microsoft::CognitiveServices::Speech::Audio;
-
-void MultiLingualTranslation()
-{
-    auto region = "<paste-your-region>";
-    // Currently, the v2 endpoint is required for this design pattern
-    auto endpointString = std::format("wss://{}.stt.speech.microsoft.com/speech/universal/v2", region);
-    auto config = SpeechConfig::FromEndpoint(endpointString, "<paste-your-subscription-key>");
-
-    config->SetProperty(PropertyId::SpeechServiceConnection_ContinuousLanguageIdPriority, "Latency");
-    auto autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "zh-CN" });
-
-    promise<void> recognitionEnd;
-    // Source language is required, but is currently NoOp
-    auto fromLanguage = "en-US";
-    config->SetSpeechRecognitionLanguage(fromLanguage);
-    config->AddTargetLanguage("de");
-    config->AddTargetLanguage("fr");
-
-    auto audioInput = AudioConfig::FromWavFileInput("path-to-your-audio-file.wav");
-    auto recognizer = TranslationRecognizer::FromConfig(config, autoDetectSourceLanguageConfig, audioInput);
-
-    recognizer->Recognizing.Connect([](const TranslationRecognitionEventArgs& e)
-        {
-            std::string lidResult = e.Result->Properties.GetProperty(PropertyId::SpeechServiceConnection_AutoDetectSourceLanguageResult);
-
-            cout << "Recognizing in Language = " << lidResult << ":" << e.Result->Text << std::endl;
-            for (const auto& it : e.Result->Translations)
-            {
-                cout << "  Translated into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
-            }
-        });
-
-    recognizer->Recognized.Connect([](const TranslationRecognitionEventArgs& e)
-        {
-            if (e.Result->Reason == ResultReason::TranslatedSpeech)
-            {
-                std::string lidResult = e.Result->Properties.GetProperty(PropertyId::SpeechServiceConnection_AutoDetectSourceLanguageResult);
-                cout << "RECOGNIZED in Language = " << lidResult << ": Text=" << e.Result->Text << std::endl;
-            }
-            else if (e.Result->Reason == ResultReason::RecognizedSpeech)
-            {
-                cout << "RECOGNIZED: Text=" << e.Result->Text << " (text could not be translated)" << std::endl;
-            }
-            else if (e.Result->Reason == ResultReason::NoMatch)
-            {
-                cout << "NOMATCH: Speech could not be recognized." << std::endl;
-            }
-
-            for (const auto& it : e.Result->Translations)
-            {
-                cout << "  Translated into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
-            }
-        });
-
-    recognizer->Canceled.Connect([&recognitionEnd](const TranslationRecognitionCanceledEventArgs& e)
-        {
-            cout << "CANCELED: Reason=" << (int)e.Reason << std::endl;
-            if (e.Reason == CancellationReason::Error)
-            {
-                cout << "CANCELED: ErrorCode=" << (int)e.ErrorCode << std::endl;
-                cout << "CANCELED: ErrorDetails=" << e.ErrorDetails << std::endl;
-                cout << "CANCELED: Did you set the speech resource key and region values?" << std::endl;
-
-                recognitionEnd.set_value();
-            }
-        });
-
-    recognizer->Synthesizing.Connect([](const TranslationSynthesisEventArgs& e)
-        {
-            auto size = e.Result->Audio.size();
-            cout << "Translation synthesis result: size of audio data: " << size
-                << (size == 0 ? "(END)" : "");
-        });
-
-    recognizer->SessionStopped.Connect([&recognitionEnd](const SessionEventArgs& e)
-        {
-            cout << "Session stopped.";
-            recognitionEnd.set_value();
-        });
-
-    // Start continuous recognition. Use StopContinuousRecognitionAsync() to stop recognition.
-    recognizer->StartContinuousRecognitionAsync().get();
-    recognitionEnd.get_future().get();
-    recognizer->StopContinuousRecognitionAsync().get();
-}
+speechTranslationConfig->AddTargetLanguage("de");
+speechTranslationConfig->AddTargetLanguage("fr");
+auto autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "zh-CN" });
+auto translationRecognizer = TranslationRecognizer::FromConfig(speechTranslationConfig, autoDetectSourceLanguageConfig, audioConfig);
 ```
 
-[config]: /cpp/cognitive-services/speech/translation-speechtranslationconfig
+For the complete code sample, see [language identification](../../../language-identification.md?pivots=programming-language-cpp#speech-translation).
+
+[speechtranslationconfig]: /cpp/cognitive-services/speech/translation-speechtranslationconfig
 [audioconfig]: /cpp/cognitive-services/speech/audio-audioconfig
-[recognizer]: /cpp/cognitive-services/speech/translation-translationrecognizer
+[translationrecognizer]: /cpp/cognitive-services/speech/translation-translationrecognizer
 [recognitionlang]: /cpp/cognitive-services/speech/speechconfig#setspeechrecognitionlanguage
 [addlang]: /cpp/cognitive-services/speech/translation-speechtranslationconfig#addtargetlanguage
 [translations]: /cpp/cognitive-services/speech/translation-translationrecognitionresult#translations
-[voicename]: /cpp/cognitive-services/speech/translation-speechtranslationconfig#setvoicename
+[setvoicename]: /cpp/cognitive-services/speech/translation-speechtranslationconfig#setvoicename
 [speechsynthesisvoicename]: /cpp/cognitive-services/speech/speechconfig#setspeechsynthesisvoicename

0 commit comments
