For more information about speech to text, see [the basics of speech recognition](../../../get-started-speech-to-text.md).
## Event-based translation
The `TranslationRecognizer` object exposes a `Recognizing` event. The event fires several times and provides a mechanism to retrieve the intermediate translation results.
> [!NOTE]
> Intermediate translation results aren't available when you use [multi-lingual speech translation](#multi-lingual-speech-translation-without-source-language-candidates).
The following example prints the intermediate translation results to the console:
```csharp
using (var audioInput = AudioConfig.FromWavFileInput(@"whatstheweatherlike.wav"))
{
    using (var translationRecognizer = new TranslationRecognizer(config, audioInput))
    {
        // Subscribes to events.
        translationRecognizer.Recognizing += (s, e) =>
        {
            Console.WriteLine($"RECOGNIZING in '{fromLanguage}': Text={e.Result.Text}");
            foreach (var element in e.Result.Translations)
            {
                Console.WriteLine($"    TRANSLATING into '{element.Key}': {element.Value}");
            }
        };

        translationRecognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.TranslatedSpeech)
            {
                Console.WriteLine($"RECOGNIZED in '{fromLanguage}': Text={e.Result.Text}");
                foreach (var element in e.Result.Translations)
                {
                    Console.WriteLine($"    TRANSLATED into '{element.Key}': {element.Value}");
                }
            }
        };
    }
}
```
After a successful speech recognition and translation, the result contains all the translations in a dictionary. The [`Translations`][translations] dictionary key is the target translation language, and the value is the translated text. Recognized speech can be translated and then synthesized in a different language (speech-to-speech).
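As a minimal sketch of reading that dictionary (assuming a `translationRecognizer` configured as in the earlier examples), a single recognition might be handled like this:

```csharp
// Recognize a single utterance and read each target-language translation.
var result = await translationRecognizer.RecognizeOnceAsync();
if (result.Reason == ResultReason.TranslatedSpeech)
{
    Console.WriteLine($"Recognized: {result.Text}");
    foreach (var translation in result.Translations)
    {
        // The key is the target language code; the value is the translated text.
        Console.WriteLine($"Translated into '{translation.Key}': {translation.Value}");
    }
}
```

Each entry in `result.Translations` corresponds to one target language added with `AddTargetLanguage`.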
The following example anticipates that `en-US` or `zh-CN` should be detected because they're defined as candidate languages.
For a complete code sample, see [language identification](../../../language-identification.md?pivots=programming-language-csharp#run-speech-translation).
## Multi-lingual speech translation without source language candidates
Multi-lingual speech translation introduces a new level of speech translation technology that unlocks various capabilities, including having no specified input language and handling language switches within the same session. These capabilities can be built directly into your products.
Currently, when you use language identification with speech translation, you must create the `SpeechTranslationConfig` object from the v2 endpoint. Replace the string "YourServiceRegion" with your Speech resource region (such as "westus"), and replace "YourSubscriptionKey" with your Speech resource key.
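As a minimal sketch of that setup (the endpoint string below follows the v2 WebSocket endpoint format and is an assumption; the placeholders are yours to replace):

```csharp
// Create the translation config from the v2 endpoint for your region.
// Replace the placeholder region and key with your own values.
var v2Endpoint = new Uri("wss://YourServiceRegion.stt.speech.microsoft.com/speech/universal/v2");
var config = SpeechTranslationConfig.FromEndpoint(v2Endpoint, "YourSubscriptionKey");
```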
Specify the translation target languages, replacing them with languages of your choice. You can add more lines to target additional languages.
```csharp
config.AddTargetLanguage("de");
config.AddTargetLanguage("fr");
```
A key differentiator of multi-lingual speech translation is that you don't need to specify the source language, because the service detects it automatically. Create the `AutoDetectSourceLanguageConfig` object with the `fromOpenRange` method to let the service know that you want to use multi-lingual speech translation with no specified source language.
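A minimal sketch, assuming `config` and an `audioConfig` created as in the earlier examples:

```csharp
// Open-range detection: no source language candidates are specified;
// the service detects the spoken language automatically.
var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromOpenRange();
using var translationRecognizer =
    new TranslationRecognizer(config, autoDetectSourceLanguageConfig, audioConfig);
```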
For a complete code sample with the Speech SDK, see [speech translation samples on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/translation_samples.cs#L472).
Automatic multi-lingual speech translation is available in public preview. It helps overcome language barriers by enabling seamless communication across diverse linguistic landscapes.
##### Key highlights
- Unspecified input language: Multi-lingual speech translation can receive audio in a wide range of languages, without the need to specify the expected input language. This makes it invaluable for understanding and collaborating across global contexts without presetting languages.
- Language switching: Multi-lingual speech translation allows multiple languages to be spoken during the same session and has them all translated into the same target language. You don't need to restart the session or take any other action when the input language changes.
##### Main use cases
- Travel interpreter: Multi-lingual speech translation can enhance the experience of tourists visiting foreign destinations by providing them with information and assistance in their preferred language. Hotel concierge services, guided tours, and visitor centers can use this technology to cater to diverse linguistic needs.
- International conferences: Multi-lingual speech translation can facilitate communication among participants from different regions who might speak various languages, using live translated captions. Attendees can speak in their native languages without needing to specify them, ensuring seamless understanding and collaboration.
- Educational meetings: In multi-cultural classrooms or online learning environments, multi-lingual speech translation can support language diversity among students and teachers. It allows for seamless communication and participation without the need to specify each student's or instructor's language.
##### How to access
For a detailed introduction, visit [Speech translation overview](../../speech-translation.md). Additionally, you can refer to the code samples at [how to translate speech](../../how-to-translate-speech.md). This new feature is fully supported by all SDK versions from 1.37.0 onwards.
#### Real-time speech to text with diarization (GA)
Real-time speech to text with diarization is now generally available.
You can create speech to text applications that use diarization to distinguish between the different speakers who participate in the conversation. For more information about real-time diarization, check out the [real-time diarization quickstart](../../get-started-stt-diarization.md).
#### Speech to text model update
[Real-time speech to text](../../how-to-recognize-speech.md) has released new models with bilingual capabilities. The `en-IN` model now supports both English and Hindi bilingual scenarios and offers improved accuracy. Arabic locales (`ar-AE`, `ar-BH`, `ar-DZ`, `ar-IL`, `ar-IQ`, `ar-KW`, `ar-LB`, `ar-LY`, `ar-MA`, `ar-OM`, `ar-PS`, `ar-QA`, `ar-SA`, `ar-SY`, `ar-TN`, `ar-YE`) are now equipped with bilingual support for English, enhanced accuracy, and call center support.
[Batch transcription](../../batch-transcription.md) provides models with a new architecture for these locales: `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, and `zh-CN`. These models significantly enhance readability and entity recognition.
### March 2024 release
How to use:
Choose `es-US` (Spanish and English) or `fr-CA` (French and English) when you call the Speech service API, or try it out in Speech Studio. Feel free to speak either language or mix them together; the model adapts dynamically, providing accurate and context-aware responses in both languages.
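For example, with the Speech SDK you might select the bilingual locale like this (a sketch; `YourSubscriptionKey` and `YourServiceRegion` are placeholders to replace with your own values):

```csharp
// Select the es-US bilingual model; speakers can mix Spanish and English.
var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
speechConfig.SpeechRecognitionLanguage = "es-US";
```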
It's time to elevate your communication game with our latest feature release—seamless, multi-lingual communication at your fingertips!