
Commit 0bc82ed

Merge pull request #273929 from MicrosoftDocs/main
4/30/2024 PM Publish
2 parents 6717640 + 1f7b8a5 commit 0bc82ed

File tree: 189 files changed, +3226 / -3596 lines


articles/ai-services/document-intelligence/quickstarts/try-document-intelligence-studio.md

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ monikerRange: '>=doc-intel-3.0.0'
* A [**Document Intelligence**](https://portal.azure.com/#create/Microsoft.CognitiveServicesFormRecognizer) or [**multi-service**](https://portal.azure.com/#create/Microsoft.CognitiveServicesAllInOne) resource.

> [!TIP]
- > Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Currently [Microsoft Entra authentication](../../../active-directory/authentication/overview-authentication.md) is not supported on Document Intelligence Studio to access Document Intelligence service APIs. To use Document Intelligence Studio, enabling access key-based authentication/local authentication is necessary.
+ > Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. Please note that you'll need a single-service resource if you intend to use [Microsoft Entra authentication](../../../active-directory/authentication/overview-authentication.md).

#### Azure role assignments

articles/ai-services/openai/concepts/abuse-monitoring.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ author: mrbullwinkle
ms.author: mbullwin
ms.service: azure-ai-openai
ms.topic: conceptual
- ms.date: 06/16/2023
+ ms.date: 04/30/2024
ms.custom: template-concept
manager: nitinme
---

articles/ai-services/openai/reference.md

Lines changed: 1 addition & 603 deletions
Large diffs are not rendered by default.

articles/ai-services/speech-service/includes/how-to/recognize-speech/cpp.md

Lines changed: 1 addition & 1 deletion
@@ -171,7 +171,7 @@ speechRecognizer->SessionStopped.Connect([&recognitionEnd](const SessionEventArg

```cpp
});
```

- With everything set up, call [`StopContinuousRecognitionAsync`](/cpp/cognitive-services/speech/speechrecognizer#startcontinuousrecognitionasync) to start recognizing:
+ With everything set up, call [`StartContinuousRecognitionAsync`](/cpp/cognitive-services/speech/speechrecognizer#startcontinuousrecognitionasync) to start recognizing:

```cpp
// Starts continuous recognition. Uses StopContinuousRecognitionAsync() to stop recognition.
```

articles/ai-services/speech-service/includes/how-to/translate-speech/csharp.md

Lines changed: 88 additions & 1 deletion
@@ -197,6 +197,64 @@ static async Task TranslateSpeechAsync()

For more information about speech to text, see [the basics of speech recognition](../../../get-started-speech-to-text.md).

## Event based translation

The `TranslationRecognizer` object exposes a `Recognizing` event. The event fires several times and provides a mechanism to retrieve the intermediate translation results.

> [!NOTE]
> Intermediate translation results aren't available when you use [multi-lingual speech translation](#multi-lingual-speech-translation-without-source-language-candidates).

The following example prints the intermediate translation results to the console:

```csharp
using (var audioInput = AudioConfig.FromWavFileInput(@"whatstheweatherlike.wav"))
{
    using (var translationRecognizer = new TranslationRecognizer(config, audioInput))
    {
        // Subscribes to events.
        translationRecognizer.Recognizing += (s, e) =>
        {
            Console.WriteLine($"RECOGNIZING in '{fromLanguage}': Text={e.Result.Text}");
            foreach (var element in e.Result.Translations)
            {
                Console.WriteLine($"    TRANSLATING into '{element.Key}': {element.Value}");
            }
        };

        translationRecognizer.Recognized += (s, e) => {
            if (e.Result.Reason == ResultReason.TranslatedSpeech)
            {
                Console.WriteLine($"RECOGNIZED in '{fromLanguage}': Text={e.Result.Text}");
                foreach (var element in e.Result.Translations)
                {
                    Console.WriteLine($"    TRANSLATED into '{element.Key}': {element.Value}");
                }
            }
            else if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
                Console.WriteLine($"    Speech not translated.");
            }
            else if (e.Result.Reason == ResultReason.NoMatch)
            {
                Console.WriteLine($"NOMATCH: Speech could not be recognized.");
            }
        };

        // Starts continuous recognition. Uses StopContinuousRecognitionAsync() to stop recognition.
        Console.WriteLine("Start translation...");
        await translationRecognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);

        // Waits for completion.
        // Use Task.WaitAny to keep the task rooted.
        Task.WaitAny(new[] { stopTranslation.Task });

        // Stops translation.
        await translationRecognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
    }
}
```

## Synthesize translations

After a successful speech recognition and translation, the result contains all the translations in a dictionary. The [`Translations`][translations] dictionary key is the target translation language, and the value is the translated text. Recognized speech can be translated and then synthesized in a different language (speech-to-speech).
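For illustration, a minimal hypothetical sketch of reading that dictionary after a single-shot recognition might look like the following; it assumes a `translationRecognizer` configured with target languages as in the snippets above:

```csharp
// Assumes: using Microsoft.CognitiveServices.Speech; and a configured translationRecognizer.
// RecognizeOnceAsync captures a single utterance; Translations maps target language code -> translated text.
var result = await translationRecognizer.RecognizeOnceAsync();
if (result.Reason == ResultReason.TranslatedSpeech)
{
    Console.WriteLine($"RECOGNIZED: {result.Text}");
    foreach (var translation in result.Translations)
    {
        Console.WriteLine($"Translated into '{translation.Key}': {translation.Value}");
    }
}
```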
@@ -314,11 +372,40 @@ The following example anticipates that `en-US` or `zh-CN` should be detected bec
```csharp
speechTranslationConfig.AddTargetLanguage("de");
speechTranslationConfig.AddTargetLanguage("fr");
var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "zh-CN" });
- var translationRecognizer = new TranslationRecognizer(speechTranslationConfig, autoDetectSourceLanguageConfig, audioConfig)
+ var translationRecognizer = new TranslationRecognizer(speechTranslationConfig, autoDetectSourceLanguageConfig, audioConfig);
```

For a complete code sample, see [language identification](../../../language-identification.md?pivots=programming-language-csharp#run-speech-translation).

## Multi-lingual speech translation without source language candidates

Multi-lingual speech translation unlocks various capabilities, including translating with no specified input language and handling language switches within the same session. These features enable a new level of speech translation that you can implement into your products.

Currently, when you use Language ID with speech translation, you must create the `SpeechTranslationConfig` object from the v2 endpoint. Replace the string "YourServiceRegion" with your Speech resource region (such as "westus"). Replace "YourSubscriptionKey" with your Speech resource key.

```csharp
var v2EndpointInString = String.Format("wss://{0}.stt.speech.microsoft.com/speech/universal/v2", "YourServiceRegion");
var v2EndpointUrl = new Uri(v2EndpointInString);
var speechTranslationConfig = SpeechTranslationConfig.FromEndpoint(v2EndpointUrl, "YourSubscriptionKey");
```

Specify the translation target languages. Replace them with languages of your choice. You can add more lines.

```csharp
speechTranslationConfig.AddTargetLanguage("de");
speechTranslationConfig.AddTargetLanguage("fr");
```

A key differentiator with multi-lingual speech translation is that you don't need to specify the source language, because the service detects it automatically. Create the `AutoDetectSourceLanguageConfig` object with the `FromOpenRange` method to let the service know that you want to use multi-lingual speech translation with no specified source language.

```csharp
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromOpenRange();
var translationRecognizer = new TranslationRecognizer(speechTranslationConfig, autoDetectSourceLanguageConfig, audioConfig);
```

For a complete code sample with the Speech SDK, see [speech translation samples on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/translation_samples.cs#L472).
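As a rough end-to-end sketch of how the pieces above might fit together (an illustration, not the linked sample itself), the following console program combines the v2 endpoint, an open-range `AutoDetectSourceLanguageConfig`, and continuous recognition. The region, key, and target language are placeholders, and the PascalCase `FromOpenRange` spelling is assumed for C#:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;

class Program
{
    static async Task Main()
    {
        // The v2 endpoint is required for multi-lingual speech translation (see above).
        var v2Endpoint = new Uri(string.Format("wss://{0}.stt.speech.microsoft.com/speech/universal/v2", "YourServiceRegion"));
        var speechTranslationConfig = SpeechTranslationConfig.FromEndpoint(v2Endpoint, "YourSubscriptionKey");
        speechTranslationConfig.AddTargetLanguage("en");

        // Open range: no source language candidates are specified; the service detects the language.
        var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromOpenRange();

        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var translationRecognizer = new TranslationRecognizer(speechTranslationConfig, autoDetectSourceLanguageConfig, audioConfig);

        // Print each final translation as it arrives.
        translationRecognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.TranslatedSpeech)
            {
                foreach (var element in e.Result.Translations)
                {
                    Console.WriteLine($"TRANSLATED into '{element.Key}': {element.Value}");
                }
            }
        };

        await translationRecognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
        Console.WriteLine("Speak in any supported language. Press Enter to stop.");
        Console.ReadLine();
        await translationRecognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
    }
}
```

Continuous recognition is used here because the source language can change mid-session; a single-shot call would capture only one utterance.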
[speechtranslationconfig]: /dotnet/api/microsoft.cognitiveservices.speech.speechtranslationconfig
[audioconfig]: /dotnet/api/microsoft.cognitiveservices.speech.audio.audioconfig
[translationrecognizer]: /dotnet/api/microsoft.cognitiveservices.speech.translation.translationrecognizer

articles/ai-services/speech-service/includes/language-support/speech-translation.md

Lines changed: 2 additions & 2 deletions
@@ -1,12 +1,12 @@
---
author: eric-urban
ms.service: azure-ai-speech
- ms.date: 08/22/2022
+ ms.date: 4/24/2024
ms.topic: include
ms.author: eur
---

- | Text language| Language code |
+ | Text language | Language code |
|:------------------------|:-------------:|
| Afrikaans | `af` |
| Albanian | `sq` |

articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md

Lines changed: 25 additions & 6 deletions
@@ -2,23 +2,42 @@
author: eric-urban
ms.service: azure-ai-speech
ms.topic: include
- ms.date: 3/13/2024
+ ms.date: 4/22/2024
ms.author: eur
---

### April 2024 release

#### Automatic multi-lingual speech translation (Preview)

Automatic multi-lingual speech translation is available in public preview. The feature translates speech without requiring a preset input language and handles language switches within the same session, helping overcome language barriers in diverse settings.

##### Key highlights

- Unspecified input language: Multi-lingual speech translation can receive audio in a wide range of languages, with no need to specify the expected input language. This makes it invaluable for understanding and collaborating across global contexts without presetting a language.
- Language switching: Multi-lingual speech translation allows multiple languages to be spoken during the same session and has them all translated into the same target language. You don't need to restart a session when the input language changes or take any other action.

##### Use cases

- Travel interpreter: Multi-lingual speech translation can enhance the experience of tourists visiting foreign destinations by providing them with information and assistance in their preferred language. Hotel concierge services, guided tours, and visitor centers can use this technology to cater to diverse linguistic needs.
- International conferences: Multi-lingual speech translation can facilitate communication among participants from different regions who might speak various languages, using live translated captions. Attendees can speak in their native languages without needing to specify them, ensuring seamless understanding and collaboration.
- Educational meetings: In multi-cultural classrooms or online learning environments, multi-lingual speech translation can support language diversity among students and teachers. It allows for seamless communication and participation without the need to specify each student's or instructor's language.

##### How to access

For a detailed introduction, visit the [Speech translation overview](../../speech-translation.md). You can also refer to the code samples in [how to translate speech](../../how-to-translate-speech.md). The feature is supported by Speech SDK versions 1.37.0 and later.
#### Real-time speech to text with diarization (GA)

Real-time speech to text with diarization is now generally available.

- Check out [Real-time diarization quickstart](../../get-started-stt-diarization.md) to learn more about how to create speech to text applications that use diarization to distinguish between the different speakers who participate in the conversation.
+ You can create speech to text applications that use diarization to distinguish between the different speakers who participate in the conversation. For more information about real-time diarization, check out the [real-time diarization quickstart](../../get-started-stt-diarization.md).

- #### Speech to Text model Update
+ #### Speech to text model update

- [Real-time Speech to Text](../../how-to-recognize-speech.md) has released new models with bilingual capabilities. The `en-IN` model now support both English and Hindi bilingual scenarios and offers improved accuracy. Arabic locales (`ar-AE`, `ar-BH`, `ar-DZ`, `ar-IL`, `ar-IQ`, `ar-KW`, `ar-LB`, `ar-LY`, `ar-MA`, `ar-OM`, `ar-PS`, `ar-QA`, `ar-SA`, `ar-SY`, `ar-TN`, `ar-YE`) are now equipped with bilingual support for English, enhanced accuracy and call center support.
+ [Real-time speech to text](../../how-to-recognize-speech.md) has released new models with bilingual capabilities. The `en-IN` model now supports both English and Hindi bilingual scenarios and offers improved accuracy. Arabic locales (`ar-AE`, `ar-BH`, `ar-DZ`, `ar-IL`, `ar-IQ`, `ar-KW`, `ar-LB`, `ar-LY`, `ar-MA`, `ar-OM`, `ar-PS`, `ar-QA`, `ar-SA`, `ar-SY`, `ar-TN`, `ar-YE`) are now equipped with bilingual support for English, enhanced accuracy and call center support.

- [Batch transcription](../../batch-transcription.md) has launched models with new architecture for `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, `zh-CN`. These models significantly enhance readability and entity recognition.
+ [Batch transcription](../../batch-transcription.md) provides models with new architecture for these locales: `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, and `zh-CN`. These models significantly enhance readability and entity recognition.

### March 2024 release

@@ -82,7 +101,7 @@ How to Use:

Choose es-US (Spanish and English) or fr-CA (French and English) when you call the Speech Service API or try it out on Speech Studio. Feel free to speak either language or mix them together—the model is designed to adapt dynamically, providing accurate and context-aware responses in both languages.
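For illustration, a minimal hypothetical sketch that selects the `es-US` bilingual model with the Speech SDK might look like the following; the subscription key and region are placeholders:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // es-US bilingual model: speakers can use Spanish, English, or a mix of both.
        speechConfig.SpeechRecognitionLanguage = "es-US";

        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Recognize a single utterance from the default microphone.
        var result = await speechRecognizer.RecognizeOnceAsync();
        if (result.Reason == ResultReason.RecognizedSpeech)
        {
            Console.WriteLine($"RECOGNIZED: {result.Text}");
        }
    }
}
```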

- It's time to elevate your communication game with our latest feature release—seamless, multilingual communication at your fingertips!
+ It's time to elevate your communication game with our latest feature release—seamless, multi-lingual communication at your fingertips!

#### Speech to text models update

articles/ai-services/speech-service/releasenotes.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ author: eric-urban
ms.author: eur
ms.service: azure-ai-speech
ms.topic: release-notes
- ms.date: 1/21/2024
+ ms.date: 4/22/2024
ms.custom: references_regions
---
