Commit 572e9ab

Merge pull request #186548 from ShawnJackson/get-started-speech-translation
edit pass: get-started-speech-translation
2 parents 1518fec + f0be909 commit 572e9ab

File tree

7 files changed: +197 −173 lines


articles/cognitive-services/Speech-Service/get-started-speech-translation.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -1,7 +1,7 @@
 ---
 title: Speech translation quickstart - Speech service
 titleSuffix: Azure Cognitive Services
-description: Learn how to use the Speech SDK to translate speech. In this quickstart, you learn about object construction, supported audio input formats, and configuration options for speech translation.
+description: Learn how to use the Speech SDK to translate speech, including object construction, supported audio input formats, and configuration options.
 services: cognitive-services
 author: eric-urban
 manager: nitinme
@@ -44,5 +44,5 @@ keywords: speech translation
 
 ## Next steps
 
-* [Use codec compressed audio formats](how-to-use-codec-compressed-audio-input-streams.md)
+* Use [codec-compressed audio formats](how-to-use-codec-compressed-audio-input-streams.md).
 
 * See the [quickstart samples](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/quickstart) on GitHub.
````

articles/cognitive-services/Speech-Service/includes/how-to/speech-translation-basics/speech-translation-basics-cli.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -6,17 +6,17 @@ ms.date: 04/13/2020
 ms.author: eric-urban
 ---
 
-One of the core features of the Speech service is the ability to recognize human speech and translate it to other languages. In this quickstart you learn how to use the Speech SDK in your apps and products to perform high-quality speech translation. This quickstart translates speech from the microphone into text in another language.
+One of the core features of the Speech service is the ability to recognize human speech and translate it to other languages. In this quickstart, you learn how to use the Speech SDK in your apps and products to perform high-quality speech translation. This quickstart translates speech from the microphone to text in another language.
 
 ## Prerequisites
 
-This article assumes that you have an Azure account and Speech service subscription. If you don't have an account and subscription, [try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
+This article assumes that you have an Azure account and a Speech service subscription. If you don't have an account and a subscription, [try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
 
 [!INCLUDE [SPX Setup](../../spx-setup.md)]
 
-## Set source and target language
+## Set source and target languages
 
-This command calls Speech CLI to translate speech from the microphone from Italian to French.
+This command calls the Speech CLI to translate speech from the microphone from Italian to French:
 
 ```shell
 spx translate --microphone --source it-IT --target fr
````

articles/cognitive-services/Speech-Service/includes/how-to/speech-translation-basics/speech-translation-basics-cpp.md

Lines changed: 39 additions & 33 deletions
Large diffs are not rendered by default.

articles/cognitive-services/Speech-Service/includes/how-to/speech-translation-basics/speech-translation-basics-csharp.md

Lines changed: 39 additions & 33 deletions
Large diffs are not rendered by default.

articles/cognitive-services/Speech-Service/includes/how-to/speech-translation-basics/speech-translation-basics-java.md

Lines changed: 33 additions & 27 deletions
````diff
@@ -7,9 +7,9 @@ ms.custom: devx-track-java
 ms.author: eur
 ---
 
-One of the core features of the Speech service is the ability to recognize human speech and translate it to other languages. In this quickstart you learn how to use the Speech SDK in your apps and products to perform high-quality speech translation. This quickstart covers topics including:
+One of the core features of the Speech service is the ability to recognize human speech and translate it to other languages. In this quickstart, you learn how to use the Speech SDK in your apps and products to perform high-quality speech translation. This quickstart covers these topics:
 
-* Translating speech-to-text
+* Translating speech to text
 * Translating speech to multiple target languages
 * Performing direct speech-to-speech translation
 
@@ -19,15 +19,15 @@ If you want to skip straight to sample code, see the [Java quickstart samples](h
 
 ## Prerequisites
 
-This article assumes that you have an Azure account and Speech service subscription. If you don't have an account and subscription, [try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
+This article assumes that you have an Azure account and a Speech service subscription. If you don't have an account and a subscription, [try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
 
-## Install the Speech SDK
+### Install the Speech SDK
 
-Before you can do anything, you'll need to install the Speech SDK. Depending on your platform, follow the instructions under the <a href="/azure/cognitive-services/speech-service/speech-sdk#get-the-speech-sdk" target="_blank">Get the Speech SDK </a> section of the _About the Speech SDK_ article.
+Before you can do anything, you need to install the Speech SDK. Depending on your platform, follow the instructions in the <a href="/azure/cognitive-services/speech-service/speech-sdk#get-the-speech-sdk" target="_blank">Get the Speech SDK </a> section of the _About the Speech SDK_ article.
 
-## Import dependencies
+### Import dependencies
 
-To run the examples in this article, include the following `import` statements at the top of the **.Java* code file.
+To run the examples in this article, include the following `import` statements at the top of the **.Java* code file:
 
 ```java
 package speech;
@@ -42,7 +42,7 @@ import com.microsoft.cognitiveservices.speech.translation.*;
 
 ## Sensitive data and environment variables
 
-The example source code in this article depends on environment variables for storing sensitive data, such as the Speech resource subscription key and region. The Java code file contains two `static final String` values that are assigned from the host machines environment variables, namely `SPEECH__SUBSCRIPTION__KEY` and `SPEECH__SERVICE__REGION`. Both of these fields are at the class scope, making them accessible within method bodies of the class. For more information on environment variables, see [environment variables and application configuration](../../../../cognitive-services-security.md#environment-variables-and-application-configuration).
+The example source code in this article depends on environment variables for storing sensitive data, such as the Speech resource's subscription key and region. The Java code file contains two `static final String` values that are assigned from the host machine's environment variables: `SPEECH__SUBSCRIPTION__KEY` and `SPEECH__SERVICE__REGION`. Both of these fields are at the class scope, so they're accessible within method bodies of the class:
 
 ```java
 public class App {
````
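As an aside on the pattern this change documents, reading class-scope `static final String` fields from environment variables can be sketched in stdlib-only Java. The fallback literals below are hypothetical placeholders, not part of the committed sample:

```java
import java.util.Optional;

public class App {
    // Class-scope fields read once from the host machine's environment variables.
    // The "<paste-your-key>" and "westus" fallbacks are hypothetical placeholders.
    static final String SPEECH__SUBSCRIPTION__KEY =
            Optional.ofNullable(System.getenv("SPEECH__SUBSCRIPTION__KEY"))
                    .orElse("<paste-your-key>");
    static final String SPEECH__SERVICE__REGION =
            Optional.ofNullable(System.getenv("SPEECH__SERVICE__REGION"))
                    .orElse("westus");

    public static void main(String[] args) {
        // Both fields are visible in any method body of the class.
        System.out.println("region = " + SPEECH__SERVICE__REGION);
    }
}
```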
````diff
@@ -54,21 +54,23 @@ public class App {
 }
 ```
 
+For more information on environment variables, see [Environment variables and application configuration](../../../../cognitive-services-security.md#environment-variables-and-application-configuration).
+
 ## Create a speech translation configuration
 
-To call the Speech service using the Speech SDK, you need to create a [`SpeechTranslationConfig`][config]. This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
+To call the Speech service by using the Speech SDK, you need to create a [`SpeechTranslationConfig`][config] instance. This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
 
 > [!TIP]
 > Regardless of whether you're performing speech recognition, speech synthesis, translation, or intent recognition, you'll always create a configuration.
 
-There are a few ways that you can initialize a [`SpeechTranslationConfig`][config]:
+You can initialize a `SpeechTranslationConfig` instance in a few ways:
 
 * With a subscription: pass in a key and the associated region.
 * With an endpoint: pass in a Speech service endpoint. A key or authorization token is optional.
 * With a host: pass in a host address. A key or authorization token is optional.
 * With an authorization token: pass in an authorization token and the associated region.
 
-Let's take a look at how a [`SpeechTranslationConfig`][config] is created using a key and region. Get these credentials by following steps in [Try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
+Let's look at how you create a `SpeechTranslationConfig` instance by using a key and region. Get these credentials by following the steps in [Try the Speech service for free](../../../overview.md#try-the-speech-service-for-free).
 
 ```java
 public class App {
@@ -93,9 +95,9 @@ public class App {
 }
 ```
 
-## Change source language
+## Change the source language
 
-One common task of speech translation is specifying the input (or source) language. Let's take a look at how you would change the input language to Italian. In your code, interact with the [`SpeechTranslationConfig`][config] instance, calling the `setSpeechRecognitionLanguage` method.
+One common task of speech translation is specifying the input (or source) language. The following example shows how you would change the input language to Italian. In your code, interact with the `SpeechTranslationConfig` instance by calling the `setSpeechRecognitionLanguage` method:
 
 ```java
 static void translateSpeech() {
@@ -107,11 +109,11 @@ static void translateSpeech() {
 }
 ```
 
-The [`setSpeechRecognitionLanguage`][recognitionlang] function expects a language-locale format string. You can provide any value in the **Locale** column in the list of supported [locales/languages](../../../language-support.md).
+The [`setSpeechRecognitionLanguage`][recognitionlang] function expects a language-locale format string. You can provide any value in the **Locale** column in the [list of supported locales/languages](../../../language-support.md).
 
-## Add translation language
+## Add a translation language
 
-Another common task of speech translation is to specify target translation languages, at least one is required but multiples are supported. The following code snippet sets both French and German as translation language targets.
+Another common task of speech translation is to specify target translation languages. At least one is required, but multiples are supported. The following code snippet sets both French and German as translation language targets:
 
 ```java
 static void translateSpeech() {
@@ -120,7 +122,7 @@ static void translateSpeech() {
 
     translationConfig.setSpeechRecognitionLanguage("it-IT");
 
-    // Translate to languages. See, https://aka.ms/speech/sttt-languages
+    // Translate to languages. See https://aka.ms/speech/sttt-languages
     translationConfig.addTargetLanguage("fr");
     translationConfig.addTargetLanguage("de");
 }
````
````diff
@@ -130,9 +132,9 @@ With every call to [`addTargetLanguage`][addlang], a new target translation lang
 
 ## Initialize a translation recognizer
 
-After you've created a [`SpeechTranslationConfig`][config], the next step is to initialize a [`TranslationRecognizer`][recognizer]. When you initialize a [`TranslationRecognizer`][recognizer], you'll need to pass it your `translationConfig`. The configuration object provides the credentials that the speech service requires to validate your request.
+After you've created a [`SpeechTranslationConfig`][config] instance, the next step is to initialize [`TranslationRecognizer`][recognizer]. When you initialize `TranslationRecognizer`, you need to pass it your `translationConfig` instance. The configuration object provides the credentials that the Speech service requires to validate your request.
 
-If you're recognizing speech using your device's default microphone, here's what the [`TranslationRecognizer`][recognizer] should look like:
+If you're recognizing speech by using your device's default microphone, here's what `TranslationRecognizer` should look like:
 
 ```java
 static void translateSpeech() {
@@ -151,12 +153,12 @@ static void translateSpeech() {
 }
 ```
 
-If you want to specify the audio input device, then you'll need to create an [`AudioConfig`][audioconfig] and provide the `audioConfig` parameter when initializing your [`TranslationRecognizer`][recognizer].
+If you want to specify the audio input device, then you need to create an [`AudioConfig`][audioconfig] class instance and provide the `audioConfig` parameter when initializing `TranslationRecognizer`.
 
 > [!TIP]
 > [Learn how to get the device ID for your audio input device](../../../how-to-select-audio-input-devices.md).
 
-First, you'll reference the `AudioConfig` object as follows:
+First, reference the `AudioConfig` object as follows:
 
 ```java
 static void translateSpeech() {
@@ -177,7 +179,7 @@ static void translateSpeech() {
 }
 ```
 
-If you want to provide an audio file instead of using a microphone, you'll still need to provide an `audioConfig`. However, when you create an [`AudioConfig`][audioconfig], instead of calling `fromDefaultMicrophoneInput`, you'll call `fromWavFileInput` and pass the `filename` parameter.
+If you want to provide an audio file instead of using a microphone, you still need to provide an `audioConfig` parameter. However, when you create an `AudioConfig` class instance, instead of calling `fromDefaultMicrophoneInput`, you call `fromWavFileInput` and pass the `filename` parameter:
 
 ```java
 static void translateSpeech() {
@@ -200,7 +202,7 @@ static void translateSpeech() {
 
 ## Translate speech
 
-To translate speech, the Speech SDK relies on a microphone or an audio file input. Speech recognition occurs before speech translation. After all objects have been initialized, call the recognize-once function and get the result.
+To translate speech, the Speech SDK relies on a microphone or an audio file input. Speech recognition occurs before speech translation. After all objects have been initialized, call the recognize-once function and get the result:
 
 ```java
 static void translateSpeech() throws ExecutionException, InterruptedException {
````
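The recognize-once flow the prose above describes (start an asynchronous recognition, then block on the future for a single result) can be mimicked without the SDK by using `CompletableFuture`; `recognizeOnceAsync` here is a hypothetical stand-in, not the SDK call:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class RecognizeOnceSketch {
    // Hypothetical stand-in for recognizer.recognizeOnceAsync(): the real SDK
    // method also returns a future whose get() blocks until one utterance
    // has been recognized and translated.
    static CompletableFuture<String> recognizeOnceAsync() {
        return CompletableFuture.supplyAsync(() -> "RECOGNIZED: one utterance");
    }

    public static void main(String[] args)
            throws ExecutionException, InterruptedException {
        String result = recognizeOnceAsync().get(); // block for a single result
        System.out.println(result);
    }
}
```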
````diff
@@ -232,14 +234,16 @@ For more information about speech-to-text, see [the basics of speech recognition
 
 ## Synthesize translations
 
-After a successful speech recognition and translation, the result contains all the translations in a dictionary. The [`getTranslations`][translations] function returns a dictionary with the key as the target translation language and the value is the translated text. Recognized speech can be translated, then synthesized in a different language (speech-to-speech).
+After a successful speech recognition and translation, the result contains all the translations in a dictionary. The [`getTranslations`][translations] function returns a dictionary with the key as the target translation language and the value as the translated text. Recognized speech can be translated and then synthesized in a different language (speech-to-speech).
 
 ### Event-based synthesis
 
-The `TranslationRecognizer` object exposes a `synthesizing` event. The event fires several times, and provides a mechanism to retrieve the synthesized audio from the translation recognition result. If you're translating to multiple languages, see [manual synthesis](#manual-synthesis). Specify the synthesis voice by assigning a [`setVoiceName`][voicename] and provide an event handler for the `synthesizing` event, get the audio. The following example saves the translated audio as a *.wav* file.
+The `TranslationRecognizer` object exposes a `synthesizing` event. The event fires several times and provides a mechanism to retrieve the synthesized audio from the translation recognition result. If you're translating to multiple languages, see [Manual synthesis](#manual-synthesis).
+
+Specify the synthesis voice by assigning a [`setVoiceName`][voicename] instance, and provide an event handler for the `synthesizing` event to get the audio. The following example saves the translated audio as a .wav file.
 
 > [!IMPORTANT]
-> The event-based synthesis only works with a single translation, **do not** add multiple target translation languages. Additionally, the [`setVoiceName`][voicename] should be the same language as the target translation language, for example; `"de"` could map to `"de-DE-Hedda"`.
+> The event-based synthesis works only with a single translation. *Do not* add multiple target translation languages. Additionally, the `setVoiceName` value should be the same language as the target translation language. For example, `"de"` could map to `"de-DE-Hedda"`.
 
 ```java
 static void translateSpeech() throws ExecutionException, FileNotFoundException, InterruptedException, IOException {
@@ -286,7 +290,9 @@ static void translateSpeech() throws ExecutionException, FileNotFoundException,
 
 ### Manual synthesis
 
-The [`getTranslations`][translations] function returns a dictionary that can be used to synthesize audio from the translation text. Iterate through each translation, and synthesize the translation. When creating a `SpeechSynthesizer` instance, the `SpeechConfig` object needs to have its [`setSpeechSynthesisVoiceName`][speechsynthesisvoicename] property set to the desired voice. The following example translates to five languages, and each translation is then synthesized to an audio file in the corresponding neural language.
+The [`getTranslations`][translations] function returns a dictionary that you can use to synthesize audio from the translation text. Iterate through each translation and synthesize it. When you're creating a `SpeechSynthesizer` instance, the `SpeechConfig` object needs to have its [`setSpeechSynthesisVoiceName`][speechsynthesisvoicename] property set to the desired voice.
+
+The following example translates to five languages. Each translation is then synthesized to an audio file in the corresponding neural language.
 
 ```java
 static void translateSpeech() throws ExecutionException, InterruptedException {
````
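To make the manual-synthesis loop concrete without the SDK, here is a stdlib-only sketch of iterating a translations dictionary (target language → translated text) and picking a per-language voice name. The map contents and the French voice are illustrative assumptions; only the `"de"` → `"de-DE-Hedda"` pairing comes from the article itself:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ManualSynthesisSketch {
    // Hypothetical stand-in for result.getTranslations():
    // key = target translation language, value = translated text.
    static Map<String, String> translations() {
        Map<String, String> t = new LinkedHashMap<>();
        t.put("fr", "Bonjour");
        t.put("de", "Hallo");
        return t;
    }

    // Language-to-voice lookup. "de" -> "de-DE-Hedda" appears in the article;
    // the French voice name is a hypothetical example.
    static final Map<String, String> VOICES = Map.of(
            "de", "de-DE-Hedda",
            "fr", "fr-FR-DeniseNeural");

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : translations().entrySet()) {
            String voice = VOICES.get(e.getKey());
            // In the real sample, you'd call setSpeechSynthesisVoiceName(voice)
            // on the SpeechConfig and synthesize e.getValue() to an audio file.
            System.out.println(voice + " -> " + e.getValue());
        }
    }
}
```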
