
Commit 7b78e69

Sweeping updates
1 parent 3bb6902 commit 7b78e69

File tree: 2 files changed (+97, -46 lines changed)


articles/cognitive-services/Speech-Service/includes/how-to/speech-to-text-basics/speech-to-text-basics-javascript.md

Lines changed: 34 additions & 26 deletions
@@ -23,7 +23,15 @@ Additionally, depending on the target environment use one of the following:
# [import](#tab/import)

```javascript
-import * as sdk from "microsoft-cognitiveservices-speech-sdk";
+import {
+    AudioConfig,
+    CancellationDetails,
+    CancellationReason,
+    PhraseListGrammar,
+    ResultReason,
+    SpeechConfig,
+    SpeechRecognizer
+} from "microsoft-cognitiveservices-speech-sdk";
```

For more information on `import`, see <a href="https://javascript.info/import-export" target="_blank">export and import <span class="docon docon-navigate-external x-hidden-focus"></span></a>.
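For reference, a minimal sketch of how these named imports are typically used together for single-shot recognition; the key, region, and callback body are placeholders, not text from this commit:

```javascript
import { ResultReason, SpeechConfig, SpeechRecognizer } from "microsoft-cognitiveservices-speech-sdk";

// Placeholder credentials; substitute your own subscription key and region.
const speechConfig = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");

// No AudioConfig passed, so the default microphone is used.
const recognizer = new SpeechRecognizer(speechConfig);

recognizer.recognizeOnceAsync(result => {
    if (result.reason === ResultReason.RecognizedSpeech) {
        console.log(`RECOGNIZED: Text=${result.text}`);
    }
    recognizer.close();
});
```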
@@ -39,14 +47,14 @@ For more information on `require`, see <a href="https://nodejs.org/en/knowledge/

# [script](#tab/script)

-Download and extract the <a href="https://aka.ms/csspeech/jsbrowserpackage" target="_blank">JavaScript Speech SDK <span class="docon docon-navigate-external x-hidden-focus"></span></a> *microsoft.cognitiveservices.speech.sdk.bundle.js* file, and place it in a folder accessible to your HTML file.
+Download and extract the <a href="https://aka.ms/csspeech/jsbrowserpackage" target="_blank">JavaScript Speech SDK <span class="docon docon-navigate-external x-hidden-focus"></span></a> *microsoft.cognitiveservices.speech.bundle.js* file, and place it in a folder accessible to your HTML file.

```html
-<script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>;
+<script src="microsoft.cognitiveservices.speech.bundle.js"></script>;
```

> [!TIP]
-> If you're targeting a web browser, and using the `<script>` tag; the `sdk` prefix is not needed. The `sdk` prefix is an alias we use to name our `import` or `require` module.
+> If you're targeting a web browser and using the `<script>` tag, the `sdk` prefix is not needed. The `sdk` prefix is an alias used to name the `require` module.

---

@@ -77,7 +85,7 @@ After you've created a [`SpeechConfig`](https://docs.microsoft.com/javascript/ap
If you're recognizing speech using your device's default microphone, here's what the [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest) should look like:

```javascript
-const recognizer = new sdk.SpeechRecognizer(speechConfig);
+const recognizer = new SpeechRecognizer(speechConfig);
```

If you want to specify the audio input device, then you'll need to create an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest) and provide the `audioConfig` parameter when initializing your [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest).
@@ -88,15 +96,15 @@ If you want to specify the audio input device, then you'll need to create an [`A
Reference the `AudioConfig` object as follows:

```javascript
-const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
-const speechConfig = sdk.SpeechConfig.fromSubscription(speechConfig, audioConfig);
+const audioConfig = AudioConfig.fromDefaultMicrophoneInput();
+const speechConfig = SpeechConfig.fromSubscription(speechConfig, audioConfig);
```

If you want to provide an audio file instead of using a microphone, you'll still need to provide an `audioConfig`. However, this can only be done when targeting **Node.js** and when you create an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest), instead of calling `fromDefaultMicrophoneInput`, you'll call `fromWavFileOutput` and pass the `filename` parameter.

```javascript
-const audioConfig = sdk.AudioConfig.fromWavFileInput("YourAudioFile.wav");
-const speechConfig = sdk.SpeechConfig.fromSubscription(speechConfig, audioConfig);
+const audioConfig = AudioConfig.fromWavFileInput("YourAudioFile.wav");
+const speechConfig = SpeechConfig.fromSubscription(speechConfig, audioConfig);
```

## Recognize speech
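For reference, a minimal sketch of wiring an `AudioConfig` into the recognizer using the named imports above; the constructor call shown here is an assumption based on the `SpeechRecognizer` API rather than a line from this commit:

```javascript
import { AudioConfig, SpeechConfig, SpeechRecognizer } from "microsoft-cognitiveservices-speech-sdk";

const speechConfig = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");

// Take input from the default microphone (use fromWavFileInput for a file under Node.js).
const audioConfig = AudioConfig.fromDefaultMicrophoneInput();

// The audio configuration is passed as the second constructor parameter.
const recognizer = new SpeechRecognizer(speechConfig, audioConfig);
```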
@@ -127,18 +135,18 @@ You'll need to write some code to handle the result. This sample evaluates the [

```javascript
switch (result.reason) {
-    case sdk.ResultReason.RecognizedSpeech:
+    case ResultReason.RecognizedSpeech:
        console.log(`RECOGNIZED: Text=${result.text}`);
        console.log(" Intent not recognized.");
        break;
-    case sdk.ResultReason.NoMatch:
+    case ResultReason.NoMatch:
        console.log("NOMATCH: Speech could not be recognized.");
        break;
-    case sdk.ResultReason.Canceled:
-        const cancellation = sdk.CancellationDetails.fromResult(result);
+    case ResultReason.Canceled:
+        const cancellation = CancellationDetails.fromResult(result);
        console.log(`CANCELED: Reason=${cancellation.reason}`);

-        if (cancellation.reason == sdk.CancellationReason.Error) {
+        if (cancellation.reason == CancellationReason.Error) {
            console.log(`CANCELED: ErrorCode=${cancellation.ErrorCode}`);
            console.log(`CANCELED: ErrorDetails=${cancellation.errorDetails}`);
            console.log("CANCELED: Did you update the subscription info?");
@@ -155,7 +163,7 @@ Continuous recognition is a bit more involved than single-shot recognition. It r
Let's start by defining the input and initializing a [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest):

```javascript
-const recognizer = new sdk.SpeechRecognizer(speechConfig);
+const recognizer = new SpeechRecognizer(speechConfig);
```

We'll subscribe to the events sent from the [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest).
@@ -166,32 +174,32 @@ We'll subscribe to the events sent from the [`SpeechRecognizer`](https://docs.mi
* [`canceled`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#canceled): Signal for events containing canceled recognition results (indicating a recognition attempt that was canceled as a result or a direct cancellation request or, alternatively, a transport or protocol failure).

```javascript
-recognizer.Recognizing = (s, e) => {
-    console.log(`RECOGNIZING: Text=${e.Result.text}`);
+recognizer.recognizing = (s, e) => {
+    console.log(`RECOGNIZING: Text=${e.result.text}`);
};

recognizer.recognized = (s, e) => {
-    if (e.Result.reason == sdk.ResultReason.RecognizedSpeech) {
-        console.log(`RECOGNIZED: Text=${e.Result.Text}`);
+    if (e.result.reason == ResultReason.RecognizedSpeech) {
+        console.log(`RECOGNIZED: Text=${e.result.text}`);
    }
-    else if (e.Result.reason == sdk.ResultReason.NoMatch) {
+    else if (e.result.reason == ResultReason.NoMatch) {
        console.log("NOMATCH: Speech could not be recognized.");
    }
};

-recognizer.Canceled = (s, e) => {
+recognizer.canceled = (s, e) => {
    console.log(`CANCELED: Reason=${e.reason}`);

-    if (e.reason == sdk.CancellationReason.Error) {
+    if (e.reason == CancellationReason.Error) {
        console.log(`"CANCELED: ErrorCode=${e.errorCode}`);
-        console.log(`"CANCELED: ErrorDetails=${e.ErrorDetails}`);
+        console.log(`"CANCELED: ErrorDetails=${e.errorDetails}`);
        console.log("CANCELED: Did you update the subscription info?");
    }

    recognizer.stopContinuousRecognitionAsync();
};

-recognizer.SessionStopped = (s, e) => {
+recognizer.sessionStopped = (s, e) => {
    console.log("\n Session stopped event.");
    recognizer.stopContinuousRecognitionAsync();
};
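Once the handlers are attached, continuous recognition is typically started and stopped with the asynchronous calls the handlers above already reference; a brief sketch:

```javascript
// Start listening; recognition continues until stopContinuousRecognitionAsync()
// is called (as the canceled and sessionStopped handlers above do).
recognizer.startContinuousRecognitionAsync();

// Later, for example on a UI event or timer, stop and release the recognizer.
// recognizer.stopContinuousRecognitionAsync(() => recognizer.close());
```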
@@ -229,7 +237,7 @@ The [`speechRecognitionLanguage`](https://docs.microsoft.com/javascript/api/micr

## Improve recognition accuracy

-There are a few ways to improve recognition accuracy with the Speech SDK. Let's take a look at Phrase Lists. Phrase Lists are used to identify known phrases in audio data, like a person's name or a specific location. Single words or complete phrases can be added to a Phrase List. During recognition, an entry in a phrase list is used if an exact match for the entire phrase is included in the audio. If an exact match to the phrase is not found, recognition is not assisted.
+There are a few ways to improve recognition accuracy with the Speech SDK. Let's take a look at Phrase Lists. Phrase Lists are used to identify known phrases in audio data, like a person's name or a specific location. Single words or complete phrases can be added to a Phrase List. During recognition, an entry in a phrase list is used if an exact match for the entire phrase is included in the audio. If an exact match to the phrase is not found, recognition is not assisted.

> [!IMPORTANT]
> The Phrase List feature is only available in English.
@@ -239,7 +247,7 @@ To use a phrase list, first create a [`PhraseListGrammar`](https://docs.microsof
Any changes to [`PhraseListGrammar`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/phraselistgrammar?view=azure-node-latest) take effect on the next recognition or after a reconnection to the Speech service.

```javascript
-const phraseList = sdk.PhraseListGrammar.fromRecognizer(recognizer);
+const phraseList = PhraseListGrammar.fromRecognizer(recognizer);
phraseList.addPhrase("Supercalifragilisticexpialidocious");
```
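As a hedged aside, `PhraseListGrammar` also exposes bulk and reset operations that don't appear in this diff; a sketch assuming the `addPhrases` and `clear` methods:

```javascript
const phraseList = PhraseListGrammar.fromRecognizer(recognizer);

// Add several known phrases at once (sketch; addPhrases takes an array of strings).
phraseList.addPhrases([
    "Supercalifragilisticexpialidocious",
    "Contoso"
]);

// Remove all phrases when they no longer apply to upcoming recognitions.
phraseList.clear();
```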

articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-javascript.md

Lines changed: 63 additions & 20 deletions
@@ -22,10 +22,12 @@ Additionally, depending on the target environment use one of the following:
# [import](#tab/import)

```javascript
+import { readFileSync } from "fs";
import {
    AudioConfig,
    SpeechConfig,
-    SpeechSynthesizer
+    SpeechSynthesisOutputFormat,
+    SpeechSynthesizer
} from "microsoft-cognitiveservices-speech-sdk";
```

@@ -34,6 +36,7 @@ For more information on `import`, see <a href="https://javascript.info/import-ex
# [require](#tab/require)

```javascript
+const readFileSync = require("fs").readFileSync;
const sdk = require("microsoft-cognitiveservices-speech-sdk");
```

@@ -72,11 +75,10 @@ In this example, you create a [`SpeechConfig`](https://docs.microsoft.com/javasc
```javascript
function synthesizeSpeech() {
    const speechConfig = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
-    const synthesizer = new SpeechSynthesizer(speechConfig);
}
```

-## Synthesize speech from a file
+## Synthesize speech to a file

Next, you create a [`SpeechSynthesizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechsynthesizer?view=azure-node-latest) object, which executes text-to-speech conversions and outputs to speakers, files, or other output streams. The [`SpeechSynthesizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechsynthesizer?view=azure-node-latest) accepts as params the [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest) object created in the previous step, and an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest) object that specifies how output results should be handled.

@@ -86,19 +88,30 @@ To start, create an `AudioConfig` to automatically write the output to a `.wav`
function synthesizeSpeech() {
    const speechConfig = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
    const audioConfig = AudioConfig.fromAudioFileOutput("path/to/file.wav");
-    const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
}
```

-Next, instantiate a `SpeechSynthesizer` passing your `speechConfig` object and the `audioConfig` object as params. Then, executing speech synthesis and writing to a file is as simple as running `speakTextAsync()` with a string of text.
+Next, instantiate a `SpeechSynthesizer`, passing your `speechConfig` object and the `audioConfig` object as params. Then, executing speech synthesis and writing to a file is as simple as running `speakTextAsync()` with a string of text. The result callback is a great place to call `synthesizer.close()`; in fact, this call is needed in order for synthesis to function correctly.

```javascript
function synthesizeSpeech() {
    const speechConfig = sdk.SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
-    const audioConfig = AudioConfig.fromAudioFileOutput("path/to/file.wav");
+    const audioConfig = AudioConfig.fromAudioFileOutput("path-to-file.wav");

    const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
-    synthesizer.speakTextAsync("A simple test to write to a file.");
+    synthesizer.speakTextAsync(
+        "A simple test to write to a file.",
+        result => {
+            if (result) {
+                console.log(JSON.stringify(result));
+            }
+            synthesizer.close();
+        },
+        error => {
+            console.log(error);
+            synthesizer.close();
+        });
}
```

@@ -114,7 +127,18 @@ function synthesizeSpeech() {
    const audioConfig = AudioConfig.fromDefaultSpeakerOutput();

    const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
-    synthesizer.speakTextAsync("Synthesizing directly to speaker output.");
+    synthesizer.speakTextAsync(
+        "Synthesizing directly to speaker output.",
+        result => {
+            if (result) {
+                console.log(JSON.stringify(result));
+            }
+            synthesizer.close();
+        },
+        error => {
+            console.log(error);
+            synthesizer.close();
+        });
}
```

@@ -126,26 +150,31 @@ For many scenarios in speech application development, you likely need the result
* Integrate the result with other API's or services.
* Modify the audio data, write custom `.wav` headers, etc.

-It's simple to make this change from the previous example. First, remove the `AudioConfig` block, as you will manage the output behavior manually from this point onward for increased control. Then pass `null` for the `AudioConfig` in the `SpeechSynthesizer` constructor.
+It's simple to make this change from the previous example. First, remove the `AudioConfig` block, as you will manage the output behavior manually from this point onward for increased control. Then pass `undefined` for the `AudioConfig` in the `SpeechSynthesizer` constructor.

> [!NOTE]
-> Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example above, will not play the audio by default on the current active output device.
+> Passing `undefined` for the `AudioConfig`, rather than omitting it like in the speaker output example above, will not play the audio by default on the current active output device.

This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechsynthesisresult?view=azure-node-latest) variable. The `SpeechSynthesisResult.audioData` property returns an `ArrayBuffer` of the output data. You can work with this `ArrayBuffer` manually.

```javascript
function synthesizeSpeech() {
    const speechConfig = sdk.SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
-    const synthesizer = new sdk.SpeechSynthesizer(speechConfig, null);
+    const synthesizer = new sdk.SpeechSynthesizer(speechConfig);

    synthesizer.speakTextAsync(
        "Getting the response as an in-memory stream.",
        result => {
            // Interact with the audio ArrayBuffer data
            const audioData = result.audioData;
            console.log(`Audio data byte size: ${audioData.byteLength}.`)
+
+            synthesizer.close();
        },
-        error => console.log(error));
+        error => {
+            console.log(error);
+            synthesizer.close();
+        });
}
```
@@ -172,14 +201,20 @@ function synthesizeSpeech() {
    // Set the output format
    speechConfig.speechSynthesisOutputFormat = SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm;

-    const synthesizer = new sdk.SpeechSynthesizer(speechConfig, null);
+    const synthesizer = new sdk.SpeechSynthesizer(speechConfig, undefined);
    synthesizer.speakTextAsync(
        "Customizing audio output format.",
        result => {
            // Interact with the audio ArrayBuffer data
            const audioData = result.audioData;
+            console.log(`Audio data byte size: ${audioData.byteLength}.`)
+
+            synthesizer.close();
        },
-        error => console.log(error));
+        error => {
+            console.log(error);
+            synthesizer.close();
+        });
}
```
@@ -209,21 +244,29 @@ function xmlToString(filePath) {
}
```

-From here, the result object is exactly the same as previous examples.
+For more information on `readFileSync`, see <a href="https://nodejs.org/api/fs.html#fs_fs_readlinksync_path_options" target="_blank">Node.js file system<span class="docon docon-navigate-external x-hidden-focus"></span></a>. From here, the result object is exactly the same as previous examples.

```javascript
function synthesizeSpeech() {
    const speechConfig = sdk.SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
-    const synthesizer = new sdk.SpeechSynthesizer(speechConfig, null);
+    const synthesizer = new sdk.SpeechSynthesizer(speechConfig, undefined);

-    const xml = xmlToString("ssml.xml");
+    const ssml = xmlToString("ssml.xml");
    synthesizer.speakSsmlAsync(
        ssml,
        result => {
-            // Interact with the audio ArrayBuffer data
-            const audioData = result.audioData;
+            if (result.errorDetails) {
+                console.error(result.errorDetails);
+            } else {
+                console.log(JSON.stringify(result));
+            }
+
+            synthesizer.close();
        },
-        error => console.log(error));
+        error => {
+            console.log(error);
+            synthesizer.close();
+        });
}
```