articles/cognitive-services/Speech-Service/includes/how-to/speech-to-text-basics/speech-to-text-basics-javascript.md (34 additions, 26 deletions)
@@ -23,7 +23,15 @@ Additionally, depending on the target environment use one of the following:
For more information on `import`, see <a href="https://javascript.info/import-export" target="_blank">export and import <span class="docon docon-navigate-external x-hidden-focus"></span></a>.
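For orientation, here's a minimal sketch of the `import` form this article is describing; the exact set of named exports is an assumption for illustration, not text taken from the diff:

```javascript
// Minimal sketch, assuming an ES module / bundler environment.
// These named exports exist in microsoft-cognitiveservices-speech-sdk.
import {
    AudioConfig,
    SpeechConfig,
    SpeechRecognizer
} from "microsoft-cognitiveservices-speech-sdk";
```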
@@ -39,14 +47,14 @@ For more information on `require`, see <a href="https://nodejs.org/en/knowledge/
# [script](#tab/script)

- Download and extract the <a href="https://aka.ms/csspeech/jsbrowserpackage" target="_blank">JavaScript Speech SDK <span class="docon docon-navigate-external x-hidden-focus"></span></a> *microsoft.cognitiveservices.speech.sdk.bundle.js* file, and place it in a folder accessible to your HTML file.
+ Download and extract the <a href="https://aka.ms/csspeech/jsbrowserpackage" target="_blank">JavaScript Speech SDK <span class="docon docon-navigate-external x-hidden-focus"></span></a> *microsoft.cognitiveservices.speech.bundle.js* file, and place it in a folder accessible to your HTML file.

- > If you're targeting a web browser, and using the `<script>` tag; the `sdk` prefix is not needed. The `sdk` prefix is an alias we use to name our `import` or `require` module.
+ > If you're targeting a web browser, and using the `<script>` tag; the `sdk` prefix is not needed. The `sdk` prefix is an alias used to name the `require` module.

---
@@ -77,7 +85,7 @@ After you've created a [`SpeechConfig`](https://docs.microsoft.com/javascript/ap
If you're recognizing speech using your device's default microphone, here's what the [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest) should look like:
If you want to specify the audio input device, then you'll need to create an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest) and provide the `audioConfig` parameter when initializing your [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest).
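As a rough illustration of the two cases described above, a hedged sketch follows; the subscription key, region, and device ID are placeholders, not values from the diff:

```javascript
// Sketch only: replace the placeholder key, region, and device ID with your own.
const sdk = require("microsoft-cognitiveservices-speech-sdk");
const speechConfig = sdk.SpeechConfig.fromSubscription("<your-key>", "<your-region>");

// Recognize from the device's default microphone.
const fromDefaultMic = new sdk.SpeechRecognizer(
    speechConfig,
    sdk.AudioConfig.fromDefaultMicrophoneInput());

// Recognize from a specific input device, identified by its device ID.
const fromChosenDevice = new sdk.SpeechRecognizer(
    speechConfig,
    sdk.AudioConfig.fromMicrophoneInput("<device-id>"));
```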
@@ -88,15 +96,15 @@ If you want to specify the audio input device, then you'll need to create an [`A
If you want to provide an audio file instead of using a microphone, you'll still need to provide an `audioConfig`. However, this can only be done when targeting **Node.js**: when you create an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest), instead of calling `fromDefaultMicrophoneInput`, you'll call `fromWavFileInput` and pass the `filename` parameter.
console.log("CANCELED: Did you update the subscription info?");
@@ -155,7 +163,7 @@ Continuous recognition is a bit more involved than single-shot recognition. It r
Let's start by defining the input and initializing a [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest):
We'll subscribe to the events sent from the [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest).
@@ -166,32 +174,32 @@ We'll subscribe to the events sent from the [`SpeechRecognizer`](https://docs.mi
* [`canceled`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#canceled): Signal for events containing canceled recognition results (indicating a recognition attempt that was canceled as a result of a direct cancellation request or, alternatively, a transport or protocol failure).
console.log("CANCELED: Did you update the subscription info?");
189
197
}
190
198
191
199
recognizer.stopContinuousRecognitionAsync();
192
200
};
193
201
194
-
recognizer.SessionStopped= (s, e) => {
202
+
recognizer.sessionStopped= (s, e) => {
195
203
console.log("\n Session stopped event.");
196
204
recognizer.stopContinuousRecognitionAsync();
197
205
};
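Putting the pieces of this hunk together, a hedged end-to-end sketch of continuous recognition could look like the following; the key, region, and log messages are placeholders rather than text from the diff:

```javascript
const sdk = require("microsoft-cognitiveservices-speech-sdk");

const speechConfig = sdk.SpeechConfig.fromSubscription("<your-key>", "<your-region>");
const recognizer = new sdk.SpeechRecognizer(
    speechConfig,
    sdk.AudioConfig.fromDefaultMicrophoneInput());

// Fires with intermediate hypotheses while audio is still streaming.
recognizer.recognizing = (s, e) => {
    console.log(`RECOGNIZING: Text=${e.result.text}`);
};

// Fires with a final result for each recognized utterance.
recognizer.recognized = (s, e) => {
    if (e.result.reason === sdk.ResultReason.RecognizedSpeech) {
        console.log(`RECOGNIZED: Text=${e.result.text}`);
    }
};

// Fires when recognition is canceled, either on request or because of an error.
recognizer.canceled = (s, e) => {
    console.log(`CANCELED: Reason=${e.reason}`);
    if (e.reason === sdk.CancellationReason.Error) {
        console.log(`CANCELED: ErrorDetails=${e.errorDetails}`);
        console.log("CANCELED: Did you update the subscription info?");
    }
    recognizer.stopContinuousRecognitionAsync();
};

// Fires when the recognition session ends.
recognizer.sessionStopped = (s, e) => {
    console.log("\n Session stopped event.");
    recognizer.stopContinuousRecognitionAsync();
};

// Start recognition; stop later with stopContinuousRecognitionAsync().
recognizer.startContinuousRecognitionAsync();
```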
@@ -229,7 +237,7 @@ The [`speechRecognitionLanguage`](https://docs.microsoft.com/javascript/api/micr
## Improve recognition accuracy
- There are a few ways to improve recognition accuracy with the Speech SDK. Let's take a look at Phrase Lists. Phrase Lists are used to identify known phrases in audio data, like a person's name or a specific location. Single words or complete phrases can be added to a Phrase List. During recognition, an entry in a phrase list is used if an exact match for the entire phrase is included in the audio. If an exact match to the phrase is not found, recognition is not assisted.
+ There are a few ways to improve recognition accuracy with the Speech SDK. Let's take a look at Phrase Lists. Phrase Lists are used to identify known phrases in audio data, like a person's name or a specific location. Single words or complete phrases can be added to a Phrase List. During recognition, an entry in a phrase list is used if an exact match for the entire phrase is included in the audio. If an exact match to the phrase is not found, recognition is not assisted.

> [!IMPORTANT]
> The Phrase List feature is only available in English.
@@ -239,7 +247,7 @@ To use a phrase list, first create a [`PhraseListGrammar`](https://docs.microsof
Any changes to [`PhraseListGrammar`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/phraselistgrammar?view=azure-node-latest) take effect on the next recognition or after a reconnection to the Speech service.
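For illustration, a hedged sketch of adding a phrase; the `sdk` module, the existing `recognizer` instance, and the example phrase are assumptions:

```javascript
// Assumes `sdk` is the required Speech SDK module and `recognizer` is an
// existing SpeechRecognizer instance.
const phraseList = sdk.PhraseListGrammar.fromRecognizer(recognizer);

// Add a single word or a complete phrase you expect to hear in the audio.
phraseList.addPhrase("Supercalifragilisticexpialidocious");

// Remove all phrases when they're no longer needed.
// phraseList.clear();
```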
articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-javascript.md (63 additions, 20 deletions)
@@ -22,10 +22,12 @@ Additionally, depending on the target environment use one of the following:
# [import](#tab/import)

```javascript
+ import { readFileSync } from "fs";
  import {
      AudioConfig,
      SpeechConfig,
-     SpeechSynthesizer
+     SpeechSynthesisOutputFormat,
+     SpeechSynthesizer
  } from "microsoft-cognitiveservices-speech-sdk";
```
@@ -34,6 +36,7 @@ For more information on `import`, see <a href="https://javascript.info/import-ex
Next, you create a [`SpeechSynthesizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechsynthesizer?view=azure-node-latest) object, which executes text-to-speech conversions and outputs to speakers, files, or other output streams. The [`SpeechSynthesizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechsynthesizer?view=azure-node-latest) accepts as params the [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest) object created in the previous step, and an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest) object that specifies how output results should be handled.
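As a hedged sketch of that constructor call (the key, region, and output file path are placeholders, not values from the diff):

```javascript
import { AudioConfig, SpeechConfig, SpeechSynthesizer } from "microsoft-cognitiveservices-speech-sdk";

// Placeholder key, region, and output path: substitute your own values.
const speechConfig = SpeechConfig.fromSubscription("<your-key>", "<your-region>");
const audioConfig = AudioConfig.fromAudioFileOutput("path/to/output.wav");
const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
```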
@@ -86,19 +88,30 @@ To start, create an `AudioConfig` to automatically write the output to a `.wav`
- Next, instantiate a `SpeechSynthesizer` passing your `speechConfig` object and the `audioConfig` object as params. Then, executing speech synthesis and writing to a file is as simple as running `speakTextAsync()` with a string of text.
+ Next, instantiate a `SpeechSynthesizer` passing your `speechConfig` object and the `audioConfig` object as params. Then, executing speech synthesis and writing to a file is as simple as running `speakTextAsync()` with a string of text. The result callback is a great place to call `synthesizer.close()`; in fact, this call is needed in order for synthesis to function correctly.
synthesizer.speakTextAsync("Synthesizing directly to speaker output.");
130
+
synthesizer.speakTextAsync(
131
+
"Synthesizing directly to speaker output.",
132
+
result=> {
133
+
if (result) {
134
+
console.log(JSON.stringify(result));
135
+
}
136
+
synthesizer.close();
137
+
},
138
+
error=> {
139
+
console.log(error);
140
+
synthesizer.close();
141
+
});
118
142
}
119
143
```
120
144
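Combining the prose above with the callback shape added in this hunk, here is a hedged sketch of writing synthesis output to a `.wav` file; the file path and spoken text are placeholders:

```javascript
import { AudioConfig, SpeechConfig, SpeechSynthesizer } from "microsoft-cognitiveservices-speech-sdk";

function synthesizeToFile() {
    // Placeholder key, region, and output path.
    const speechConfig = SpeechConfig.fromSubscription("<your-key>", "<your-region>");
    const audioConfig = AudioConfig.fromAudioFileOutput("path/to/file.wav");
    const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);

    synthesizer.speakTextAsync(
        "A simple test to write to a file.",
        result => {
            if (result) {
                console.log(JSON.stringify(result));
            }
            // Closing the synthesizer releases resources and finishes the output file.
            synthesizer.close();
        },
        error => {
            console.log(error);
            synthesizer.close();
        });
}

synthesizeToFile();
```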
@@ -126,26 +150,31 @@ For many scenarios in speech application development, you likely need the result
* Integrate the result with other APIs or services.
* Modify the audio data, write custom `.wav` headers, etc.

- It's simple to make this change from the previous example. First, remove the `AudioConfig` block, as you will manage the output behavior manually from this point onward for increased control. Then pass `null` for the `AudioConfig` in the `SpeechSynthesizer` constructor.
+ It's simple to make this change from the previous example. First, remove the `AudioConfig` block, as you will manage the output behavior manually from this point onward for increased control. Then pass `undefined` for the `AudioConfig` in the `SpeechSynthesizer` constructor.

> [!NOTE]
- > Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example above, will not play the audio by default on the current active output device.
+ > Passing `undefined` for the `AudioConfig`, rather than omitting it like in the speaker output example above, will not play the audio by default on the current active output device.
This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechsynthesisresult?view=azure-node-latest) variable. The `SpeechSynthesisResult.audioData` property returns an `ArrayBuffer` of the output data. You can work with this `ArrayBuffer` manually.
              console.log(`Audio data byte size: ${audioData.byteLength}.`)
+
+             synthesizer.close();
          },
-         error => console.log(error));
+         error => {
+             console.log(error);
+             synthesizer.close();
+         });
  }
  ```
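For reference, a hedged sketch of the in-memory variant, passing `undefined` for the `AudioConfig` and reading `result.audioData`; the key, region, and text are placeholders:

```javascript
import { SpeechConfig, SpeechSynthesizer } from "microsoft-cognitiveservices-speech-sdk";

const speechConfig = SpeechConfig.fromSubscription("<your-key>", "<your-region>");
// No AudioConfig: the result is kept in memory instead of being played or written.
const synthesizer = new SpeechSynthesizer(speechConfig, undefined);

synthesizer.speakTextAsync(
    "Getting the response as an in-memory stream.",
    result => {
        // audioData is an ArrayBuffer you can post-process, stream, or save yourself.
        const audioData = result.audioData;
        console.log(`Audio data byte size: ${audioData.byteLength}.`);
        synthesizer.close();
    },
    error => {
        console.log(error);
        synthesizer.close();
    });
```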
@@ -209,21 +244,29 @@ function xmlToString(filePath) {
  }
  ```

- From here, the result object is exactly the same as previous examples.
+ For more information on `readFileSync`, see <a href="https://nodejs.org/api/fs.html#fs_fs_readlinksync_path_options" target="_blank">Node.js file system <span class="docon docon-navigate-external x-hidden-focus"></span></a>. From here, the result object is exactly the same as previous examples.
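To round out the `readFileSync` reference, a hedged sketch of reading SSML from disk and passing it to `speakSsmlAsync`; the SSML file name is illustrative, and `xmlToString` mirrors the helper named in the hunk header above:

```javascript
import { readFileSync } from "fs";
import { SpeechConfig, SpeechSynthesizer } from "microsoft-cognitiveservices-speech-sdk";

// Illustrative helper: load an SSML document from disk as a string.
function xmlToString(filePath) {
    return readFileSync(filePath, "utf8");
}

const speechConfig = SpeechConfig.fromSubscription("<your-key>", "<your-region>");
const synthesizer = new SpeechSynthesizer(speechConfig, undefined);

synthesizer.speakSsmlAsync(
    xmlToString("ssml.xml"),
    result => {
        console.log(`Audio data byte size: ${result.audioData.byteLength}.`);
        synthesizer.close();
    },
    error => {
        console.log(error);
        synthesizer.close();
    });
```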