Commit f67e943

Merge pull request #111182 from IEvangelist/basicsWithJS
Attempting to add JS to basics
2 parents 2f77e64 + 010e959 commit f67e943

File tree

4 files changed

+280
-3
lines changed

articles/cognitive-services/Speech-Service/includes/how-to/speech-to-text-basics/speech-to-text-basics-csharp.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
7878

7979
## Recognize speech
8080

81-
The [Recognizer class](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer?view=azure-dotne) for the Speech SDK for C# exposes a few methods that you can use for speech recognition.
81+
The [Recognizer class](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer?view=azure-dotnet) for the Speech SDK for C# exposes a few methods that you can use for speech recognition.
8282

8383
* Single-shot recognition (async) - Performs recognition in a non-blocking (asynchronous) mode. This will recognize a single utterance. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed.
8484
* Continuous recognition (async) - Asynchronously initiates continuous recognition operation. The user registers to events and handles various application state. To stop asynchronous continuous recognition, call [`StopContinuousRecognitionAsync`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer.stopcontinuousrecognitionasync?view=azure-dotnet).

articles/cognitive-services/Speech-Service/includes/how-to/speech-to-text-basics/speech-to-text-basics-javascript.md

Lines changed: 257 additions & 0 deletions
@@ -0,0 +1,257 @@
1+
---
2+
author: IEvangelist
3+
ms.service: cognitive-services
4+
ms.topic: include
5+
ms.date: 04/14/2020
6+
ms.author: dapine
7+
---
8+
9+
## Prerequisites
10+
11+
This article assumes that you have an Azure account and Speech service subscription. If you don't have an account and subscription, [try the Speech service for free](../../../get-started.md).
12+
13+
## Install the Speech SDK
14+
15+
Before you can do anything, you'll need to install the <a href="https://www.npmjs.com/package/microsoft-cognitiveservices-speech-sdk" target="_blank">JavaScript Speech SDK <span class="docon docon-navigate-external x-hidden-focus"></span></a>. Depending on your platform, use the following instructions:
16+
17+
- <a href="https://docs.microsoft.com/azure/cognitive-services/speech-service/speech-sdk?tabs=nodejs#get-the-speech-sdk" target="_blank">Node.js <span
18+
class="docon docon-navigate-external x-hidden-focus"></span></a>
19+
- <a href="https://docs.microsoft.com/azure/cognitive-services/speech-service/speech-sdk?tabs=browser#get-the-speech-sdk" target="_blank">Web Browser <span class="docon docon-navigate-external x-hidden-focus"></span></a>
20+
21+
Additionally, depending on the target environment, use one of the following:
22+
23+
# [import](#tab/import)
24+
25+
```javascript
26+
import * as sdk from "microsoft-cognitiveservices-speech-sdk";
27+
```
28+
29+
For more information on `import`, see <a href="https://javascript.info/import-export" target="_blank">export and import <span class="docon docon-navigate-external x-hidden-focus"></span></a>.
30+
31+
# [require](#tab/require)
32+
33+
```javascript
34+
const sdk = require("microsoft-cognitiveservices-speech-sdk");
35+
```
36+
37+
For more information on `require`, see <a href="https://nodejs.org/en/knowledge/getting-started/what-is-require/" target="_blank">what is require? <span class="docon docon-navigate-external x-hidden-focus"></span></a>.
38+
39+
40+
# [script](#tab/script)
41+
42+
Download and extract the <a href="https://aka.ms/csspeech/jsbrowserpackage" target="_blank">JavaScript Speech SDK <span class="docon docon-navigate-external x-hidden-focus"></span></a> *microsoft.cognitiveservices.speech.sdk.bundle.js* file, and place it in a folder accessible to your HTML file.
43+
44+
```html
45+
<script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>
46+
```
47+
48+
> [!TIP]
49+
> If you're targeting a web browser and using the `<script>` tag, the `sdk` prefix is not needed. The `sdk` prefix is an alias we use to name our `import` or `require` module.
50+
51+
---
52+
53+
## Create a speech configuration
54+
55+
To call the Speech service using the Speech SDK, you need to create a [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest). This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
56+
57+
> [!NOTE]
58+
> Regardless of whether you're performing speech recognition, speech synthesis, translation, or intent recognition, you'll always create a configuration.
59+
60+
There are a few ways that you can initialize a [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest):
61+
62+
* With a subscription: pass in a key and the associated region.
63+
* With an endpoint: pass in a Speech service endpoint. A key or authorization token is optional.
64+
* With a host: pass in a host address. A key or authorization token is optional.
65+
* With an authorization token: pass in an authorization token and the associated region.
66+
67+
Let's take a look at how a [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest) is created using a key and region. See the [region support](https://docs.microsoft.com/azure/cognitive-services/speech-service/regions#speech-sdk) page to find your region identifier.
68+
69+
```javascript
70+
const speechConfig = sdk.SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
71+
```
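
Hard-coding the key in source code makes it easy to leak. Here's a minimal sketch that assumes the key and region are supplied through environment variables. The helper `readSpeechCredentials` and the variable names `SPEECH_KEY` and `SPEECH_REGION` are hypothetical and are not part of the Speech SDK:

```javascript
// Hypothetical helper: reads the key and region from an environment-like object.
// SPEECH_KEY and SPEECH_REGION are illustrative names, not SDK conventions.
function readSpeechCredentials(env) {
    const key = (env.SPEECH_KEY || "").trim();
    const region = (env.SPEECH_REGION || "").trim();
    if (!key || !region) {
        throw new Error("Set SPEECH_KEY and SPEECH_REGION before creating a SpeechConfig.");
    }
    return { key, region };
}

// Usage (in Node.js, with the SDK imported as `sdk`):
// const { key, region } = readSpeechCredentials(process.env);
// const speechConfig = sdk.SpeechConfig.fromSubscription(key, region);
```

Failing early with a clear message is usually easier to debug than a canceled recognition caused by an empty key.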
72+
73+
## Initialize a recognizer
74+
75+
After you've created a [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest), the next step is to initialize a [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest). When you initialize a [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest), you'll need to pass it your `speechConfig`. This provides the credentials that the speech service requires to validate your request.
76+
77+
If you're recognizing speech using your device's default microphone, here's what the [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest) should look like:
78+
79+
```javascript
80+
const recognizer = new sdk.SpeechRecognizer(speechConfig);
81+
```
82+
83+
If you want to specify the audio input device, then you'll need to create an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest) and provide the `audioConfig` parameter when initializing your [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest).
84+
85+
> [!TIP]
86+
> [Learn how to get the device ID for your audio input device](../../../how-to-select-audio-input-devices.md).
87+
88+
Reference the `AudioConfig` object as follows:
89+
90+
```javascript
91+
const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
92+
const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);
93+
```
94+
95+
If you want to provide an audio file instead of using a microphone, you'll still need to provide an `audioConfig`. However, this can only be done when targeting **Node.js**. When you create an [`AudioConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest), instead of calling `fromDefaultMicrophoneInput`, you'll call `fromWavFileInput` and pass the `filename` parameter.
96+
97+
```javascript
98+
const audioConfig = sdk.AudioConfig.fromWavFileInput("YourAudioFile.wav");
99+
const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);
100+
```
101+
102+
## Recognize speech
103+
104+
The [Recognizer class](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest) for the Speech SDK for JavaScript exposes a few methods that you can use for speech recognition.
105+
106+
* Single-shot recognition (async) - Performs recognition in a non-blocking (asynchronous) mode. This will recognize a single utterance. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed.
107+
* Continuous recognition (async) - Asynchronously initiates continuous recognition operation. The user registers to events and handles various application state. To stop asynchronous continuous recognition, call [`stopContinuousRecognitionAsync`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#stopcontinuousrecognitionasync).
108+
109+
> [!NOTE]
110+
> Learn more about how to [choose a speech recognition mode](../../../how-to-choose-recognition-mode.md).
111+
112+
### Single-shot recognition
113+
114+
Here's an example of asynchronous single-shot recognition using [`recognizeOnceAsync`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#recognizeonceasync):
115+
116+
```javascript
117+
recognizer.recognizeOnceAsync(result => {
118+
// Interact with result
119+
});
120+
```
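
If you prefer `async`/`await` over callbacks, the call above can be wrapped in a Promise. This is a minimal sketch: `recognizeOnce` is a hypothetical helper name, and it assumes only that `recognizeOnceAsync` accepts a result callback followed by an error callback:

```javascript
// Hypothetical helper: wraps the callback-style recognizeOnceAsync in a Promise.
// `recognizer` is expected to expose recognizeOnceAsync(onResult, onError).
function recognizeOnce(recognizer) {
    return new Promise((resolve, reject) => {
        recognizer.recognizeOnceAsync(
            result => resolve(result),
            error => reject(new Error(String(error)))
        );
    });
}

// Usage inside an async function:
// const result = await recognizeOnce(recognizer);
// console.log(result.text);
```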
121+
122+
You'll need to write some code to handle the result. This sample evaluates the [`result.reason`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognitionresult?view=azure-node-latest#reason):
123+
124+
* Prints the recognition result: `ResultReason.RecognizedSpeech`
125+
* If there is no recognition match, inform the user: `ResultReason.NoMatch`
126+
* If an error is encountered, print the error message: `ResultReason.Canceled`
127+
128+
```javascript
129+
switch (result.reason) {
130+
case sdk.ResultReason.RecognizedSpeech:
131+
console.log(`RECOGNIZED: Text=${result.text}`);
133+
break;
134+
case sdk.ResultReason.NoMatch:
135+
console.log("NOMATCH: Speech could not be recognized.");
136+
break;
137+
case sdk.ResultReason.Canceled:
138+
const cancellation = sdk.CancellationDetails.fromResult(result);
139+
console.log(`CANCELED: Reason=${cancellation.reason}`);
140+
141+
if (cancellation.reason == sdk.CancellationReason.Error) {
142+
console.log(`CANCELED: ErrorCode=${cancellation.ErrorCode}`);
143+
console.log(`CANCELED: ErrorDetails=${cancellation.errorDetails}`);
144+
console.log("CANCELED: Did you update the subscription info?");
145+
}
146+
break;
147+
}
148+
}
149+
```
150+
151+
### Continuous recognition
152+
153+
Continuous recognition is a bit more involved than single-shot recognition. It requires you to subscribe to the `recognizing`, `recognized`, and `canceled` events to get the recognition results. To stop recognition, you must call [`stopContinuousRecognitionAsync`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#stopcontinuousrecognitionasync). Here's an example of how continuous recognition is performed.
154+
155+
Let's start by defining the input and initializing a [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest):
156+
157+
```javascript
158+
const recognizer = new sdk.SpeechRecognizer(speechConfig);
159+
```
160+
161+
We'll subscribe to the events sent from the [`SpeechRecognizer`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest).
162+
163+
* [`recognizing`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#recognizing): Signal for events containing intermediate recognition results.
164+
* [`recognized`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#recognized): Signal for events containing final recognition results (indicating a successful recognition attempt).
165+
* [`sessionStopped`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#sessionstopped): Signal for events indicating the end of a recognition session (operation).
166+
* [`canceled`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#canceled): Signal for events containing canceled recognition results (indicating a recognition attempt that was canceled as a result of a direct cancellation request or, alternatively, a transport or protocol failure).
167+
168+
```javascript
169+
recognizer.recognizing = (s, e) => {
170+
console.log(`RECOGNIZING: Text=${e.result.text}`);
171+
};
172+
173+
recognizer.recognized = (s, e) => {
174+
if (e.result.reason == sdk.ResultReason.RecognizedSpeech) {
175+
console.log(`RECOGNIZED: Text=${e.result.text}`);
176+
}
177+
else if (e.result.reason == sdk.ResultReason.NoMatch) {
178+
console.log("NOMATCH: Speech could not be recognized.");
179+
}
180+
};
181+
182+
recognizer.canceled = (s, e) => {
183+
console.log(`CANCELED: Reason=${e.reason}`);
184+
185+
if (e.reason == sdk.CancellationReason.Error) {
186+
console.log(`CANCELED: ErrorCode=${e.errorCode}`);
187+
console.log(`CANCELED: ErrorDetails=${e.errorDetails}`);
188+
console.log("CANCELED: Did you update the subscription info?");
189+
}
190+
191+
recognizer.stopContinuousRecognitionAsync();
192+
};
193+
194+
recognizer.sessionStopped = (s, e) => {
195+
console.log("\n Session stopped event.");
196+
recognizer.stopContinuousRecognitionAsync();
197+
};
198+
```
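
In practice you'll often want to keep the final results rather than only log them. The following is a minimal sketch of a handler that collects recognized text into a transcript. `createTranscriptCollector` is a hypothetical helper, and it assumes the event argument exposes `result.reason` and `result.text`:

```javascript
// Hypothetical helper: builds a `recognized` handler that collects final results.
// `recognizedSpeechReason` would be sdk.ResultReason.RecognizedSpeech in real use.
function createTranscriptCollector(recognizedSpeechReason) {
    const segments = [];
    return {
        // Handler with the (sender, event) shape used by the recognizer events.
        onRecognized(sender, event) {
            const result = event.result;
            if (result.reason === recognizedSpeechReason && result.text) {
                segments.push(result.text);
            }
        },
        // Joins everything recognized so far into a single string.
        transcript() {
            return segments.join(" ");
        }
    };
}

// Usage:
// const collector = createTranscriptCollector(sdk.ResultReason.RecognizedSpeech);
// recognizer.recognized = collector.onRecognized;
```

The reason value is passed in rather than read from `sdk` so that the helper can be exercised without loading the SDK.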
199+
200+
With everything set up, we can call [`startContinuousRecognitionAsync`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechrecognizer?view=azure-node-latest#startcontinuousrecognitionasync).
201+
202+
```javascript
203+
// Starts continuous recognition. Uses stopContinuousRecognitionAsync() to stop recognition.
204+
recognizer.startContinuousRecognitionAsync();
205+
206+
// Later, call the following to stop recognition.
207+
// recognizer.stopContinuousRecognitionAsync();
208+
```
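
One simple way to decide when to stop is a time limit. Here's a minimal sketch, assuming `stopContinuousRecognitionAsync` accepts a success callback followed by an error callback. `stopAfter` and its `schedule` parameter are hypothetical names introduced for illustration:

```javascript
// Hypothetical helper: stops continuous recognition after `delayMs` milliseconds.
// `schedule` defaults to setTimeout and is a parameter only to ease testing.
function stopAfter(recognizer, delayMs, schedule = setTimeout) {
    return schedule(() => {
        recognizer.stopContinuousRecognitionAsync(
            () => console.log("Recognition stopped."),
            error => console.error(`Failed to stop: ${error}`)
        );
    }, delayMs);
}

// Usage: stop roughly 30 seconds after starting.
// recognizer.startContinuousRecognitionAsync();
// stopAfter(recognizer, 30000);
```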
209+
210+
### Dictation mode
211+
212+
When using continuous recognition, you can enable dictation processing by using the corresponding "enable dictation" function. This mode will cause the speech config instance to interpret word descriptions of sentence structures such as punctuation. For example, the utterance "Do you live in town question mark" would be interpreted as the text "Do you live in town?".
213+
214+
To enable dictation mode, use the [`enableDictation`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest#enabledictation--) method on your [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest).
215+
216+
```javascript
217+
speechConfig.enableDictation();
218+
```
219+
220+
## Change source language
221+
222+
A common task for speech recognition is specifying the input (or source) language. Let's take a look at how you would change the input language to Italian. In your code, find your [`SpeechConfig`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest), then add this line directly below it.
223+
224+
```javascript
225+
speechConfig.speechRecognitionLanguage = "it-IT";
226+
```
227+
228+
The [`speechRecognitionLanguage`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/speechconfig?view=azure-node-latest#speechrecognitionlanguage) property expects a language-locale format string. You can provide any value in the **Locale** column in the list of supported [locales/languages](../../../language-support.md).
229+
230+
## Improve recognition accuracy
231+
232+
There are a few ways to improve recognition accuracy with the Speech SDK. Let's take a look at Phrase Lists. Phrase Lists are used to identify known phrases in audio data, like a person's name or a specific location. Single words or complete phrases can be added to a Phrase List. During recognition, an entry in a phrase list is used if an exact match for the entire phrase is included in the audio. If an exact match to the phrase is not found, recognition is not assisted.
233+
234+
> [!IMPORTANT]
235+
> The Phrase List feature is only available in English.
236+
237+
To use a phrase list, first create a [`PhraseListGrammar`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/phraselistgrammar?view=azure-node-latest) object, then add specific words and phrases with [`addPhrase`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/phraselistgrammar?view=azure-node-latest#addphrase-string-).
238+
239+
Any changes to [`PhraseListGrammar`](https://docs.microsoft.com/javascript/api/microsoft-cognitiveservices-speech-sdk/phraselistgrammar?view=azure-node-latest) take effect on the next recognition or after a reconnection to the Speech service.
240+
241+
```javascript
242+
const phraseList = sdk.PhraseListGrammar.fromRecognizer(recognizer);
243+
phraseList.addPhrase("Supercalifragilisticexpialidocious");
244+
```
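
When you have many phrases, for example names loaded from a contact list, you can add them in a loop. This is a minimal sketch: `addPhrases` is a hypothetical helper that relies only on the `addPhrase` method shown above, skipping blank and duplicate entries:

```javascript
// Hypothetical helper: adds each distinct, non-empty phrase to a phrase list.
// `phraseList` is expected to expose addPhrase(phrase), as PhraseListGrammar does.
function addPhrases(phraseList, phrases) {
    const seen = new Set();
    for (const phrase of phrases) {
        const trimmed = phrase.trim();
        if (trimmed && !seen.has(trimmed)) {
            seen.add(trimmed);
            phraseList.addPhrase(trimmed);
        }
    }
    // Number of phrases actually added.
    return seen.size;
}

// Usage:
// const phraseList = sdk.PhraseListGrammar.fromRecognizer(recognizer);
// addPhrases(phraseList, ["Contoso", "Jessie", "Rehaan"]);
```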
245+
246+
If you need to clear your phrase list:
247+
248+
```javascript
249+
phraseList.clear();
250+
```
251+
252+
### Other options to improve recognition accuracy
253+
254+
Phrase lists are only one option to improve recognition accuracy. You can also:
255+
256+
* [Improve accuracy with Custom Speech](../../../how-to-custom-speech.md)
257+
* [Improve accuracy with tenant models](../../../tutorial-tenant-model.md)

articles/cognitive-services/Speech-Service/speech-to-text-basics.md

Lines changed: 6 additions & 2 deletions
@@ -8,9 +8,9 @@ manager: nitinme
88
ms.service: cognitive-services
99
ms.subservice: speech-service
1010
ms.topic: quickstart
11-
ms.date: 03/13/2020
11+
ms.date: 04/14/2020
1212
ms.author: dapine
13-
zone_pivot_groups: programming-languages-set-two
13+
zone_pivot_groups: programming-languages-set-sixteen
1414
---
1515

1616
# Learn the basics of speech recognition
@@ -33,6 +33,10 @@ One of the core features of the Speech service is the ability to recognize and t
3333
[!INCLUDE [Java Basics include](includes/how-to/speech-to-text-basics/speech-to-text-basics-java.md)]
3434
::: zone-end
3535

36+
::: zone pivot="programming-language-javascript"
37+
[!INCLUDE [JavaScript Basics include](includes/how-to/speech-to-text-basics/speech-to-text-basics-javascript.md)]
38+
::: zone-end
39+
3640
::: zone pivot="programming-language-python"
3741
[!INCLUDE [Python Basics include](./includes/how-to/speech-to-text-basics/speech-to-text-basics-python.md)]
3842
::: zone-end

articles/zone-pivot-groups.yml

Lines changed: 16 additions & 0 deletions
@@ -278,5 +278,21 @@ groups:
278278
title: C#
279279
- id: programming-language-java
280280
title: Java
281+
- id: programming-language-more
282+
title: More languages...
283+
- id: programming-languages-set-sixteen
284+
title: Programming languages
285+
prompt: Choose a programming language
286+
pivots:
287+
- id: programming-language-csharp
288+
title: C#
289+
- id: programming-language-cpp
290+
title: C++
291+
- id: programming-language-java
292+
title: Java
293+
- id: programming-language-javascript
294+
title: JavaScript
295+
- id: programming-language-python
296+
title: Python
281297
- id: programming-language-more
282298
title: More languages...
