Commit b235c81: Merge pull request #100784 from IEvangelist/speechModes
[CogSvcs] reco modes doc

2 files changed: +211 −0

Lines changed: 209 additions & 0 deletions
---
title: Choose a speech recognition mode with the Speech SDK
titleSuffix: Azure Cognitive Services
description: Learn how to choose the best recognition mode when using the Speech SDK.
services: cognitive-services
author: IEvangelist
manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: conceptual
ms.date: 01/13/2020
ms.author: dapine
zone_pivot_groups: programming-languages-set-two
---

# Choose a speech recognition mode

When considering speech-to-text recognition operations, the [Speech SDK](speech-sdk.md) provides multiple modes for processing speech, sometimes referred to as the *recognition mode*. This article compares the various recognition modes.

## Recognize once

If you want to process each utterance one "sentence" at a time, use the "recognize once" function. This function detects a recognized utterance from the input, starting at the beginning of detected speech and continuing until the next pause. Usually, a pause marks the end of a sentence or line of thought.

At the end of one recognized utterance, the service stops processing audio from that request. The maximum duration for a single recognition is 20 seconds.

::: zone pivot="programming-language-csharp"

For more information on using the `RecognizeOnceAsync` function, see the [.NET Speech SDK docs](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechrecognizer.recognizeonceasync?view=azure-dotnet#Microsoft_CognitiveServices_Speech_SpeechRecognizer_RecognizeOnceAsync).

```csharp
var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);
```

::: zone-end
::: zone pivot="programming-language-cpp"

For more information on using the `RecognizeOnceAsync` function, see the [C++ Speech SDK docs](https://docs.microsoft.com/cpp/cognitive-services/speech/asyncrecognizer#recognizeonceasync).

```cpp
auto result = recognizer->RecognizeOnceAsync().get();
```

::: zone-end
::: zone pivot="programming-language-java"

For more information on using the `recognizeOnceAsync` function, see the [Java Speech SDK docs](https://docs.microsoft.com/java/api/com.microsoft.cognitiveservices.speech.SpeechRecognizer.recognizeOnceAsync?view=azure-java-stable).

```java
SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
```

::: zone-end
::: zone pivot="programming-language-python"

For more information on using the `recognize_once` function, see the [Python Speech SDK docs](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.speechrecognizer?view=azure-python#recognize-once------azure-cognitiveservices-speech-speechrecognitionresult).

```python
result = speech_recognizer.recognize_once()
```

::: zone-end
::: zone pivot="programming-language-more"

For additional languages, see the [Speech SDK reference docs](speech-to-text.md#speech-sdk-reference-docs).

::: zone-end

## Continuous

If you need long-running recognition, use the start and corresponding stop functions for continuous recognition. The start function starts processing all utterances and continues until you invoke the stop function, or until too much time passes in silence. When using continuous mode, be sure to register handlers for the various events that fire during recognition. For example, the "recognized" event fires when speech is successfully recognized, and you need an event handler in place to process the result. The Speech service enforces a limit of 10 minutes of total speech recognition time per session.

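The register-then-start-then-stop flow described above can be sketched without the Speech SDK at all. The following pure-Python simulation is only an illustration of the callback pattern; the `RecognitionEvent` and `SimulatedRecognizer` classes are hypothetical stand-ins, not SDK types:

```python
class RecognitionEvent:
    """Minimal event object: handlers registered via connect() are called on fire()."""

    def __init__(self):
        self._handlers = []

    def connect(self, handler):
        self._handlers.append(handler)

    def fire(self, payload):
        for handler in self._handlers:
            handler(payload)


class SimulatedRecognizer:
    """Stand-in recognizer that replays canned utterances through the event."""

    def __init__(self, utterances):
        self.recognized = RecognitionEvent()
        self._utterances = utterances

    def start_continuous_recognition(self):
        # In the real SDK this runs until you stop it or silence times out;
        # here we simply replay the canned utterances.
        for text in self._utterances:
            self.recognized.fire(text)

    def stop_continuous_recognition(self):
        pass


results = []
recognizer = SimulatedRecognizer(["first utterance", "second utterance"])
recognizer.recognized.connect(results.append)  # register the handler before starting
recognizer.start_continuous_recognition()
recognizer.stop_continuous_recognition()
print(results)  # prints ['first utterance', 'second utterance']
```

The key point the simulation makes is ordering: handlers must be connected before recognition starts, or early events are missed.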
::: zone pivot="programming-language-csharp"

```csharp
// Subscribe to event
recognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        // Do something with the recognized text
        // e.Result.Text
    }
};

// Start continuous speech recognition
await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);

// Stop continuous speech recognition
await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
```

::: zone-end
::: zone pivot="programming-language-cpp"

```cpp
// Connect to event
recognizer->Recognized.Connect([] (const SpeechRecognitionEventArgs& e)
{
    if (e.Result->Reason == ResultReason::RecognizedSpeech)
    {
        // Do something with the recognized text
        // e.Result->Text
    }
});

// Start continuous speech recognition
recognizer->StartContinuousRecognitionAsync().get();

// Stop continuous speech recognition
recognizer->StopContinuousRecognitionAsync().get();
```

::: zone-end
::: zone pivot="programming-language-java"

```java
recognizer.recognized.addEventListener((s, e) -> {
    if (e.getResult().getReason() == ResultReason.RecognizedSpeech) {
        // Do something with the recognized text
        // e.getResult().getText()
    }
});

// Start continuous speech recognition
recognizer.startContinuousRecognitionAsync().get();

// Stop continuous speech recognition
recognizer.stopContinuousRecognitionAsync().get();
```

::: zone-end
::: zone pivot="programming-language-python"

```python
def recognized_cb(evt):
    if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
        # Do something with the recognized text
        pass  # evt.result.text

speech_recognizer.recognized.connect(recognized_cb)

# Start continuous speech recognition
speech_recognizer.start_continuous_recognition()

# Stop continuous speech recognition
speech_recognizer.stop_continuous_recognition()
```

::: zone-end
::: zone pivot="programming-language-more"

For additional languages, see the [Speech SDK reference docs](speech-to-text.md#speech-sdk-reference-docs).

::: zone-end

## Dictation

When using continuous recognition, you can enable dictation processing by using the corresponding "enable dictation" function. In this mode, the speech config instance interprets word descriptions of sentence structures, such as punctuation. For example, the utterance "Do you live in town question mark" would be interpreted as the text "Do you live in town?".

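To illustrate the kind of transformation that dictation processing performs, here is a small pure-Python sketch that maps spoken punctuation phrases to symbols. It is only a conceptual illustration, not how the Speech service implements dictation, and the phrase table (`question mark`, `exclamation point`, and so on) is an assumed example mapping:

```python
# Illustrative mapping of spoken punctuation phrases to symbols.
SPOKEN_PUNCTUATION = {
    "question mark": "?",
    "exclamation point": "!",
    "period": ".",
    "comma": ",",
}

def apply_spoken_punctuation(utterance: str) -> str:
    """Replace a trailing spoken-punctuation phrase with its symbol."""
    text = utterance
    for phrase, symbol in SPOKEN_PUNCTUATION.items():
        suffix = " " + phrase
        if text.endswith(suffix):
            text = text[: -len(suffix)] + symbol
    return text

print(apply_spoken_punctuation("Do you live in town question mark"))
# prints "Do you live in town?"
```

The dictation mode built into the Speech service handles this server-side; the per-language snippets below show how to turn it on.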
::: zone pivot="programming-language-csharp"

For more information on using the `EnableDictation` function, see the [.NET Speech SDK docs](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechconfig.enabledictation?view=azure-dotnet#Microsoft_CognitiveServices_Speech_SpeechConfig_EnableDictation).

```csharp
// Enable dictation
speechConfig.EnableDictation();
```

::: zone-end
::: zone pivot="programming-language-cpp"

For more information on using the `EnableDictation` function, see the [C++ Speech SDK docs](https://docs.microsoft.com/cpp/cognitive-services/speech/speechconfig#enabledictation).

```cpp
// Enable dictation
speechConfig->EnableDictation();
```

::: zone-end
::: zone pivot="programming-language-java"

For more information on using the `enableDictation` function, see the [Java Speech SDK docs](https://docs.microsoft.com/java/api/com.microsoft.cognitiveservices.speech.SpeechConfig.enableDictation?view=azure-java-stable).

```java
// Enable dictation
speechConfig.enableDictation();
```

::: zone-end
::: zone pivot="programming-language-python"

For more information on using the `enable_dictation` function, see the [Python Speech SDK docs](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.speechconfig?view=azure-python#enable-dictation--).

```python
# Enable dictation
speech_config.enable_dictation()
```

::: zone-end
::: zone pivot="programming-language-more"

For additional languages, see the [Speech SDK reference docs](speech-to-text.md#speech-sdk-reference-docs).

::: zone-end

## Next steps
> [!div class="nextstepaction"]
> [Explore additional Speech SDK samples on GitHub](https://aka.ms/csspeech/samples)

articles/cognitive-services/Speech-Service/toc.yml

Lines changed: 2 additions & 0 deletions

```diff
@@ -27,6 +27,8 @@
   items:
   - name: Change speech recognition source language
     href: how-to-specify-source-language.md
+  - name: Choose speech recognition mode
+    href: how-to-choose-recognition-mode.md
   - name: Improve accuracy with Custom Speech
     href: how-to-custom-speech.md
   - name: Improve accuracy with Phrase Lists
```
