`articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-cpp.md` (3 additions, 3 deletions)

@@ -97,7 +97,7 @@ void synthesizeSpeech()
Run the program, and a synthesized `.wav` file is written to the location you specified. This is a good example of the most basic usage, but next you look at customizing output and handling the output response as an in-memory stream for working with custom scenarios.
- ###Synthesize to speaker output
+ ## Synthesize to speaker output
In some cases, you may want to output synthesized speech directly to a speaker. To do this, simply omit the `AudioConfig` param when creating the `SpeechSynthesizer` in the example above. This outputs to the current active output device.

@@ -121,7 +121,7 @@ For many scenarios in speech application development, you likely need the result
It's simple to make this change from the previous example. First, remove the `AudioConfig`, as you will manage the output behavior manually from this point onward for increased control. Then pass `NULL` for the `AudioConfig` in the `SpeechSynthesizer` constructor.
> [!NOTE]
- > Passing `NULL` for the `AudioConfig`, rather than omitting it like in the speaker output example
+ > Passing `NULL` for the `AudioConfig`, rather than omitting it like in the speaker output example
> above, will not play the audio by default on the current active output device.
This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/cpp/cognitive-services/speech/speechsynthesisresult) variable. The `GetAudioData` getter returns a `byte []` of the output data. You can work with this `byte []` manually, or you can use the [`AudioDataStream`](https://docs.microsoft.com/cpp/cognitive-services/speech/audiodatastream) class to manage the in-memory stream. In this example you use the `AudioDataStream.FromResult()` static function to get a stream from the result.
@@ -218,7 +218,7 @@ The output works, but there a few simple additional changes you can make to help
</speak>
```
- ###Neural voices
+ ## Neural voices
Neural voices are speech synthesis algorithms powered by deep neural networks. When you use a neural voice, the synthesized speech is nearly indistinguishable from human recordings. With human-like natural prosody and clear articulation of words, neural voices significantly reduce listening fatigue when users interact with AI systems.

`articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-csharp.md`
Run the program, and a synthesized `.wav` file is written to the location you specified. This is a good example of the most basic usage, but next you look at customizing output and handling the output response as an in-memory stream for working with custom scenarios.
- ###Synthesize to speaker output
+ ## Synthesize to speaker output
In some cases, you may want to output synthesized speech directly to a speaker. To do this, simply omit the `AudioConfig` param when creating the `SpeechSynthesizer` in the example above. This outputs to the current active output device.

@@ -116,7 +116,7 @@ For many scenarios in speech application development, you likely need the result
It's simple to make this change from the previous example. First, remove the `AudioConfig` block, as you will manage the output behavior manually from this point onward for increased control. Then pass `null` for the `AudioConfig` in the `SpeechSynthesizer` constructor.
> [!NOTE]
- > Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example
+ > Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example
> above, will not play the audio by default on the current active output device.
This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechsynthesisresult?view=azure-dotnet) variable. The `AudioData` property contains a `byte []` of the output data. You can work with this `byte []` manually, or you can use the [`AudioDataStream`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.audiodatastream?view=azure-dotnet) class to manage the in-memory stream. In this example you use the `AudioDataStream.FromResult()` static function to get a stream from the result.
@@ -211,7 +211,7 @@ The output works, but there a few simple additional changes you can make to help
</speak>
```
- ###Neural voices
+ ## Neural voices
Neural voices are speech synthesis algorithms powered by deep neural networks. When you use a neural voice, the synthesized speech is nearly indistinguishable from human recordings. With human-like natural prosody and clear articulation of words, neural voices significantly reduce listening fatigue when users interact with AI systems.

`articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-java.md` (3 additions, 3 deletions)
@@ -85,7 +85,7 @@ public static void main(String[] args) {
Run the program, and a synthesized `.wav` file is written to the location you specified. This is a good example of the most basic usage, but next you look at customizing output and handling the output response as an in-memory stream for working with custom scenarios.
- ###Synthesize to speaker output
+ ## Synthesize to speaker output
In some cases, you may want to output synthesized speech directly to a speaker. To do this, instantiate the `AudioConfig` using the `fromDefaultSpeakerOutput()` static function. This outputs to the current active output device.

@@ -110,7 +110,7 @@ For many scenarios in speech application development, you likely need the result
It's simple to make this change from the previous example. First, remove the `AudioConfig` block, as you will manage the output behavior manually from this point onward for increased control. Then pass `null` for the `AudioConfig` in the `SpeechSynthesizer` constructor.
> [!NOTE]
- > Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example
+ > Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example
> above, will not play the audio by default on the current active output device.
This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/java/api/com.microsoft.cognitiveservices.speech.speechsynthesisresult?view=azure-java-stable) variable. The `SpeechSynthesisResult.getAudioData()` function returns a `byte []` of the output data. You can work with this `byte []` manually, or you can use the [`AudioDataStream`](https://docs.microsoft.com/java/api/com.microsoft.cognitiveservices.speech.audiodatastream?view=azure-java-stable) class to manage the in-memory stream. In this example you use the `AudioDataStream.fromResult()` static function to get a stream from the result.
@@ -217,7 +217,7 @@ The output works, but there a few simple additional changes you can make to help
</speak>
```
- ###Neural voices
+ ## Neural voices
Neural voices are speech synthesis algorithms powered by deep neural networks. When you use a neural voice, the synthesized speech is nearly indistinguishable from human recordings. With human-like natural prosody and clear articulation of words, neural voices significantly reduce listening fatigue when users interact with AI systems.

`articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-python.md` (5 additions, 4 deletions)
@@ -36,7 +36,8 @@ from azure.cognitiveservices.speech.audio import AudioOutputConfig
To call the Speech service using the Speech SDK, you need to create a [`SpeechConfig`](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.speechconfig?view=azure-python). This class includes information about your subscription, like your key and associated region, endpoint, host, or authorization token.
> [!NOTE]
- > Regardless of whether you're performing speech recognition, speech synthesis, translation, or intent recognition, you'll always create a configuration.
+ > Regardless of whether you're performing speech recognition, speech synthesis, translation, or intent
+ > recognition, you'll always create a configuration.
There are a few ways that you can initialize a [`SpeechConfig`](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.speechconfig?view=azure-python):
@@ -70,7 +71,7 @@ synthesizer.speak_text_async("A simple test to write to a file.")
Run the program, and a synthesized `.wav` file is written to the location you specified. This is a good example of the most basic usage, but next you look at customizing output and handling the output response as an in-memory stream for working with custom scenarios.
- ###Synthesize to speaker output
+ ## Synthesize to speaker output
In some cases, you may want to output synthesized speech directly to a speaker. To do this, use the example in the previous section, but change the `AudioOutputConfig` by removing the `filename` param, and set `use_default_speaker=True`. This outputs to the current active output device.
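
The file-vs-speaker choice described above can be sketched as a small helper. This is a minimal sketch, assuming the `azure-cognitiveservices-speech` package; the helper function itself is hypothetical, not part of the SDK:

```python
def audio_output_kwargs(filename=None):
    """Return keyword arguments for the Speech SDK's AudioOutputConfig:
    a file target when a filename is given, otherwise the default speaker
    (mirroring the text above). This helper is hypothetical."""
    if filename is not None:
        return {"filename": filename}
    return {"use_default_speaker": True}

# With the SDK installed (assumption), usage would look like:
# from azure.cognitiveservices.speech.audio import AudioOutputConfig
# audio_config = AudioOutputConfig(**audio_output_kwargs())  # speaker output
# audio_config = AudioOutputConfig(**audio_output_kwargs("file.wav"))  # file
```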

@@ -89,7 +90,7 @@ For many scenarios in speech application development, you likely need the result
It's simple to make this change from the previous example. First, remove the `AudioConfig`, as you will manage the output behavior manually from this point onward for increased control. Then pass `None` for the `AudioConfig` in the `SpeechSynthesizer` constructor.
> [!NOTE]
- > Passing `None` for the `AudioConfig`, rather than omitting it like in the speaker output example
+ > Passing `None` for the `AudioConfig`, rather than omitting it like in the speaker output example
> above, will not play the audio by default on the current active output device.
This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.speechsynthesisresult?view=azure-python) variable. The `audio_data` property contains a `bytes` object of the output data. You can work with this object manually, or you can use the [`AudioDataStream`](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.audiodatastream?view=azure-python) class to manage the in-memory stream. In this example you use the `AudioDataStream` constructor to get a stream from the result.
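
As an illustration of working with the raw `bytes` manually, here is a stdlib-only sketch (not the SDK's `AudioDataStream`) that wraps WAV bytes in an in-memory stream and reads the PCM frames back:

```python
import io
import wave

def wav_bytes_to_frames(audio_data: bytes) -> bytes:
    """Read PCM frames from in-memory WAV bytes, such as the bytes object
    a synthesis result's audio_data property returns (per the text above),
    without ever touching the file system."""
    with wave.open(io.BytesIO(audio_data), "rb") as wav:
        return wav.readframes(wav.getnframes())
```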
@@ -171,7 +172,7 @@ The output works, but there a few simple additional changes you can make to help
</speak>
```
- ###Neural voices
+ ## Neural voices
Neural voices are speech synthesis algorithms powered by deep neural networks. When you use a neural voice, the synthesized speech is nearly indistinguishable from human recordings. With human-like natural prosody and clear articulation of words, neural voices significantly reduce listening fatigue when users interact with AI systems.
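
You request a particular neural voice by wrapping your text in SSML, as in the `<speak>` examples above. Here is a minimal sketch of building such a document; the voice name `en-US-AriaNeural` is an assumption, so check the service's published voice list:

```python
def build_ssml(text, voice="en-US-AriaNeural", lang="en-US"):
    """Build a minimal SSML document selecting a voice by name.
    The default voice name is an assumption, not guaranteed by this doc."""
    return (
        f"<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' "
        f"xml:lang='{lang}'>"
        f"<voice name='{voice}'>{text}</voice>"
        "</speak>"
    )

# The resulting string would be passed to the synthesizer's SSML method
# (for example, speak_ssml_async in the Python SDK) instead of plain text.
```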

+ description: Learn how to use the SPX command line tool to work with the Speech SDK with no code and minimal setup.
+ services: cognitive-services
+ author: trevorbye
+ manager: nitinme
+ ms.service: cognitive-services
+ ms.subservice: speech-service
+ ms.topic: quickstart
+ ms.date: 04/04/2020
+ ms.author: trbye
+ ---
+
+ # Learn the basics of SPX
+
+ ## Prerequisites
+
+ The only prerequisite is an Azure Speech subscription. See the [guide](get-started.md#new-resource) on creating a new subscription if you don't already have one.
+
+ ## Download and install
+
+ ## Create subscription config
+
+ To start using SPX, you first need to enter your Speech subscription key and region information. See the [region support](https://docs.microsoft.com/azure/cognitive-services/speech-service/regions#speech-sdk) page to find your region identifier. Once you have your subscription key and region identifier (for example, `eastus` or `westus`), run the following commands.
+
+ ```shell
+ spx config @key --set YOUR-SUBSCRIPTION-KEY
+ spx config @region --set YOUR-REGION-ID
+ ```
+
+ Your subscription authentication is now stored for future SPX requests. If you need to remove either of these stored values, run `spx config @region --clear` or `spx config @key --clear`.