articles/cognitive-services/Speech-Service/includes/how-to/text-to-speech-basics/text-to-speech-basics-csharp.md
await synthesizer.SpeakTextAsync("Synthesizing directly to speaker output.");
}
```
It's simple to make this change from the previous example. First, remove the `AudioConfig`, as you will manage the output behavior manually from this point onward for increased control.
> Passing `null` for the `AudioConfig`, rather than omitting it like in the speaker output example
> above, will not play the audio by default on the current active output device.
This time, you save the result to a [`SpeechSynthesisResult`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechsynthesisresult?view=azure-dotnet) variable. The `AudioData` property contains a `byte []` of the output data. You can work with this `byte []` manually, or you can use the [`AudioDataStream`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.audiodatastream?view=azure-dotnet) class to manage the in-memory stream. In this example, you use the `AudioDataStream.FromResult()` static function to get a stream from the result.
From here you can implement any custom behavior using the resulting `stream` object.
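Putting these steps together, a minimal sketch of in-memory synthesis might look like the following. The subscription key, region, and spoken text are placeholders, not values from this article:

```csharp
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholder credentials: substitute your own key and service region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Passing null for the AudioConfig suppresses playback on the default output device.
        using var synthesizer = new SpeechSynthesizer(config, null as AudioConfig);
        var result = await synthesizer.SpeakTextAsync("Getting the response as an in-memory stream.");

        // AudioDataStream wraps the result so you can read, seek, or save the audio.
        using var stream = AudioDataStream.FromResult(result);
    }
}
```

The `null as AudioConfig` cast makes the intended constructor overload explicit, which is what distinguishes "no playback" from the default-speaker behavior in the earlier example.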
## Customize audio format
The following section shows how to customize audio output attributes including:
To change the audio format, you use the `SetSpeechSynthesisOutputFormat()` function on the `SpeechConfig` object. This function expects an `enum` of type [`SpeechSynthesisOutputFormat`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechsynthesisoutputformat?view=azure-dotnet), which you use to select the output format. See the reference docs for a [list of audio formats](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.speechsynthesisoutputformat?view=azure-dotnet) that are available.
There are various options for different file types depending on your requirements. Note that by definition, raw formats like `Raw24Khz16BitMonoPcm` do not include audio headers. Use raw formats only when you know your downstream implementation can decode a raw bitstream, or if you plan on manually building headers based on bit-depth, sample-rate, number of channels, etc.
In this example, you specify a high-fidelity RIFF format `Riff24Khz16BitMonoPcm` by setting the `SpeechSynthesisOutputFormat` on the `SpeechConfig` object. Similar to the example in the previous section, you use [`AudioDataStream`](https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.audiodatastream?view=azure-dotnet) to get an in-memory stream of the result, and then write it to a file.
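As a sketch, with placeholder credentials and output path, the format selection and file write might look like this:

```csharp
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholder credentials: substitute your own key and service region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Select a RIFF (headered) PCM output format before creating the synthesizer.
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm);

        using var synthesizer = new SpeechSynthesizer(config, null as AudioConfig);
        var result = await synthesizer.SpeakTextAsync("Customizing audio output format.");

        // Save the in-memory stream to disk; the path is a placeholder.
        using var stream = AudioDataStream.FromResult(result);
        await stream.SaveToWaveFileAsync("path/to/write/file.wav");
    }
}
```

Because `Riff24Khz16BitMonoPcm` is a headered format, `SaveToWaveFileAsync()` produces a playable file without any manual header construction.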
Running your program again will write a `.wav` file to the specified path.
## Use SSML to customize speech characteristics
First, create a new XML file for the SSML config in your root project directory.
</speak>
```
Next, you need to change the speech synthesis request to reference your XML file. The request is mostly the same, but instead of using the `SpeakTextAsync()` function, you use `SpeakSsmlAsync()`. This function expects an XML string, so you first load your SSML config as a string using `File.ReadAllText()`. From here, the result object is exactly the same as previous examples.
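A minimal sketch of the SSML request might look like the following. The credentials, the `ssml.xml` file name, and the output path are placeholder assumptions:

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholder credentials: substitute your own key and service region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        using var synthesizer = new SpeechSynthesizer(config, null as AudioConfig);

        // SpeakSsmlAsync expects raw XML text, so load the SSML file as a string.
        // "ssml.xml" is an assumed file name for this sketch.
        string ssml = File.ReadAllText("./ssml.xml");
        var result = await synthesizer.SpeakSsmlAsync(ssml);

        using var stream = AudioDataStream.FromResult(result);
        await stream.SaveToWaveFileAsync("path/to/write/file.wav");
    }
}
```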
> [!NOTE]
> If you're using Visual Studio, your build config likely will not find your XML file by default. To fix this, right click the XML file and select **Properties**. Change **Build Action** to *Content*, and change **Copy to Output Directory** to *Copy always*.