articles/active-directory-b2c/partner-eid-me.md (+1 -1)
@@ -98,7 +98,7 @@ To configure your tenant application as an eID-ME relying party in eID-Me, suppl
 | Application privacy policy URL| Appears to the end user|

 >[!NOTE]
->When the relying party is configurede, ID-Me provides a Client ID and a Client Secret. Note the Client ID and Client Secret to configure the identity provider (IdP) in Azure AD B2C.
+>When the relying party is configured, ID-Me provides a Client ID and a Client Secret. Note the Client ID and Client Secret to configure the identity provider (IdP) in Azure AD B2C.
articles/ai-services/speech-service/audio-processing-overview.md (+4 -4)
@@ -6,7 +6,7 @@ author: hasyashah
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 09/07/2022
+ms.date: 1/18/2024
 ms.author: hasshah
 ms.custom: ignite-fall-2021
 ---
@@ -19,11 +19,11 @@ The Microsoft Audio Stack is a set of enhancements optimized for speech processi
 * **Beamforming** - Localize the origin of sound and optimize the audio signal using multiple microphones.
 * **Dereverberation** - Reduce the reflections of sound from surfaces in the environment.
 * **Acoustic echo cancellation** - Suppress audio being played out of the device while microphone input is active.
-* **Automatic gain control** - Dynamically adjust the person’s voice level to account for soft speakers, long distances, or non-calibrated microphones.
+* **Automatic gain control** - Dynamically adjust the person’s voice level to account for soft speakers, long distances, or noncalibrated microphones.

 [](media/audio-processing/mas-block-diagram.png#lightbox)

-Different scenarios and use-cases can require different optimizations that influence the behavior of the audio processing stack. For example, in telecommunications scenarios such as telephone calls, it is acceptable to have minor distortions in the audio signal after processing has been applied. This is because humans can continue to understand the speech with high accuracy. However, it is unacceptable and disruptive for a person to hear their own voice in an echo. This contrasts with speech processing scenarios, where distorted audio can adversely impact a machine-learned speech recognition model’s accuracy, but it is acceptable to have minor levels of echo residual.
+Different scenarios and use-cases can require different optimizations that influence the behavior of the audio processing stack. For example, in telecommunications scenarios such as telephone calls, it's acceptable to have minor distortions in the audio signal after processing has been applied. This is because humans can continue to understand the speech with high accuracy. However, it's unacceptable and disruptive for a person to hear their own voice in an echo. This contrasts with speech processing scenarios, where distorted audio can adversely affect a machine-learned speech recognition model's accuracy, but it's acceptable to have minor levels of echo residual.

 Processing is performed fully locally where the Speech SDK is being used. No audio data is streamed to Microsoft’s cloud services for processing by the Microsoft Audio Stack. The only exception to this is for the Conversation Transcription Service, where raw audio is sent to Microsoft’s cloud services for processing.
@@ -35,7 +35,7 @@ The Microsoft Audio Stack also powers a wide range of Microsoft products:

 The Speech SDK integrates Microsoft Audio Stack (MAS), allowing any application or product to use its audio processing capabilities on input audio. Some of the key Microsoft Audio Stack features available via the Speech SDK include:
 * **Real-time microphone input & file input** - Microsoft Audio Stack processing can be applied to real-time microphone input, streams, and file-based input.
-* **Selection of enhancements** - To allow for full control of your scenario, the SDK allows you to disable individual enhancements like dereverberation, noise suppression, automatic gain control, and acoustic echo cancellation. For example, if your scenario does not include rendering output audio that needs to be suppressed from the input audio, you have the option to disable acoustic echo cancellation.
+* **Selection of enhancements** - To allow for full control of your scenario, the SDK allows you to disable individual enhancements like dereverberation, noise suppression, automatic gain control, and acoustic echo cancellation. For example, if your scenario doesn't include rendering output audio that needs to be suppressed from the input audio, you have the option to disable acoustic echo cancellation.
 * **Custom microphone geometries** - The SDK allows you to provide your own custom microphone geometry information, in addition to supporting preset geometries like linear two-mic, linear four-mic, and circular 7-mic arrays (see more information on supported preset geometries at [Microphone array recommendations](speech-sdk-microphone.md#microphone-geometry)).
 * **Beamforming angles** - Specific beamforming angles can be provided to optimize audio input originating from a predetermined location, relative to the microphones.
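
As illustration of the feature selection this list describes, here's a minimal C# sketch using the Speech SDK's `AudioProcessingOptions` API. The key and region are placeholders, and the flag and type names should be verified against your SDK version:

```csharp
// Minimal sketch: apply the default Microsoft Audio Stack enhancements
// to audio from the default microphone. Key and region are placeholders.
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

// Enable the default enhancement set (beamforming, dereverberation,
// noise suppression, AGC, and echo cancellation where applicable).
var audioProcessingOptions = AudioProcessingOptions.Create(
    AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT);

var audioInput = AudioConfig.FromDefaultMicrophoneInput(audioProcessingOptions);
var recognizer = new SpeechRecognizer(speechConfig, audioInput);
```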
articles/ai-services/speech-service/audio-processing-speech-sdk.md (+7 -6)
@@ -6,10 +6,9 @@ author: hasyashah
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 09/16/2022
+ms.date: 1/18/2024
 ms.author: hasshah
 ms.devlang: cpp
-# ms.devlang: cpp, csharp, java
 ms.custom: devx-track-csharp, ignite-fall-2021
 ---
@@ -104,7 +103,7 @@ SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioInput);
 ## Custom microphone geometry

 This sample shows how to use MAS with a custom microphone geometry on a specified audio input device. In this example:
-* **Enhancement options** - The default enhancements will be applied on the input audio stream.
+* **Enhancement options** - The default enhancements are applied on the input audio stream.
 * **Custom geometry** - A custom microphone geometry for a 7-microphone array is provided via the microphone coordinates. The units for coordinates are millimeters.
 * **Audio input** - The audio input is from a file, where the audio within the file is expected from an audio input device corresponding to the custom geometry specified.
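
The article's code sample isn't reproduced in this diff. As a hedged sketch of the custom-geometry setup described above, assuming the SDK's `MicrophoneArrayGeometry` and `MicrophoneCoordinates` types and reusing `speechConfig` from the earlier snippet (the coordinate values and file name are illustrative, not from the article):

```csharp
// Sketch: 7-microphone planar geometry; coordinates in millimeters (illustrative values).
var coordinates = new MicrophoneCoordinates[]
{
    new MicrophoneCoordinates(0, 0, 0),
    new MicrophoneCoordinates(40, 0, 0),
    new MicrophoneCoordinates(20, -35, 0),
    new MicrophoneCoordinates(-20, -35, 0),
    new MicrophoneCoordinates(-40, 0, 0),
    new MicrophoneCoordinates(-20, 35, 0),
    new MicrophoneCoordinates(20, 35, 0)
};
var geometry = new MicrophoneArrayGeometry(MicrophoneArrayType.Planar, coordinates);

// Default enhancements, custom geometry, echo reference in the last channel.
var audioProcessingOptions = AudioProcessingOptions.Create(
    AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT,
    geometry,
    SpeakerReferenceChannel.LastChannel);

// Audio comes from a file recorded with the array the geometry describes.
var audioInput = AudioConfig.FromWavFileInput("input.wav", audioProcessingOptions);
var recognizer = new SpeechRecognizer(speechConfig, audioInput);
```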
@@ -172,7 +171,7 @@ SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioInput);
 This sample shows how to use MAS with a custom set of enhancements on the input audio. By default, all enhancements are enabled but there are options to disable dereverberation, noise suppression, automatic gain control, and echo cancellation individually by using `AudioProcessingOptions`.

 In this example:
-* **Enhancement options** - Echo cancellation and noise suppression will be disabled, while all other enhancements remain enabled.
+* **Enhancement options** - Echo cancellation and noise suppression are disabled, while all other enhancements remain enabled.
 * **Audio input device** - The audio input device is the default microphone of the device.

 ### [C#](#tab/csharp)
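
The tab's sample code isn't included in the diff hunks; a minimal sketch of the enhancement selection it describes, assuming the SDK's `AudioProcessingConstants` flags, might look like this:

```csharp
// Sketch: disable echo cancellation and noise suppression, keep the other defaults.
var audioProcessingOptions = AudioProcessingOptions.Create(
    AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT |
    AudioProcessingConstants.AUDIO_INPUT_PROCESSING_DISABLE_ECHO_CANCELLATION |
    AudioProcessingConstants.AUDIO_INPUT_PROCESSING_DISABLE_NOISE_SUPPRESSION,
    PresetMicrophoneArrayGeometry.Mono,
    SpeakerReferenceChannel.None);

// Default microphone of the device, per the bullet above.
var audioInput = AudioConfig.FromDefaultMicrophoneInput(audioProcessingOptions);
var recognizer = new SpeechRecognizer(speechConfig, audioInput);
```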
@@ -212,11 +211,13 @@ SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioInput);
 ## Specify beamforming angles

 This sample shows how to use MAS with a custom microphone geometry and beamforming angles on a specified audio input device. In this example:
-* **Enhancement options** - The default enhancements will be applied on the input audio stream.
+* **Enhancement options** - The default enhancements are applied on the input audio stream.
 * **Custom geometry** - A custom microphone geometry for a 4-microphone array is provided by specifying the microphone coordinates. The units for coordinates are millimeters.
-* **Beamforming angles** - Beamforming angles are specified to optimize for audio originating in that range. The units for angles are degrees. In the sample code below, the start angle is set to 70 degrees and the end angle is set to 110 degrees.
+* **Beamforming angles** - Beamforming angles are specified to optimize for audio originating in that range. The units for angles are degrees.
 * **Audio input** - The audio input is from a push stream, where the audio within the stream is expected from an audio input device corresponding to the custom geometry specified.

+In the following code example, the start angle is set to 70 degrees and the end angle is set to 110 degrees.
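
The referenced code example isn't reproduced in the diff; a sketch consistent with the text, using the 70- and 110-degree angles and illustrative 4-microphone coordinates (the linear array type and constructor overload are assumptions to verify against the SDK):

```csharp
// Sketch: 4-microphone linear geometry (coordinates in millimeters, illustrative)
// with beamforming optimized for the 70-110 degree range.
var coordinates = new MicrophoneCoordinates[]
{
    new MicrophoneCoordinates(-60, 0, 0),
    new MicrophoneCoordinates(-20, 0, 0),
    new MicrophoneCoordinates(20, 0, 0),
    new MicrophoneCoordinates(60, 0, 0)
};
var geometry = new MicrophoneArrayGeometry(MicrophoneArrayType.Linear, 70, 110, coordinates);

var audioProcessingOptions = AudioProcessingOptions.Create(
    AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT,
    geometry,
    SpeakerReferenceChannel.LastChannel);

// Audio arrives via a push stream, per the bullet above.
var pushStream = AudioInputStream.CreatePushStream();
var audioInput = AudioConfig.FromStreamInput(pushStream, audioProcessingOptions);
var recognizer = new SpeechRecognizer(speechConfig, audioInput);
```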
articles/ai-services/speech-service/batch-synthesis-properties.md (+8 -8)
@@ -6,7 +6,7 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 11/16/2022
+ms.date: 1/18/2024
 ms.author: eur
 ---
@@ -31,7 +31,7 @@ Batch synthesis properties are described in the following table.
 |`description`|The description of the batch synthesis.<br/><br/>This property is optional.|
 |`displayName`|The name of the batch synthesis. Choose a name that you can refer to later. The display name doesn't have to be unique.<br/><br/>This property is required.|
 |`id`|The batch synthesis job ID.<br/><br/>This property is read-only.|
-|`inputs`|The plain text or SSML to be synthesized.<br/><br/>When the `textType` is set to `"PlainText"`, provide plain text as shown here: `"inputs": [{"text": "The rainbow has seven colors."}]`. When the `textType` is set to `"SSML"`, provide text in the [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md) as shown here: `"inputs": [{"text": "<speak version='\''1.0'\'' xml:lang='\''en-US'\''><voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-JennyNeural'\''>The rainbow has seven colors.</voice></speak>"}]`.<br/><br/>Include up to 1,000 text objects if you want multiple audio output files. Here's example input text that should be synthesized to two audio output files: `"inputs": [{"text": "synthesize this to a file"},{"text": "synthesize this to another file"}]`. However, if the `properties.concatenateResult` property is set to `true`, then each synthesized result will be written to the same audio output file.<br/><br/>You don't need separate text inputs for new paragraphs. Within any of the (up to 1,000) text inputs, you can specify new paragraphs using the "\r\n" (newline) string. Here's example input text with two paragraphs that should be synthesized to the same audio output file: `"inputs": [{"text": "synthesize this to a file\r\nsynthesize this to another paragraph in the same file"}]`<br/><br/>There are no paragraph limits, but keep in mind that the maximum JSON payload size (including all text inputs and other properties) that will be accepted is 500 kilobytes.<br/><br/>This property is required when you create a new batch synthesis job. This property isn't included in the response when you get the synthesis job.|
+|`inputs`|The plain text or SSML to be synthesized.<br/><br/>When the `textType` is set to `"PlainText"`, provide plain text as shown here: `"inputs": [{"text": "The rainbow has seven colors."}]`. When the `textType` is set to `"SSML"`, provide text in the [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md) as shown here: `"inputs": [{"text": "<speak version='\''1.0'\'' xml:lang='\''en-US'\''><voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-JennyNeural'\''>The rainbow has seven colors.</voice></speak>"}]`.<br/><br/>Include up to 1,000 text objects if you want multiple audio output files. Here's example input text that should be synthesized to two audio output files: `"inputs": [{"text": "synthesize this to a file"},{"text": "synthesize this to another file"}]`. However, if the `properties.concatenateResult` property is set to `true`, then each synthesized result is written to the same audio output file.<br/><br/>You don't need separate text inputs for new paragraphs. Within any of the (up to 1,000) text inputs, you can specify new paragraphs using the "\r\n" (newline) string. Here's example input text with two paragraphs that should be synthesized to the same audio output file: `"inputs": [{"text": "synthesize this to a file\r\nsynthesize this to another paragraph in the same file"}]`<br/><br/>There are no paragraph limits, but the maximum JSON payload size (including all text inputs and other properties) is 500 kilobytes.<br/><br/>This property is required when you create a new batch synthesis job. This property isn't included in the response when you get the synthesis job.|
 |`lastActionDateTime`|The most recent date and time when the `status` property value changed.<br/><br/>This property is read-only.|
 |`outputs.result`|The location of the batch synthesis result files with audio output and logs.<br/><br/>This property is read-only.|
 |`properties`|A defined set of optional batch synthesis configuration settings.|
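
To make the `inputs` shapes from the row above concrete, here's a minimal request-body sketch assembled from the table's own examples (a sketch only; other properties a real request may need, such as the voice configuration, are omitted here):

```json
{
  "displayName": "my batch synthesis job",
  "textType": "PlainText",
  "inputs": [
    { "text": "synthesize this to a file" },
    { "text": "synthesize this to another file" },
    { "text": "synthesize this to a file\r\nsynthesize this to another paragraph in the same file" }
  ]
}
```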
@@ -44,10 +44,10 @@ Batch synthesis properties are described in the following table.
 |`properties.durationInTicks`|The audio output duration in ticks.<br/><br/>This property is read-only.|
 |`properties.failedAudioCount`|The count of batch synthesis inputs to audio output failed.<br/><br/>This property is read-only.|
 |`properties.outputFormat`|The audio output format.<br/><br/>For information about the accepted values, see [audio output formats](rest-text-to-speech.md#audio-outputs). The default output format is `riff-24khz-16bit-mono-pcm`.|
-|`properties.sentenceBoundaryEnabled`|Determines whether to generate sentence boundary data. This optional `bool` value ("true" or "false") is "false" by default.<br/><br/>If sentence boundary data is requested, then a corresponding `[nnnn].sentence.json` file will be included in the results data ZIP file.|
+|`properties.sentenceBoundaryEnabled`|Determines whether to generate sentence boundary data. This optional `bool` value ("true" or "false") is "false" by default.<br/><br/>If sentence boundary data is requested, then a corresponding `[nnnn].sentence.json` file is included in the results data ZIP file.|
 |`properties.succeededAudioCount`|The count of batch synthesis inputs to audio output succeeded.<br/><br/>This property is read-only.|
 |`properties.timeToLive`|A duration after the synthesis job is created, when the synthesis results will be automatically deleted. The value is an ISO 8601 encoded duration. For example, specify `PT12H` for 12 hours. This optional setting is `P31D` (31 days) by default. The maximum time to live is 31 days. The date and time of automatic deletion (for synthesis jobs with a status of "Succeeded" or "Failed") is equal to the `lastActionDateTime` + `timeToLive` properties.<br/><br/>Otherwise, you can call the [delete](./batch-synthesis.md#delete-batch-synthesis) synthesis method to remove the job sooner.|
-|`properties.wordBoundaryEnabled`|Determines whether to generate word boundary data. This optional `bool` value ("true" or "false") is "false" by default.<br/><br/>If word boundary data is requested, then a corresponding `[nnnn].word.json` file will be included in the results data ZIP file.|
+|`properties.wordBoundaryEnabled`|Determines whether to generate word boundary data. This optional `bool` value ("true" or "false") is "false" by default.<br/><br/>If word boundary data is requested, then a corresponding `[nnnn].word.json` file is included in the results data ZIP file.|
 |`status`|The batch synthesis processing status.<br/><br/>The status should progress from "NotStarted" to "Running", and finally to either "Succeeded" or "Failed".<br/><br/>This property is read-only.|
 |`synthesisConfig`|The configuration settings to use for batch synthesis of plain text.<br/><br/>This property is only applicable when `textType` is set to `"PlainText"`.|
 |`synthesisConfig.pitch`|The pitch of the audio output.<br/><br/>For information about the accepted values, see the [adjust prosody](speech-synthesis-markup-voice.md#adjust-prosody) table in the Speech Synthesis Markup Language (SSML) documentation. Invalid values are ignored.<br/><br/>This optional property is only applicable when `textType` is set to `"PlainText"`.|
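
Pulling the optional settings from this table together, a hedged create-request sketch (the values are the defaults or examples the table names; the `synthesisConfig.voice` field is an assumption about how the voice is specified for plain text):

```json
{
  "displayName": "my batch synthesis job",
  "textType": "PlainText",
  "inputs": [
    { "text": "The rainbow has seven colors." }
  ],
  "synthesisConfig": {
    "voice": "en-US-JennyNeural"
  },
  "properties": {
    "outputFormat": "riff-24khz-16bit-mono-pcm",
    "wordBoundaryEnabled": true,
    "sentenceBoundaryEnabled": false,
    "timeToLive": "PT12H",
    "concatenateResult": false
  }
}
```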
@@ -73,7 +73,7 @@ The latency for batch synthesis is as follows (approximately):

 ### Best practices

-When considering batch synthesis for your application, it's recommended to assess whether the latency meets your requirements. If the latency aligns with your desired performance, batch synthesis can be a suitable choice. However, if the latency does not meet your needs, you might consider using real-time API.
+When considering batch synthesis for your application, it's recommended to assess whether the latency meets your requirements. If the latency aligns with your desired performance, batch synthesis can be a suitable choice. However, if the latency doesn't meet your needs, you might consider using real-time API.

 ## HTTP status codes
@@ -100,7 +100,7 @@ Here are examples that can result in the 400 error:
 - The number of requested text inputs exceeded the limit of 1,000.
 - The `top` query parameter exceeded the limit of 100.
 - You tried to use an invalid deployment ID or a custom voice that isn't successfully deployed. Make sure the Speech resource has access to the custom voice, and the custom voice is successfully deployed. You must also ensure that the mapping of `{"your-custom-voice-name": "your-deployment-ID"}` is correct in your batch synthesis request.
-- You tried to delete a batch synthesis job that hasn't started or hasn't completed running. You can only delete batch synthesis jobs that have a status of "Succeeded" or "Failed".
+- You tried to delete a batch synthesis job that isn't started or hasn't completed running. You can only delete batch synthesis jobs that have a status of "Succeeded" or "Failed".
 - You tried to use a *F0* Speech resource, but the region only supports the *Standard* Speech resource pricing tier.
 - You tried to create a new batch synthesis job that would exceed the limit of 200 active jobs. Each Speech resource can have up to 200 batch synthesis jobs that don't have a status of "Succeeded" or "Failed".
@@ -132,9 +132,9 @@ Here's an example request that results in an HTTP 400 error, because the `top` q
 curl -v -X GET "https://YourSpeechRegion.customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis?skip=0&top=200" -H "Ocp-Apim-Subscription-Key: YourSpeechKey"
 ```

-In this case, the response headers will include `HTTP/1.1 400 Bad Request`.
+In this case, the response headers include `HTTP/1.1 400 Bad Request`.

-The response body will resemble the following JSON example:
+The response body resembles the following JSON example:
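
The JSON body itself is truncated in this diff. A hypothetical shape consistent with a 400 response to the request above (field names are illustrative, not from the source):

```json
{
  "code": "InvalidRequest",
  "message": "The top parameter should not exceed 100."
}
```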