
Commit e8affbf

Merge pull request #241892 from eric-urban/eur/prep-azure-ai-speech
Speech in Azure AI prep
2 parents 4fea7b7 + e5f93a0 commit e8affbf

54 files changed: +105 additions, -107 deletions


articles/cognitive-services/Speech-Service/breadcrumb/toc.yml

Lines changed: 1 addition & 1 deletion

@@ -17,6 +17,6 @@
   tocHref: /legal/cognitive-services/speech-service # Destination doc set route
   topicHref: /azure/cognitive-services/index # Original doc set route
   items:
-  - name: Speech Service # Destination doc set name
+  - name: Speech service # Destination doc set name
     tocHref: /legal/cognitive-services/speech-service # Destination doc set route
     topicHref: /azure/cognitive-services/speech-service/index # Original doc set route

articles/cognitive-services/Speech-Service/call-center-telephony-integration.md

Lines changed: 3 additions & 3 deletions

@@ -16,19 +16,19 @@ ms.custom: template-concept
 
 To support real-time scenarios, like Virtual Agent and Agent Assist in Call Centers, an integration with the Call Centers telephony system is required.
 
-Typically, the integration with Microsoft Speech Services is handled by a telephony client connected to the customers SIP/RTP processor, for example, to a Session Border Controller (SBC).
+Typically, integration with the Speech service is handled by a telephony client connected to the customers SIP/RTP processor, for example, to a Session Border Controller (SBC).
 
 Usually the telephony client handles the incoming audio stream from the SIP/RTP processor, the conversion to PCM and connects the streams using continuous recognition. It also triages the processing of the results, for example, analysis of speech transcripts for Agent Assist or connect with a dialog processing engine (for example, Azure Botframework or Power Virtual Agent) for Virtual Agent.
 
-For easier integration the Speech Service also supports “ALAW in WAV container” and “MULAW in WAV container” for audio streaming.
+For easier integration the Speech service also supports “ALAW in WAV container” and “MULAW in WAV container” for audio streaming.
 
 To build this integration we recommend using the [Speech SDK](./speech-sdk.md).
 
 
 > [!TIP]
 > For guidance on reducing Text to speech latency check out the **[How to lower speech synthesis latency](./how-to-lower-speech-synthesis-latency.md?pivots=programming-language-csharp)** guide.
 >
-> In addition, consider implementing a Text to speech cache to store all synthesized audio and playback from the cache in case a string has previously been synthesized.
+> In addition, consider implementing a text to speech cache to store all synthesized audio and playback from the cache in case a string has previously been synthesized.
 
 ## Next steps
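The hunk above describes the telephony-client pattern: take the SIP/RTP audio, convert it to PCM, and feed it into continuous recognition. A minimal C# sketch of that flow using the Speech SDK's push-stream input follows; the key, the region, and the way PCM chunks arrive from the SBC are hypothetical placeholders.

```csharp
using System;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Hypothetical placeholders: use your own Speech resource key and region.
var speechConfig = SpeechConfig.FromSubscription("YourSpeechKey", "YourRegion");

// Push stream that the telephony client fills with 16 kHz, 16-bit, mono PCM
// after converting the incoming SIP/RTP audio.
using var pushStream = AudioInputStream.CreatePushStream(
    AudioStreamFormat.GetWaveFormatPCM(16000, 16, 1));
using var audioConfig = AudioConfig.FromStreamInput(pushStream);
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

recognizer.Recognized += (s, e) =>
{
    // Triage results here: analyze transcripts for Agent Assist,
    // or hand the text to a dialog engine for a Virtual Agent.
    Console.WriteLine($"Transcript: {e.Result.Text}");
};

await recognizer.StartContinuousRecognitionAsync();
// pushStream.Write(pcmChunk); // call as each PCM chunk arrives from the SBC
```

The text to speech cache suggested in the TIP can be as simple as a dictionary keyed by the input string, along the lines of this sketch (eviction and persistence are left out; it assumes `System.Collections.Generic` and `System.Threading.Tasks` are also imported):

```csharp
var ttsCache = new Dictionary<string, byte[]>();

async Task<byte[]> SynthesizeWithCacheAsync(SpeechSynthesizer synthesizer, string text)
{
    if (ttsCache.TryGetValue(text, out var cached))
    {
        return cached; // previously synthesized: play back without a service call
    }
    using var result = await synthesizer.SpeakTextAsync(text);
    var audio = result.AudioData;
    ttsCache[text] = audio;
    return audio;
}
```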

articles/cognitive-services/Speech-Service/custom-commands-encryption-of-data-at-rest.md

Lines changed: 2 additions & 2 deletions

@@ -35,9 +35,9 @@ By default, your subscription uses Microsoft-managed encryption keys. However, y
 
 
 > [!IMPORTANT]
-> Customer-managed keys are only available resources created after 27 June, 2020. To use CMK with Speech Services, you will need to create a new Speech resource. Once the resource is created, you can use Azure Key Vault to set up your managed identity.
+> Customer-managed keys are only available resources created after 27 June, 2020. To use CMK with the Speech service, you will need to create a new Speech resource. Once the resource is created, you can use Azure Key Vault to set up your managed identity.
 
-To request the ability to use customer-managed keys, fill out and submit Customer-Managed Key Request Form. It will take approximately 3-5 business days to hear back on the status of your request. Depending on demand, you may be placed in a queue and approved as space becomes available. Once approved for using CMK with Speech Services, you'll need to create a new Speech resource from the Azure portal.
+To request the ability to use customer-managed keys, fill out and submit Customer-Managed Key Request Form. It will take approximately 3-5 business days to hear back on the status of your request. Depending on demand, you may be placed in a queue and approved as space becomes available. Once approved for using CMK with the Speech service, you'll need to create a new Speech resource from the Azure portal.
 > [!NOTE]
 > **Customer-managed keys (CMK) are supported only for Custom Commands.**
 >

articles/cognitive-services/Speech-Service/custom-commands.md

Lines changed: 1 addition & 1 deletion

@@ -31,7 +31,7 @@ Good candidates for Custom Commands have a fixed vocabulary with well-defined se
 
 ## Getting started with Custom Commands
 
-Our goal with Custom Commands is to reduce your cognitive load to learn all the different technologies and focus building your voice commanding app. First step for using Custom Commands to <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" target="_blank">create an Azure Speech resource </a>. You can author your Custom Commands app on the Speech Studio and publish it, after which an on-device application can communicate with it using the Speech SDK.
+Our goal with Custom Commands is to reduce your cognitive load to learn all the different technologies and focus building your voice commanding app. First step for using Custom Commands to <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" target="_blank">create a Speech resource</a>. You can author your Custom Commands app on the Speech Studio and publish it, after which an on-device application can communicate with it using the Speech SDK.
 
 #### Authoring flow for Custom Commands
 ![Authoring flow for Custom Commands](media/voice-assistants/custom-commands-flow.png "The Custom Commands authoring flow")

articles/cognitive-services/Speech-Service/custom-speech-overview.md

Lines changed: 2 additions & 2 deletions

@@ -1,7 +1,7 @@
 ---
 title: Custom Speech overview - Speech service
 titleSuffix: Azure Cognitive Services
-description: Custom Speech is a set of online tools that allows you to evaluate and improve the Microsoft speech to text accuracy for your applications, tools, and products.
+description: Custom Speech is a set of online tools that allows you to evaluate and improve the speech to text accuracy for your applications, tools, and products.
 services: cognitive-services
 author: eric-urban
 manager: nitinme
@@ -30,7 +30,7 @@ With Custom Speech, you can upload your own data, test and train a custom model,
 Here's more information about the sequence of steps shown in the previous diagram:
 
 1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices" title="Create a Speech resource" target="_blank">Speech resource</a> that you create in the Azure portal. If you will train a custom model with audio data, choose a Speech resource region with dedicated hardware for training audio data. See footnotes in the [regions](regions.md#speech-service) table for more information.
-1. [Upload test data](./how-to-custom-speech-upload-data.md). Upload test data to evaluate the Microsoft speech to text offering for your applications, tools, and products.
+1. [Upload test data](./how-to-custom-speech-upload-data.md). Upload test data to evaluate the speech to text offering for your applications, tools, and products.
 1. [Test recognition quality](how-to-custom-speech-inspect-data.md). Use the [Speech Studio](https://aka.ms/speechstudio/customspeech) to play back uploaded audio and inspect the speech recognition quality of your test data.
 1. [Test model quantitatively](how-to-custom-speech-evaluate-data.md). Evaluate and improve the accuracy of the speech to text model. The Speech service provides a quantitative word error rate (WER), which you can use to determine if additional training is required.
 1. [Train a model](how-to-custom-speech-train-model.md). Provide written transcripts and related text, along with the corresponding audio data. Testing a model before and after training is optional but recommended.
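For reference, the word error rate (WER) mentioned in the second hunk is conventionally computed from the edits needed to turn the recognized transcript into the reference transcript:

$$\text{WER} = \frac{S + D + I}{N}$$

where $S$ is the number of substituted words, $D$ the deletions, $I$ the insertions, and $N$ the number of words in the reference. For example, a 10-word reference recognized with one substitution and one insertion gives WER = (1 + 0 + 1) / 10 = 20%.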

articles/cognitive-services/Speech-Service/devices-sdk-release-notes.md

Lines changed: 1 addition & 1 deletion

@@ -119,4 +119,4 @@ The following sections list changes in the most recent releases.
 
 ## Speech Devices SDK 0.2.12733: 2018-May release
 
-The first public preview release of the Cognitive Services Speech Devices SDK.
+The first public preview release of the Speech Devices SDK.

articles/cognitive-services/Speech-Service/embedded-speech.md

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@ zone_pivot_groups: programming-languages-set-thirteen
 Embedded Speech is designed for on-device [speech to text](speech-to-text.md) and [text to speech](text-to-speech.md) scenarios where cloud connectivity is intermittent or unavailable. For example, you can use embedded speech in industrial equipment, a voice enabled air conditioning unit, or a car that might travel out of range. You can also develop hybrid cloud and offline solutions. For scenarios where your devices must be in a secure environment like a bank or government entity, you should first consider [disconnected containers](../containers/disconnected-containers.md).
 
 > [!IMPORTANT]
-> Microsoft limits access to embedded speech. You can apply for access through the Azure Cognitive Services [embedded speech limited access review](https://aka.ms/csgate-embedded-speech). For more information, see [Limited access for embedded speech](/legal/cognitive-services/speech-service/embedded-speech/limited-access-embedded-speech?context=/azure/cognitive-services/speech-service/context/context).
+> Microsoft limits access to embedded speech. You can apply for access through the Azure Cognitive Services Speech [embedded speech limited access review](https://aka.ms/csgate-embedded-speech). For more information, see [Limited access for embedded speech](/legal/cognitive-services/speech-service/embedded-speech/limited-access-embedded-speech?context=/azure/cognitive-services/speech-service/context/context).
 
 ## Platform requirements

articles/cognitive-services/Speech-Service/how-to-configure-openssl-linux.md

Lines changed: 4 additions & 4 deletions

@@ -55,9 +55,9 @@ export SSL_CERT_FILE=/etc/pki/tls/certs/ca-bundle.crt
 
 ## Certificate revocation checks
 
-When the Speech SDK connects to the Speech Service, it checks the Transport Layer Security (TLS/SSL) certificate. The Speech SDK verifies that the certificate reported by the remote endpoint is trusted and hasn't been revoked. This verification provides a layer of protection against attacks involving spoofing and other related vectors. The check is accomplished by retrieving a certificate revocation list (CRL) from a certificate authority (CA) used by Azure. A list of Azure CA download locations for updated TLS CRLs can be found in [this document](../../security/fundamentals/tls-certificate-changes.md).
+When the Speech SDK connects to the Speech service, it checks the Transport Layer Security (TLS/SSL) certificate. The Speech SDK verifies that the certificate reported by the remote endpoint is trusted and hasn't been revoked. This verification provides a layer of protection against attacks involving spoofing and other related vectors. The check is accomplished by retrieving a certificate revocation list (CRL) from a certificate authority (CA) used by Azure. A list of Azure CA download locations for updated TLS CRLs can be found in [this document](../../security/fundamentals/tls-certificate-changes.md).
 
-If a destination posing as the Speech Service reports a certificate that's been revoked in a retrieved CRL, the SDK will terminate the connection and report an error via a `Canceled` event. The authenticity of a reported certificate can't be checked without an updated CRL. Therefore, the Speech SDK will also treat a failure to download a CRL from an Azure CA location as an error.
+If a destination posing as the Speech service reports a certificate that's been revoked in a retrieved CRL, the SDK will terminate the connection and report an error via a `Canceled` event. The authenticity of a reported certificate can't be checked without an updated CRL. Therefore, the Speech SDK will also treat a failure to download a CRL from an Azure CA location as an error.
 
 > [!WARNING]
 > If your solution uses proxy or firewall it should be configured to allow access to all certificate revocation list URLs used by Azure. Note that many of these URLs are outside of `microsoft.com` domain, so allowing access to `*.microsoft.com` is not enough. See [this document](../../security/fundamentals/tls-certificate-changes.md) for details. In exceptional cases you may ignore CRL failures (see [the correspondent section](#bypassing-or-ignoring-crl-failures)), but such configuration is strongly not recommended, especially for production scenarios.

@@ -66,7 +66,7 @@ If a destination posing as the Speech Service reports a certificate that's been
 
 One cause of CRL-related failures is the use of large CRL files. This class of error is typically only applicable to special environments with extended CA chains. Standard public endpoints shouldn't encounter this class of issue.
 
-The default maximum CRL size used by the Speech SDK (10 MB) can be adjusted per config object. The property key for this adjustment is `CONFIG_MAX_CRL_SIZE_KB` and the value, specified as a string, is by default "10000" (10 MB). For example, when creating a `SpeechRecognizer` object (that manages a connection to the Speech Service), you can set this property in its `SpeechConfig`. In the snippet below, the configuration is adjusted to permit a CRL file size up to 15 MB.
+The default maximum CRL size used by the Speech SDK (10 MB) can be adjusted per config object. The property key for this adjustment is `CONFIG_MAX_CRL_SIZE_KB` and the value, specified as a string, is by default "10000" (10 MB). For example, when creating a `SpeechRecognizer` object (that manages a connection to the Speech service), you can set this property in its `SpeechConfig`. In the snippet below, the configuration is adjusted to permit a CRL file size up to 15 MB.
 
 ::: zone pivot="programming-language-csharp"

@@ -158,7 +158,7 @@ speechConfig.properties.SetPropertyByString("OPENSSL_CONTINUE_ON_CRL_DOWNLOAD_FA
 
 ::: zone-end
 
-To turn off certificate revocation checks, set the property `"OPENSSL_DISABLE_CRL_CHECK"` to `"true"`. Then, while connecting to the Speech Service, there will be no attempt to check or download a CRL and no automatic verification of a reported TLS/SSL certificate.
+To turn off certificate revocation checks, set the property `"OPENSSL_DISABLE_CRL_CHECK"` to `"true"`. Then, while connecting to the Speech service, there will be no attempt to check or download a CRL and no automatic verification of a reported TLS/SSL certificate.
 
 ::: zone pivot="programming-language-csharp"
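The diff elides the per-language snippets the article references, so here's a minimal C# sketch of the two property settings discussed above; the key and region are placeholders, and `SpeechConfig.SetProperty` takes the property name and value as strings:

```csharp
using Microsoft.CognitiveServices.Speech;

// Placeholder key and region; use your own Speech resource values.
var speechConfig = SpeechConfig.FromSubscription("YourSpeechKey", "YourRegion");

// Permit CRL files up to 15 MB (the value is in kilobytes, as a string).
speechConfig.SetProperty("CONFIG_MAX_CRL_SIZE_KB", "15000");

// Strongly discouraged outside controlled environments:
// disable CRL checks entirely.
// speechConfig.SetProperty("OPENSSL_DISABLE_CRL_CHECK", "true");
```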

articles/cognitive-services/Speech-Service/how-to-control-connections.md

Lines changed: 8 additions & 8 deletions

@@ -1,7 +1,7 @@
 ---
 title: Service connectivity how-to - Speech SDK
 titleSuffix: Azure Cognitive Services
-description: Learn how to monitor for connection status and manually pre-connect or disconnect from the Speech Service.
+description: Learn how to monitor for connection status and manually connect or disconnect from the Speech service.
 services: cognitive-services
 author: trrwilson
 manager: nitinme

@@ -17,11 +17,11 @@ ms.custom: devx-track-csharp, devx-track-extended-java
 
 # How to monitor and control service connections with the Speech SDK
 
-`SpeechRecognizer` and other objects in the Speech SDK automatically connect to the Speech Service when it's appropriate. Sometimes, you may either want additional control over when connections begin and end or want more information about when the Speech SDK establishes or loses its connection. The supporting `Connection` class provides this capability.
+`SpeechRecognizer` and other objects in the Speech SDK automatically connect to the Speech service when it's appropriate. Sometimes, you may either want extra control over when connections begin and end or want more information about when the Speech SDK establishes or loses its connection. The supporting `Connection` class provides this capability.
 
 ## Retrieve a Connection object
 
-A `Connection` can be obtained from most top-level Speech SDK objects via a static `From...` factory method, e.g. `Connection::FromRecognizer(recognizer)` for `SpeechRecognizer`.
+A `Connection` can be obtained from most top-level Speech SDK objects via a static `From...` factory method, for example, `Connection::FromRecognizer(recognizer)` for `SpeechRecognizer`.
 
 ::: zone pivot="programming-language-csharp"

@@ -49,7 +49,7 @@ Connection connection = Connection.fromRecognizer(recognizer);
 
 ## Monitor for connections and disconnections
 
-A `Connection` raises `Connected` and `Disconnected` events when the corresponding status change happens in the Speech SDK's connection to the Speech Service. You can listen to these events to know the latest connection state.
+A `Connection` raises `Connected` and `Disconnected` events when the corresponding status change happens in the Speech SDK's connection to the Speech service. You can listen to these events to know the latest connection state.
 
 ::: zone pivot="programming-language-csharp"

@@ -96,17 +96,17 @@ connection.disconnected.addEventListener((s, connectionEventArgs) -> {
 
 ## Connect and disconnect
 
-`Connection` has explicit methods to start or end a connection to the Speech Service. Reasons you may want to use these include:
+`Connection` has explicit methods to start or end a connection to the Speech service. Reasons you may want to control the connection include:
 
-- "Pre-connecting" to the Speech Service to allow the first interaction to start as quickly as possible
+- Preconnecting to the Speech service to allow the first interaction to start as quickly as possible
 - Establishing connection at a specific time in your application's logic to gracefully and predictably handle initial connection failures
 - Disconnecting to clear an idle connection when you don't expect immediate reconnection but also don't want to destroy the object
 
 Some important notes on the behavior when manually modifying connection state:
 
 - Trying to connect when already connected will do nothing. It will not generate an error. Monitor the `Connected` and `Disconnected` events if you want to know the current state of the connection.
-- A failure to connect that originates from a problem that has no involvement with the Speech Service -- such as attempting to do so from an invalid state -- will throw or return an error as appropriate to the programming language. Failures that require network resolution -- such as authentication failures -- will not throw or return an error but instead generate a `Canceled` event on the top-level object the `Connection` was created from.
-- Manually disconnecting from the Speech Service during an ongoing interaction will result in a connection error and loss of data for that interaction. Connection errors are surfaced on the appropriate top-level object's `Canceled` event.
+- A failure to connect that originates from a problem that has no involvement with the Speech service--such as attempting to do so from an invalid state--will throw or return an error as appropriate to the programming language. Failures that require network resolution--such as authentication failures--will not throw or return an error but instead generate a `Canceled` event on the top-level object the `Connection` was created from.
+- Manually disconnecting from the Speech service during an ongoing interaction results in a connection error and loss of data for that interaction. Connection errors are surfaced on the appropriate top-level object's `Canceled` event.
 
 ::: zone pivot="programming-language-csharp"
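Pulling the pieces of this article together, a short C# sketch of monitoring and manually controlling a connection might look like the following; it assumes an already configured `recognizer` and the C# `Open`/`Close` methods on `Connection`:

```csharp
using System;
using Microsoft.CognitiveServices.Speech;

// Assumes 'recognizer' is an existing, configured SpeechRecognizer.
using var connection = Connection.FromRecognizer(recognizer);

connection.Connected += (s, e) =>
    Console.WriteLine("Connected to the Speech service.");
connection.Disconnected += (s, e) =>
    Console.WriteLine("Disconnected from the Speech service.");

// Preconnect so the first interaction starts as quickly as possible.
// Pass true if the connection will be used for continuous recognition.
connection.Open(false);

// ... run recognitions; errors surface on the recognizer's Canceled event ...

// Clear the idle connection without destroying the recognizer.
connection.Close();
```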
