
Commit cd3e122

Merge pull request #111192 from aahill/speech-update
[CogSvcs] Speech-to-text container update
2 parents 0488379 + af2d348 commit cd3e122

4 files changed: +181 −15 lines changed

articles/cognitive-services/Speech-Service/includes/speech-to-text-chart-config.md

Lines changed: 29 additions & 2 deletions
@@ -8,7 +8,7 @@ manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: include
-ms.date: 08/22/2019
+ms.date: 04/15/2020
ms.author: trbye
---

@@ -34,4 +34,31 @@ To override the "umbrella" chart, add the prefix `speechToText.` on any paramete
| `service.port`| The port of the **speech-to-text** service. | `80` |
| `service.annotations` | The **speech-to-text** annotations for the service metadata. Annotations are key value pairs. <br>`annotations:`<br>&nbsp;&nbsp;`some/annotation1: value1`<br>&nbsp;&nbsp;`some/annotation2: value2` | |
| `service.autoScaler.enabled` | Whether the [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) is enabled. If `true`, the `speech-to-text-autoscaler` will be deployed in the Kubernetes cluster. | `true` |
| `service.podDisruption.enabled` | Whether the [Pod Disruption Budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/) is enabled. If `true`, the `speech-to-text-poddisruptionbudget` will be deployed in the Kubernetes cluster. | `true` |
+
+#### Sentiment analysis (sub-chart: charts/speechToText)
+
+Starting with v2.2.0 of the speech-to-text container, the following parameters are used for sentiment analysis using the Text Analytics API.
+
+|Parameter|Description|Values|Default|
+| --- | --- | --- | --- |
+|`textanalytics.enabled`| Whether the **text-analytics** service is enabled| true/false| `false`|
+|`textanalytics.image.registry`| The **text-analytics** docker image registry| valid docker image registry| |
+|`textanalytics.image.repository`| The **text-analytics** docker image repository| valid docker image repository| |
+|`textanalytics.image.tag`| The **text-analytics** docker image tag| valid docker image tag| |
+|`textanalytics.image.pullSecrets`| The image secrets for pulling the **text-analytics** docker image| valid secrets name| |
+|`textanalytics.image.pullByHash`| Specifies whether the docker image is pulled by hash. If `true`, `image.hash` is also required. Otherwise, set it to `false`. The default is `false`.| true/false| `false`|
+|`textanalytics.image.hash`| The **text-analytics** docker image hash. Only use it with `image.pullByHash:true`.| valid docker image hash | |
+|`textanalytics.image.args.eula`| One of the arguments required by the **text-analytics** container; it indicates that you've accepted the license. The value of this option must be `accept`.| `accept`, if you want to use the container | |
+|`textanalytics.image.args.billing`| One of the arguments required by the **text-analytics** container; it specifies the billing endpoint URI. The billing endpoint URI value is available on the Azure portal's Speech overview page.| valid billing endpoint URI| |
+|`textanalytics.image.args.apikey`| One of the arguments required by the **text-analytics** container; it's used to track billing information.| valid apikey| |
+|`textanalytics.cpuRequest`| The requested CPU for the **text-analytics** container| int| `3000m`|
+|`textanalytics.cpuLimit`| The CPU limit for the **text-analytics** container| | `8000m`|
+|`textanalytics.memoryRequest`| The requested memory for the **text-analytics** container| | `3Gi`|
+|`textanalytics.memoryLimit`| The memory limit for the **text-analytics** container| | `8Gi`|
+|`textanalytics.service.sentimentURISuffix`| The sentiment analysis URI suffix. The full URI is in the format `http://<service>:<port>/<sentimentURISuffix>`. | | `text/analytics/v3.0-preview/sentiment`|
+|`textanalytics.service.type`| The type of the **text-analytics** service in Kubernetes. See [Kubernetes service types](https://kubernetes.io/docs/concepts/services-networking/service/).| valid Kubernetes service type | `LoadBalancer` |
+|`textanalytics.service.port`| The port of the **text-analytics** service| int| `50085`|
+|`textanalytics.service.annotations`| The annotations users can add to the **text-analytics** service metadata. For instance:<br>`annotations:`<br>&nbsp;&nbsp;`some/annotation1: value1`<br>&nbsp;&nbsp;`some/annotation2: value2`| annotations, one per line| |
+|`textanalytics.service.autoScaler.enabled`| Whether the [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) is enabled. If enabled, `text-analytics-autoscaler` will be deployed in the Kubernetes cluster.| true/false| `true`|
+|`textanalytics.service.podDisruption.enabled`| Whether the [Pod Disruption Budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/) is enabled. If enabled, `text-analytics-poddisruptionbudget` will be deployed in the Kubernetes cluster.| true/false| `true`|
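For example, a minimal umbrella-chart values file that enables these sentiment analysis parameters might be generated as in the following sketch. The registry, repository, and tag values are placeholders, not real image details, and PyYAML is assumed to be installed.

```python
# Sketch: write an umbrella-chart values file that turns on sentiment analysis in the
# speech-to-text sub-chart. Replace the placeholder image details and billing values.
import yaml  # PyYAML

values = {
    "speechToText": {  # prefix used to override the "umbrella" chart
        "textanalytics": {
            "enabled": True,
            "image": {
                "registry": "<registry>",
                "repository": "<text-analytics-repository>",
                "tag": "<tag>",
                "args": {
                    "eula": "accept",
                    "billing": "{ENDPOINT_URI}",
                    "apikey": "{API_KEY}",
                },
            },
        }
    }
}

with open("sentiment-values.yaml", "w") as f:
    yaml.safe_dump(values, f)

# Deploy with, for example: helm install <release> <chart> -f sentiment-values.yaml
```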

articles/cognitive-services/Speech-Service/includes/speech-to-text-container-query-endpoint.md

Lines changed: 3 additions & 9 deletions
@@ -4,7 +4,7 @@ manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: include
-ms.date: 04/01/2020
+ms.date: 04/29/2020
ms.author: aahi
---

@@ -31,6 +31,7 @@ to this call using the container [host](https://docs.microsoft.com/dotnet/api/mi
var config = SpeechConfig.FromHost(
    new Uri("ws://localhost:5000"));
```
+
# [Python](#tab/python)

Change from using this Azure-cloud initialization call:
@@ -40,11 +41,4 @@ speech_config = speechsdk.SpeechConfig(
    subscription=speech_key, region=service_region)
```

-to this call using the container [host](https://docs.microsoft.com/python/api/azure-cognitiveservices-speech/azure.cognitiveservices.speech.speechconfig?view=azure-python):
-
-```python
-speech_config = speechsdk.SpeechConfig(
-    host="ws://localhost:5000")
-```
-
-***
+---

articles/cognitive-services/Speech-Service/speech-container-howto-on-premises.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: conceptual
-ms.date: 04/01/2020
+ms.date: 04/29/2020
ms.author: aahi
---

articles/cognitive-services/Speech-Service/speech-container-howto.md

Lines changed: 148 additions & 3 deletions
@@ -8,7 +8,7 @@ manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: conceptual
-ms.date: 04/01/2020
+ms.date: 04/29/2020
ms.author: aahi
---

@@ -23,7 +23,7 @@ Speech containers enable customers to build a speech application architecture th

| Function | Features | Latest |
|--|--|--|
-| Speech-to-text | Transcribes continuous real-time speech or batch audio recordings into text with intermediate results. | 2.1.1 |
+| Speech-to-text | Transcribes continuous real-time speech or batch audio recordings into text with intermediate results, and analyzes sentiment. | 2.2.0 |
| Custom Speech-to-text | Using a custom model from the [Custom Speech portal](https://speech.microsoft.com/customspeech), transcribes continuous real-time speech or batch audio recordings into text with intermediate results. | 2.1.1 |
| Text-to-speech | Converts text to natural-sounding speech with plain text input or Speech Synthesis Markup Language (SSML). | 1.3.0 |
| Custom Text-to-speech | Using a custom model from the [Custom Voice portal](https://aka.ms/custom-voice-portal), converts text to natural-sounding speech with plain text input or Speech Synthesis Markup Language (SSML). | 1.3.0 |
@@ -159,7 +159,7 @@ All tags, except for `latest` are in the following format and are case-sensitive
The following tag is an example of the format:

```
-2.1.1-amd64-en-us-preview
+2.2.0-amd64-en-us-preview
```

For all of the supported locales of the **speech-to-text** container, please see [Speech-to-text image tags](../containers/container-image-tags.md#speech-to-text).
@@ -254,6 +254,33 @@ This command:
* Exposes TCP port 5000 and allocates a pseudo-TTY for the container.
* Automatically removes the container after it exits. The container image is still available on the host computer.

+
+#### Analyze sentiment on the speech-to-text output
+
+Starting in v2.2.0 of the speech-to-text container, you can call the [sentiment analysis v3 API](../text-analytics/how-tos/text-analytics-how-to-sentiment-analysis.md) on the output. To call sentiment analysis, you'll need a Text Analytics API resource endpoint. For example:
+* `https://westus2.api.cognitive.microsoft.com/text/analytics/v3.0-preview.1/sentiment`
+* `https://localhost:5000/text/analytics/v3.0-preview.1/sentiment`
+
+If you're accessing a Text Analytics endpoint in the cloud, you'll need a key. If you're running Text Analytics locally, you may not need to provide one.
+
+The key and endpoint are passed to the Speech container as arguments, as in the following example.
+
+```bash
+docker run -it --rm -p 5000:5000 \
+containerpreview.azurecr.io/microsoft/cognitive-services-speech-to-text:latest \
+Eula=accept \
+Billing={ENDPOINT_URI} \
+ApiKey={API_KEY} \
+CloudAI:SentimentAnalysisSettings:TextAnalyticsHost={TEXT_ANALYTICS_HOST} \
+CloudAI:SentimentAnalysisSettings:SentimentAnalysisApiKey={SENTIMENT_APIKEY}
+```
+
+This command:
+
+* Performs the same steps as the command above.
+* Stores a Text Analytics API endpoint and key for sending sentiment analysis requests.
+
# [Custom Speech-to-text](#tab/cstt)

The *Custom Speech-to-text* container relies on a custom speech model. The custom model has to have been [trained](how-to-custom-speech-train-model.md) using the [custom speech portal](https://speech.microsoft.com/customspeech).
@@ -375,6 +402,9 @@ This command:

## Query the container's prediction endpoint

+> [!NOTE]
+> Use a unique port number if you're running multiple containers.
+
| Containers | SDK Host URL | Protocol |
|--|--|--|
| Speech-to-text and Custom Speech-to-text | `ws://localhost:5000` | WS |
@@ -384,6 +414,121 @@ For more information on using WSS and HTTPS protocols, see [container security](

[!INCLUDE [Query Speech-to-text container endpoint](includes/speech-to-text-container-query-endpoint.md)]

+#### Analyze sentiment
+
+If you provided your Text Analytics API credentials [to the container](#analyze-sentiment-on-the-speech-to-text-output), you can use the Speech SDK to send speech recognition requests with sentiment analysis. You can configure the API responses to use either a *simple* or *detailed* format. The snippets in the tabs below assume a `speech_config` that points to your container.
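A minimal configuration for a locally running container might look like this, assuming the default port mapping from the `docker run` examples above:

```python
import azure.cognitiveservices.speech as speechsdk

# Point the Speech SDK at the local speech-to-text container instead of the Azure cloud.
speech_config = speechsdk.SpeechConfig(host="ws://localhost:5000")
```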
+
+# [Simple format](#tab/simple-format)
+
+To configure the Speech client to use a simple format, add `"Sentiment"` as a value for `Simple.Extensions`. If you want to choose a specific Text Analytics model version, replace `'latest'` in the `speechcontext-phraseDetection.sentimentAnalysis.modelversion` property with the version you want.
+
+```python
+speech_config.set_service_property(
+    name='speechcontext-PhraseOutput.Simple.Extensions',
+    value='["Sentiment"]',
+    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
+)
+speech_config.set_service_property(
+    name='speechcontext-phraseDetection.sentimentAnalysis.modelversion',
+    value='latest',
+    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
+)
+```
+
+`Simple.Extensions` returns the sentiment result in the root layer of the response.
+
+```json
+{
+  "DisplayText":"What's the weather like?",
+  "Duration":13000000,
+  "Id":"6098574b79434bd4849fee7e0a50f22e",
+  "Offset":4700000,
+  "RecognitionStatus":"Success",
+  "Sentiment":{
+    "Negative":0.03,
+    "Neutral":0.79,
+    "Positive":0.18
+  }
+}
+```
+
+# [Detailed format](#tab/detailed-format)
+
+To configure the Speech client to use a detailed format, add `"Sentiment"` as a value for `Detailed.Extensions`, `Detailed.Options`, or both. If you want to choose a specific Text Analytics model version, replace `'latest'` in the `speechcontext-phraseDetection.sentimentAnalysis.modelversion` property with the version you want.
+
+```python
+speech_config.set_service_property(
+    name='speechcontext-PhraseOutput.Detailed.Options',
+    value='["Sentiment"]',
+    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
+)
+speech_config.set_service_property(
+    name='speechcontext-PhraseOutput.Detailed.Extensions',
+    value='["Sentiment"]',
+    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
+)
+speech_config.set_service_property(
+    name='speechcontext-phraseDetection.sentimentAnalysis.modelversion',
+    value='latest',
+    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
+)
+```
+
+`Detailed.Extensions` provides the sentiment result in the root layer of the response. `Detailed.Options` provides the result in the `NBest` layer of the response. They can be used separately or together.
+
+```json
+{
+  "DisplayText":"What's the weather like?",
+  "Duration":13000000,
+  "Id":"6a2aac009b9743d8a47794f3e81f7963",
+  "NBest":[
+    {
+      "Confidence":0.973695,
+      "Display":"What's the weather like?",
+      "ITN":"what's the weather like",
+      "Lexical":"what's the weather like",
+      "MaskedITN":"What's the weather like",
+      "Sentiment":{
+        "Negative":0.03,
+        "Neutral":0.79,
+        "Positive":0.18
+      }
+    },
+    {
+      "Confidence":0.9164971,
+      "Display":"What is the weather like?",
+      "ITN":"what is the weather like",
+      "Lexical":"what is the weather like",
+      "MaskedITN":"What is the weather like",
+      "Sentiment":{
+        "Negative":0.02,
+        "Neutral":0.88,
+        "Positive":0.1
+      }
+    }
+  ],
+  "Offset":4700000,
+  "RecognitionStatus":"Success",
+  "Sentiment":{
+    "Negative":0.03,
+    "Neutral":0.79,
+    "Positive":0.18
+  }
+}
+```
+
+---
+
+If you want to completely disable sentiment analysis, set `sentimentanalysis.enabled` to `false`.
+
+```python
+speech_config.set_service_property(
+    name='speechcontext-phraseDetection.sentimentanalysis.enabled',
+    value='false',
+    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
+)
+```
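As a minimal end-to-end sketch, the following combines the container host configuration, the simple-format sentiment property, and a one-shot recognition, then reads the `Sentiment` scores from the raw JSON result. The audio file name is a placeholder.

```python
import json
import azure.cognitiveservices.speech as speechsdk

# Configure the SDK to call the local speech-to-text container.
speech_config = speechsdk.SpeechConfig(host="ws://localhost:5000")

# Ask for sentiment in the simple output format (see the Simple format tab above).
speech_config.set_service_property(
    name='speechcontext-PhraseOutput.Simple.Extensions',
    value='["Sentiment"]',
    channel=speechsdk.ServicePropertyChannel.UriQueryParameter
)

# Recognize a single utterance from an audio file (placeholder file name).
audio_config = speechsdk.AudioConfig(filename="whats_the_weather_like.wav")
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
result = recognizer.recognize_once()

# When sentiment analysis is enabled, the scores are returned in the raw JSON payload.
raw_json = result.properties.get_property(speechsdk.PropertyId.SpeechServiceResponse_JsonResult)
print(result.text)
print(json.loads(raw_json).get("Sentiment"))
```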
+
### Text-to-speech or Custom Text-to-speech

[!INCLUDE [Query Text-to-speech container endpoint](includes/text-to-speech-container-query-endpoint.md)]
