Merge pull request #198851 from jboback/PII

GitHubber17 · web-flow · commit d5174beeb787 · 2022-05-24T09:50:09.000-07:00
[Cog Svcs] Pii
diff --git a/articles/cognitive-services/language-service/concepts/model-lifecycle.md b/articles/cognitive-services/language-service/concepts/model-lifecycle.md
@@ -55,6 +55,7 @@ Use the table below to find which model versions are supported by each feature:
 | Entity Linking                                      | `2021-06-01`                                                        | `2021-06-01`                       |                        |
 | Named Entity Recognition (NER)                      | `2021-06-01`                                                        | `2021-06-01`                       |                        |
 | Personally Identifiable Information (PII) detection | `2020-07-01`, `2021-01-15`                                          | `2021-01-15`                       |                        |
+| PII detection for conversations (Preview)           | `2022-05-15-preview`                                                |                                    | `2022-05-15-preview`   |
 | Question answering                                  | `2021-10-01`                                                        | `2021-10-01`                       |                        |
 | Text Analytics for health                           | `2021-05-15`, `2022-03-01`                                          | `2022-03-01`                       |                        |
 | Key phrase extraction                               | `2021-06-01`                                                        | `2021-06-01`                       |                        |
diff --git a/articles/cognitive-services/language-service/personally-identifiable-information/concepts/conversations-entity-categories.md b/articles/cognitive-services/language-service/personally-identifiable-information/concepts/conversations-entity-categories.md
@@ -191,7 +191,7 @@ This category contains the following entity:
 
         Any credit card number, any security code on the back, or the expiration date is considered as PII.
 
-        To get this entity category, add `CreditCardNumber` to the `pii-categories` parameter. `CreditCardNumber` will be returned in the API response if detected.
+        To get this entity category, add `CreditCard` to the `pii-categories` parameter. `CreditCard` will be returned in the API response if detected.
 
     :::column-end:::
     :::column span="2":::
@@ -202,27 +202,6 @@ This category contains the following entity:
    :::column-end:::
 :::row-end:::
 
-## Government and country/region-specific identification
+## Next steps
 
-### United States
-
-:::row:::
-    :::column span="":::
-        **Entity**
-
-        U.S. Social Security Number (SSN)
-
-    :::column-end:::
-    :::column span="2":::
-        **Details**
-
-        To get this entity category, add `USSocialSecurityNumber` to the `pii-categories` parameter. `USSocialSecurityNumber` will be returned in the API response if detected.
-      
-    :::column-end:::
-    :::column span="":::
-      **Supported document languages**
-
-      `en`
-      
-   :::column-end:::
-:::row-end:::
+[How to detect PII in conversations](../how-to-call-for-conversations.md)
diff --git a/articles/cognitive-services/language-service/personally-identifiable-information/how-to-call-for-conversations.md b/articles/cognitive-services/language-service/personally-identifiable-information/how-to-call-for-conversations.md
@@ -9,8 +9,8 @@ ms.service: cognitive-services
 ms.subservice: language-service
 ms.topic: how-to
 ms.date: 05/10/2022
-ms.author: bidishac
-ms.custom:
+ms.author: aahi
+ms.reviewer: bidishac
 ---
 
 
@@ -25,9 +25,13 @@ For transcripts, the API also enables redaction of audio segments, which contain
 
 By default, this feature will use the latest available AI model on your input. You can also configure your API requests to use a specific [model version](../concepts/model-lifecycle.md).
 
-### Input languages
+### Language support
 
-Currently the conversational PII preview API only supports English language and is available in the following three regions East US, North Europe and UK south.
+Currently, the conversational PII preview API supports only English.
+
+### Region support
+
+Currently, the conversational PII preview API is available in the following regions: East US, North Europe, and UK South.
 
 ## Submitting data
 
@@ -41,10 +45,243 @@ The API will attempt to detect all the [defined entity categories](concepts/conv
 
 For spoken transcripts, the entities detected will be returned on the `redactionSource` parameter value provided. Currently, the supported values for `redactionSource` are `text`, `lexical`, `itn`, and `maskedItn` (which maps to Microsoft Speech to Text API's `display`\\`displayText`, `lexical`, `itn` and `maskedItn` format respectively). Additionally, for the spoken transcript input, this API will also provide audio timing information to empower audio redaction. For using the audioRedaction feature, use the optional `includeAudioRedaction` flag with `true` value. The audio redaction is performed based on the lexical input format.
 
+
 ## Getting PII results
 
 When you get results from PII detection, you can stream the results to an application or save the output to a file on the local system. The API response will include [recognized entities](concepts/conversations-entity-categories.md), including their categories and subcategories, and confidence scores. The text string with the PII entities redacted will also be returned.
 
+## Examples
+
+# [Client libraries (Azure SDK)](#tab/client-libraries)
+
+1. Go to your resource overview page in the [Azure portal](https://portal.azure.com/#home).
+
+2. From the menu on the left side, select **Keys and Endpoint**. You will need one of the keys and the endpoint to authenticate your API requests.
+
+3. Download and install the client library package for your language of choice:
+    
+    |Language  |Package version  |
+    |---------|---------|
+    |.NET     | [5.2.0-beta.2](https://www.nuget.org/packages/Azure.AI.TextAnalytics/5.2.0-beta.2)        |
+    |Python     | [5.2.0b2](https://pypi.org/project/azure-ai-textanalytics/5.2.0b2/)         |
+    
+4. After you've installed the client library, use the following samples on GitHub to start calling the API.
+    
+    * [C#](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample9_RecognizeCustomEntities.md)
+    * [Java](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/textanalytics/azure-ai-textanalytics/src/samples/java/com/azure/ai/textanalytics/lro/RecognizeCustomEntities.java)
+    * [JavaScript](https://github.com/Azure/azure-sdk-for-js/blob/main/sdk/textanalytics/ai-text-analytics/samples/v5/javascript/customText.js)
+    * [Python](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/sample_recognize_custom_entities.py)
+    
+5. See the following reference documentation for more information on the client and the return object:
+    
+    * [C#](/dotnet/api/azure.ai.textanalytics?view=azure-dotnet-preview&preserve-view=true)
+    * [Java](/java/api/overview/azure/ai-textanalytics-readme?view=azure-java-preview&preserve-view=true)
+    * [JavaScript](/javascript/api/overview/azure/ai-text-analytics-readme?view=azure-node-preview&preserve-view=true)
+    * [Python](/python/api/azure-ai-textanalytics/azure.ai.textanalytics?view=azure-python-preview&preserve-view=true)
+    
+# [REST API](#tab/rest-api)
+
+## Submit transcripts using speech-to-text
+
+Use the following example if you have conversations transcribed using the Speech service's [speech-to-text](../../Speech-Service/speech-to-text.md) feature:
+
+```bash
+curl -i -X POST "https://your-language-endpoint-here/language/analyze-conversations?api-version=2022-05-15-preview" \
+-H "Content-Type: application/json" \
+-H "Ocp-Apim-Subscription-Key: your-key-here" \
+-d \
+' 
+{
+    "displayName": "Analyze conversations from xxx",
+    "analysisInput": {
+        "conversations": [
+            {
+                "id": "23611680-c4eb-4705-adef-4aa1c17507b5",
+                "language": "en",
+                "modality": "transcript",
+                "conversationItems": [
+                    {
+                        "participantId": "agent_1",
+                        "id": "8074caf7-97e8-4492-ace3-d284821adacd",
+                        "text": "Good morning.",
+                        "lexical": "good morning",
+                        "itn": "good morning",
+                        "maskedItn": "good morning",
+                        "audioTimings": [
+                            {
+                                "word": "good",
+                                "offset": 11700000,
+                                "duration": 2100000
+                            },
+                            {
+                                "word": "morning",
+                                "offset": 13900000,
+                                "duration": 3100000
+                            }
+                        ]
+                    },
+                    {
+                        "participantId": "agent_1",
+                        "id": "0d67d52b-693f-4e34-9881-754a14eec887",
+                        "text": "Can I have your name?",
+                        "lexical": "can i have your name",
+                        "itn": "can i have your name",
+                        "maskedItn": "can i have your name",
+                        "audioTimings": [
+                            {
+                                "word": "can",
+                                "offset": 44200000,
+                                "duration": 2200000
+                            },
+                            {
+                                "word": "i",
+                                "offset": 46500000,
+                                "duration": 800000
+                            },
+                            {
+                                "word": "have",
+                                "offset": 47400000,
+                                "duration": 1500000
+                            },
+                            {
+                                "word": "your",
+                                "offset": 49000000,
+                                "duration": 1500000
+                            },
+                            {
+                                "word": "name",
+                                "offset": 50600000,
+                                "duration": 2100000
+                            }
+                        ]
+                    },
+                    {
+                        "participantId": "customer_1",
+                        "id": "08684a7a-5433-4658-a3f1-c6114fcfed51",
+                        "text": "Sure that is John Doe.",
+                        "lexical": "sure that is john doe",
+                        "itn": "sure that is john doe",
+                        "maskedItn": "sure that is john doe",
+                        "audioTimings": [
+                            {
+                                "word": "sure",
+                                "offset": 5400000,
+                                "duration": 6300000
+                            },
+                            {
+                                "word": "that",
+                                "offset": 13600000,
+                                "duration": 2300000
+                            },
+                            {
+                                "word": "is",
+                                "offset": 16000000,
+                                "duration": 1300000
+                            },
+                            {
+                                "word": "john",
+                                "offset": 17400000,
+                                "duration": 2500000
+                            },
+                            {
+                                "word": "doe",
+                                "offset": 20000000,
+                                "duration": 2700000
+                            }
+                        ]
+                    }
+                ]
+            }
+        ]
+    },
+    "tasks": [
+        {
+            "taskName": "analyze 1",
+            "kind": "ConversationalPIITask",
+            "parameters": {
+                "modelVersion": "2022-05-15-preview",
+                "redactionSource": "text",
+                "includeAudioRedaction": true,
+                "piiCategories": [
+                    "all"
+                ]
+            }
+        }
+    ]
+}
+'
+```
+
+## Submit text chats
+
+Use the following example if you have conversations that originated in text. For example, conversations through a text-based chat client.
+
+```bash
+curl -i -X POST "https://your-language-endpoint-here/language/analyze-conversations?api-version=2022-05-15-preview" \
+-H "Content-Type: application/json" \
+-H "Ocp-Apim-Subscription-Key: your-key-here" \
+-d \
+' 
+{
+    "displayName": "Analyze conversations from xxx",
+    "analysisInput": {
+        "conversations": [
+            {
+                "id": "23611680-c4eb-4705-adef-4aa1c17507b5",
+                "language": "en",
+                "modality": "text",
+                "conversationItems": [
+                    {
+                        "participantId": "agent_1",
+                        "id": "8074caf7-97e8-4492-ace3-d284821adacd",
+                        "text": "Good morning."
+                    },
+                    {
+                        "participantId": "agent_1",
+                        "id": "0d67d52b-693f-4e34-9881-754a14eec887",
+                        "text": "Can I have your name?"
+                    },
+                    {
+                        "participantId": "customer_1",
+                        "id": "08684a7a-5433-4658-a3f1-c6114fcfed51",
+                        "text": "Sure that is John Doe."
+                    }
+                ]
+            }
+        ]
+    },
+    "tasks": [
+        {
+            "taskName": "analyze 1",
+            "kind": "ConversationalPIITask",
+            "parameters": {
+                "modelVersion": "2022-05-15-preview"
+            }
+        }
+    ]
+}
+'
+```
+
+
+## Get the result
+
+Get the `operation-location` from the response header. The value will look similar to the following URL:
+
+```rest
+https://your-language-endpoint/language/analyze-conversations/jobs/12345678-1234-1234-1234-12345678
+```
+
+To get the results of the request, use the following cURL command. Be sure to replace `my-job-id` with the job ID value from the `operation-location` response header you received earlier:
+
+```bash
+curl -X GET https://your-language-endpoint/language/analyze-conversations/jobs/my-job-id \
+-H "Content-Type: application/json" \
+-H "Ocp-Apim-Subscription-Key: your-key-here"
+```
+
+---
+
 ## Service and data limits
 
 [!INCLUDE [service limits article](../includes/service-limits-link.md)]
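Note on the REST flow added in how-to-call-for-conversations.md: the submit-then-fetch sequence from the cURL examples could also be scripted. The following is a minimal Python sketch using only the standard library, not an official sample. The helper names (`build_text_chat_request`, `submit`, `get_result`) and the `conversation_id` and `turns` arguments are hypothetical; the URL path, headers, and body fields are copied from the cURL examples in this change, and the shape of the JSON returned by the job URL is not assumed.

```python
import json
import urllib.request

API_VERSION = "2022-05-15-preview"  # taken from the cURL examples


def build_text_chat_request(conversation_id, turns, model_version=API_VERSION):
    """Build a request body for a text chat (hypothetical helper).

    `turns` is a list of (participant_id, item_id, text) tuples.
    """
    return {
        "displayName": "Analyze conversations from xxx",
        "analysisInput": {
            "conversations": [
                {
                    "id": conversation_id,
                    "language": "en",
                    "modality": "text",
                    "conversationItems": [
                        {"participantId": p, "id": i, "text": t}
                        for p, i, t in turns
                    ],
                }
            ]
        },
        "tasks": [
            {
                "taskName": "analyze 1",
                "kind": "ConversationalPIITask",
                "parameters": {"modelVersion": model_version},
            }
        ],
    }


def submit(endpoint, key, body):
    """POST the job and return the operation-location header value."""
    url = f"{endpoint}/language/analyze-conversations?api-version={API_VERSION}"
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers["operation-location"]


def get_result(operation_location, key):
    """GET the job URL and return the parsed JSON response."""
    request = urllib.request.Request(
        operation_location,
        headers={"Ocp-Apim-Subscription-Key": key},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Under these assumptions, a caller would pass the value returned by `submit("https://your-language-endpoint-here", "your-key-here", body)` straight to `get_result`, which avoids extracting the job ID by hand.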
diff --git a/articles/cognitive-services/language-service/personally-identifiable-information/language-support.md b/articles/cognitive-services/language-service/personally-identifiable-information/language-support.md
@@ -19,7 +19,8 @@ Use this article to learn which natural languages are supported by the PII featu
 
 > [!NOTE]
 > * Languages are added as new [model versions](how-to-call.md#specify-the-pii-detection-model) are released.
-> * The current model version for PII is `2021-01-15`.
+
+# [PII for documents](#tab/documents)
 
 ## PII language support
 
@@ -36,6 +37,16 @@ Use this article to learn which natural languages are supported by the PII featu
 | Portuguese (Portugal) | `pt-PT`       | 2021-01-15                      | `pt` also accepted |
 | Spanish               | `es`          | 2020-04-01                      |                    |
 
+# [PII for conversations (preview)](#tab/conversations)
+
+## PII language support
+
+| Language              | Language code | Starting with v3 model version: | Notes              |
+|:----------------------|:-------------:|:-------------------------------:|:------------------:|
+| English               | `en`          | 2022-05-15-preview              |                    |
+
+---
+
 ## Next steps
 
 [PII feature overview](overview.md)
diff --git a/articles/cognitive-services/language-service/personally-identifiable-information/overview.md b/articles/cognitive-services/language-service/personally-identifiable-information/overview.md
@@ -15,7 +15,7 @@ ms.custom: language-service-pii, ignite-fall-2021
 
 # What is Personally Identifiable Information (PII) detection in Azure Cognitive Service for Language?
 
-PII detection is one of the features offered by [Azure Cognitive Service for Language](../overview.md), a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. The PII detection feature can identify, categorize, and redact sensitive information in unstructured text. For example: phone numbers, email addresses, and forms of identification. 
+PII detection is one of the features offered by [Azure Cognitive Service for Language](../overview.md), a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. The PII detection feature can identify, categorize, and redact sensitive information in unstructured text. For example: phone numbers, email addresses, and forms of identification. The method for using PII detection in conversations differs from other use cases, so the articles for that scenario are kept separate.
 
 * [**Quickstarts**](quickstart.md) are getting-started instructions to guide you through making requests to the service.
 * [**How-to guides**](how-to-call.md) contain instructions for using the service in more specific or customized ways.
diff --git a/articles/cognitive-services/language-service/personally-identifiable-information/quickstart.md b/articles/cognitive-services/language-service/personally-identifiable-information/quickstart.md
@@ -19,6 +19,9 @@ zone_pivot_groups: programming-languages-text-analytics
 
 Use this article to get started detecting and redacting sensitive information in text, using the NER and PII client library and REST API. Follow these steps to try out examples code for mining text:
 
+> [!NOTE]
+> This quickstart only covers PII detection in documents. To learn more about detecting PII in conversations, see [How to detect and redact PII in conversations](how-to-call-for-conversations.md).
+
 ::: zone pivot="programming-language-csharp"
 
 [!INCLUDE [C# quickstart](includes/quickstarts/csharp-sdk.md)]
diff --git a/articles/cognitive-services/language-service/toc.yml b/articles/cognitive-services/language-service/toc.yml
@@ -612,10 +612,14 @@ items:
     items:
     - name: Call PII
       href: personally-identifiable-information/how-to-call.md
+    - name: Call PII for Conversation (preview)
+      href: personally-identifiable-information/how-to-call-for-conversations.md
   - name: Concepts
     items:
     - name: Recognized entity categories
-      href: personally-identifiable-information/concepts/entity-categories.md 
+      href: personally-identifiable-information/concepts/entity-categories.md
+    - name: Recognized entity categories for conversation
+      href: personally-identifiable-information/concepts/conversations-entity-categories.md
   - name: Reference
     items:
     - name: REST API