articles/ai-services/language-service/personally-identifiable-information/concepts/conversations-entity-categories.md
Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 title: Entity categories recognized by Conversational Personally Identifiable Information (detection) in Azure AI Language
 titleSuffix: Azure AI services
-description: Learn about the entities the Conversational PII feature (preview) can recognize from conversation inputs.
+description: Learn about the entities the Conversational PII feature can recognize from conversation inputs.
# What is Personally Identifiable Information (PII) detection in Azure AI Language?

-PII detection is one of the features offered by [Azure AI Language](../overview.md), a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. The PII detection feature can **identify, categorize, and redact** sensitive information in unstructured text. For example: phone numbers, email addresses, and forms of identification. The method for utilizing PII in conversations is different than other use cases, and articles for this use are separate.
+As of June 2024, we now provide General Availability support for the Conversational PII service (English-language only).
+Customers can now redact transcripts, chats, and other text written in a conversational style (i.e. text with “um”s, “ah”s, multiple speakers, and the spelling out of words for more clarity) with better confidence in AI quality, Azure SLA support and production environment support, and enterprise-grade security in mind.
+
+PII detection is one of the features offered by [Azure AI Language](../overview.md), a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. The PII detection feature can **identify, categorize, and redact** sensitive information in unstructured text. For example: phone numbers, email addresses, and forms of identification. Azure AI Language supports general text PII redaction, as well as [Conversational PII](how-to-call-for-conversations.md), a specialized model for handling speech transcriptions and the more informal, conversational tone of meeting and call transcripts. The service also supports [Native Document PII redaction](#native-document-support), where the input and output are structured document files.

 * [**Quickstarts**](quickstart.md) are getting-started instructions to guide you through making requests to the service.
 * [**How-to guides**](how-to-call.md) contain instructions for using the service in more specific or customized ways.
 * The [**conceptual articles**](concepts/entity-categories.md) provide in-depth explanations of the service's functionality and features.

-PII comes into two shapes:
-
-* [PII](how-to-call.md) - works on unstructured text.
-* [Conversation PII (preview)](how-to-call-for-conversations.md) - tailored model to work on conversation transcription.
-
 [!INCLUDE [Typical workflow for pre-configured language features](../includes/overview-typical-workflow.md)]
articles/ai-services/language-service/summarization/custom/how-to/data-formats.md
Lines changed: 3 additions & 3 deletions
@@ -21,7 +21,7 @@ This page contains information about how to select and prepare data in order to

 ## Custom summarization document sample format

-In the abstractive document summarization scenario, each document (whether it has a provided label or not) is expected to be provided in a plain .txt file. The file contains one or more lines. If multiple lines are provided, each is assumed to be a paragraph of the document. The following is an example document with three paragraphs.
+In the abstractive text summarization scenario, each document (whether it has a provided label or not) is expected to be provided in a plain .txt file. The file contains one or more lines. If multiple lines are provided, each is assumed to be a paragraph of the document. The following is an example document with three paragraphs.

 *At Microsoft, we have been on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding. As Chief Technology Officer of Azure AI services, I have been working with a team of amazing scientists and engineers to turn this quest into a reality.*

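The one-paragraph-per-line convention described in this hunk can be illustrated with a short sketch. The Python below is not part of the service or its SDK: `split_paragraphs` is a hypothetical helper, and skipping blank lines is an assumption, since the text does not say how empty lines are treated.

```python
def split_paragraphs(document_text):
    """Return one paragraph per line of a plain-text document.

    Assumption: blank lines carry no paragraph and are skipped.
    """
    return [line.strip() for line in document_text.splitlines() if line.strip()]


# Placeholder content standing in for a three-line .txt document.
sample = (
    "First paragraph of the document.\n"
    "Second paragraph of the document.\n"
    "Third paragraph of the document.\n"
)
paragraphs = split_paragraphs(sample)
print(len(paragraphs))  # 3
```

To apply it to a file, pass in the result of `pathlib.Path("document.txt").read_text(encoding="utf-8")`, where `document.txt` is a placeholder file name.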
@@ -66,7 +66,7 @@ In the abstractive document summarization scenario, each document (whether it ha

 ## Sample mapping JSON format

-In both document and conversation summarization scenarios, a set of documents and corresponding labels can be provided in a single JSON file that references individual document/conversation and summary files.
+In both text and conversation summarization scenarios, a set of documents and corresponding labels can be provided in a single JSON file that references individual document/conversation and summary files.

 The JSON file is expected to contain the following fields:

@@ -96,7 +96,7 @@ The JSON file is expected to contain the following fields:
 ```
 ## Custom document summarization mapping sample

-The following is an example mapping file for the abstractive document summarization scenario with three documents and corresponding labels.
+The following is an example mapping file for the abstractive text summarization scenario with three documents and corresponding labels.
articles/ai-services/language-service/summarization/how-to/conversation-summarization.md
Lines changed: 2 additions & 2 deletions
@@ -47,8 +47,8 @@ For easier navigation, here are links to the corresponding sections for each ser

 The conversation summarization API uses natural language processing techniques to summarize conversations into shorter summaries per request. Conversation summarization can summarize for issues and resolutions discussed in a two-party conversation or summarize a long conversation into chapters and a short narrative for each chapter.

-There's another feature in Azure AI Language named [document summarization](../overview.md?tabs=document-summarization) that is more suitable to summarize documents into concise summaries. When you're deciding between document summarization and conversation summarization, consider the following points:
-* Input format: Conversation summarization can operate on both chat text and speech transcripts, which have speakers and their utterances. Document summarization operates using simple text, or Word, PDF, or PowerPoint formats.
+There's another feature in Azure AI Language named [text summarization](../overview.md?tabs=text-summarization) that is more suitable to summarize documents into concise summaries. When you're deciding between text summarization and conversation summarization, consider the following points:
+* Input format: Conversation summarization can operate on both chat text and speech transcripts, which have speakers and their utterances. Text summarization operates using simple text, or Word, PDF, or PowerPoint formats.
 * Purpose of summarization: for example, conversation issue and resolution summarization returns a reason and the resolution for a chat between a customer and a customer service agent.
articles/ai-services/language-service/summarization/how-to/document-summarization.md
Lines changed: 30 additions & 22 deletions
@@ -14,9 +14,9 @@ ms.custom:
 - ignite-2023
 ---

-# How to use document summarization
+# How to use text summarization

-Document summarization is designed to shorten content that users consider too long to read. Both extractive and abstractive summarization condense articles, papers, or documents to key sentences.
+Text summarization is designed to shorten content that users consider too long to read. Both extractive and abstractive summarization condense articles, papers, or documents to key sentences.

 **Extractive summarization**: Produces a summary by extracting sentences that collectively represent the most important or relevant information within the original content.

@@ -32,8 +32,8 @@ For easier navigation, here are links to the corresponding sections for each ser
@@ -61,15 +61,15 @@ You submit documents to the API as strings of text. Analysis is performed upon r

 When you use this feature, the API results are available for 24 hours from the time the request was ingested, and is indicated in the response. After this time period, the results are purged and are no longer available for retrieval.

-### Getting document summarization results
+### Getting text summarization results

 When you get results from language detection, you can stream the results to an application or save the output to a file on the local system.

 The following is an example of content you might submit for summarization, which is extracted using the Microsoft blog article [A holistic representation toward integrative AI](https://www.microsoft.com/research/blog/a-holistic-representation-toward-integrative-ai/). This article is only an example, the API can accept longer input text. See the data limits section for more information.

 *"At Microsoft, we have been on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding. As Chief Technology Officer of Azure AI services, I have been working with a team of amazing scientists and engineers to turn this quest into a reality. In my role, I enjoy a unique perspective in viewing the relationship among three attributes of human cognition: monolingual text (X), audio or visual sensory signals, (Y) and multilingual (Z). At the intersection of all three, there’s magic—what we call XYZ-code as illustrated in Figure 1—a joint representation to create more powerful AI that can speak, hear, see, and understand humans better. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today. Over the past five years, we have achieved human performance on benchmarks in conversational speech recognition, machine translation, conversational question answering, machine reading comprehension, and image captioning. These five breakthroughs provided us with strong signals toward our more ambitious aspiration to produce a leap in AI capabilities, achieving multi-sensory and multilingual learning that is closer in line with how humans learn and understand. I believe the joint XYZ-code is a foundational component of this aspiration, if grounded with external knowledge sources in the downstream AI tasks."*

-The document summarization API request is processed upon receipt of the request by creating a job for the API backend. If the job succeeded, the output of the API is returned. The output is available for retrieval for 24 hours. After this time, the output is purged. Due to multilingual and emoji support, the response might contain text offsets. See [how to process offsets](../../concepts/multilingual-emoji-support.md) for more information.
+The text summarization API request is processed upon receipt of the request by creating a job for the API backend. If the job succeeded, the output of the API is returned. The output is available for retrieval for 24 hours. After this time, the output is purged. Due to multilingual and emoji support, the response might contain text offsets. See [how to process offsets](../../concepts/multilingual-emoji-support.md) for more information.

 When you use the above example, the API might return the following summarized sentences:

@@ -81,9 +81,9 @@ When you use the above example, the API might return the following summarized se
 **Abstractive summarization**:
 - "Microsoft is taking a more holistic, human-centric approach to learning and understanding. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. Over the past five years, we have achieved human performance on benchmarks in."

-### Try document extractive summarization
+### Try text extractive summarization

-You can use document extractive summarization to get summaries of articles, papers, or documents. To see an example, see the [quickstart article](../quickstart.md).
+You can use text extractive summarization to get summaries of articles, papers, or documents. To see an example, see the [quickstart article](../quickstart.md).

 You can use the `sentenceCount` parameter to guide how many sentences are returned, with `3` being the default. The range is from 1 to 20.

@@ -94,20 +94,20 @@ You can also use the `sortby` parameter to specify in what order the extracted s
 |Rank | Order sentences according to their relevance to the input document, as decided by the service. |
 |Offset | Keeps the original order in which the sentences appear in the input document. |

-### Try document abstractive summarization
+### Try text abstractive summarization

-The following example gets you started with document abstractive summarization:
+The following example gets you started with text abstractive summarization:

 1. Copy the command below into a text editor. The BASH example uses the `\` line continuation character. If your console or terminal uses a different line continuation character, use that character instead.

 ```bash
-curl -i -X POST https://<your-language-resource-endpoint>/language/analyze-text/jobs?api-version=2022-10-01-preview \
+curl -i -X POST https://<your-language-resource-endpoint>/language/analyze-text/jobs?api-version=2023-04-01 \
@@ -212,7 +212,7 @@ The following cURL commands are executed from a BASH shell. Edit these commands

 ## Query based summarization

-The query-based document summarization API is an extension to the existing document summarization API.
+The query-based text summarization API is an extension to the existing text summarization API.

 The biggest difference is a new `query` field in the request body (under `tasks` > `parameters` > `query`). Additionally, there's a new way to specify the preferred `summaryLength` in "buckets" of short/medium/long, which we recommend using instead of `sentenceCount`, especially when using abstractive. Below is an example request:

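Separately from the curl request in the next hunk, the placement of `query` and `summaryLength` under `tasks` > `parameters` can be sketched in Python. This is an illustration under assumptions, not the documented request: only `query`, `summaryLength`, and the short/medium/long buckets come from the text above. The function name, the `kind` and `taskName` values, the `displayName` field, the `analysisInput` layout, and the lowercase bucket spelling are all assumptions to check against the API reference.

```python
import json

# Length "buckets" named in the text above (lowercase spelling is assumed).
SUMMARY_LENGTHS = {"short", "medium", "long"}


def build_query_summarization_body(text, query, summary_length="medium"):
    """Build a request body with `query` and `summaryLength` under tasks > parameters.

    Everything outside `parameters` is an assumed layout for illustration.
    """
    if summary_length not in SUMMARY_LENGTHS:
        raise ValueError(f"summaryLength must be one of {sorted(SUMMARY_LENGTHS)}")
    if not query:
        raise ValueError("query must be a non-empty string")
    return {
        "displayName": "Query-based summarization sketch",  # assumed field
        "analysisInput": {  # assumed layout
            "documents": [{"id": "1", "language": "en", "text": text}]
        },
        "tasks": [
            {
                "kind": "AbstractiveSummarization",  # assumed task kind
                "taskName": "query-based-summary",  # hypothetical name
                "parameters": {
                    "query": query,
                    "summaryLength": summary_length,
                },
            }
        ],
    }


body = build_query_summarization_body(
    "At Microsoft, we have been on a quest to advance AI beyond existing techniques.",
    query="XYZ-code",
    summary_length="short",
)
print(json.dumps(body, indent=2))
```

Validating the bucket name before sending keeps a typo such as `"tiny"` from reaching the service. The serialized `body` could then be passed as the request payload to the `analyze-text/jobs` endpoint shown in the curl command.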
@@ -223,7 +223,7 @@ curl -i -X POST https://<your-language-resource-endpoint>/language/analyze-text/