articles/cognitive-services/text-analytics/how-tos/text-analytics-how-to-entity-linking.md
Lines changed: 73 additions & 11 deletions
@@ -9,7 +9,7 @@ manager: nitinme
9
9
ms.service: cognitive-services
10
10
ms.subservice: text-analytics
11
11
ms.topic: article
12
-
ms.date: 07/30/2019
12
+
ms.date: 10/21/2019
13
13
ms.author: aahi
14
14
---
15
15
@@ -25,15 +25,75 @@ The Text Analytics' `entities` endpoint supports both named entity recognition (
25
25
Entity linking is the ability to identify and disambiguate the identity of an entity found in text (for example, determining whether "Mars" refers to the planet or to the Roman god of war). This process requires a knowledge base to which recognized entities are linked; Wikipedia is the knowledge base for the `entities` endpoint of Text Analytics.
26
26
27
27
### Named Entity Recognition (NER)
28
-
Named entity recognition (NER) is the ability to identify different entities in text and categorize them into pre-defined classes. The supported classes of entities are listed below.
29
-
30
-
In Text Analytics [Version 2.1](https://westcentralus.dev.cognitive.microsoft.com/docs/services/TextAnalytics-v2-1/operations/5ac4251d5b4ccd1554da7634), both entity linking and named entity recognition (NER) are available for several languages. See the [language support](../language-support.md#sentiment-analysis-key-phrase-extraction-and-named-entity-recognition) article for more information.
31
-
32
-
### Language support
33
-
34
-
Using entity linking in various languages requires using a corresponding knowledge base in each language. For entity linking in Text Analytics, this means each language that is supported by the `entities` endpoint will link to the corresponding Wikipedia corpus in that language. Since the size of corpora varies between languages, it is expected that the entity linking functionality's recall will also vary.
35
-
36
-
## Supported Types for Named Entity Recognition
28
+
Named entity recognition (NER) is the ability to identify different entities in text and categorize them into pre-defined classes, or types.
29
+
30
+
## Named Entity Recognition v3 public preview
31
+
32
+
The [next version of Named Entity Recognition](https://cognitiveusw2ppe.portal.azure-api.net/docs/services/TextAnalytics-v3-0-Preview-1/operations/56f30ceeeda5650db055a3c7/console) is now available for public preview. It provides updates to both entity linking and Named Entity Recognition.
33
+
34
+
:::row:::
35
+
:::column span="":::
36
+
**Feature**
37
+
:::column-end:::
38
+
:::column span="":::
39
+
**Description**
40
+
:::column-end:::
41
+
:::row-end:::
42
+
<!-- expanded types and subtypes row-->
43
+
:::row:::
44
+
:::column span="":::
45
+
Expanded entity types and subtypes
46
+
:::column-end:::
47
+
:::column span="":::
48
+
Expanded classification and detection for several named entity types.
49
+
:::column-end:::
50
+
:::row-end:::
51
+
<!-- separate endpoints row-->
52
+
:::row:::
53
+
:::column span="":::
54
+
Separate request endpoints
55
+
:::column-end:::
56
+
:::column span="":::
57
+
Separate endpoints for sending entity linking and NER requests.
58
+
:::column-end:::
59
+
:::row-end:::
60
+
<!-- model-version row -->
61
+
:::row:::
62
+
:::column span="":::
63
+
`model-version` parameter
64
+
:::column-end:::
65
+
:::column span="":::
66
+
An optional parameter for choosing a version of the Text Analytics model. Currently only the default model is available for use.
67
+
:::column-end:::
68
+
:::row-end:::
69
+
70
+
### Entity types
71
+
72
+
Named Entity Recognition v3 provides expanded detection across multiple types. Currently, NER v3 can recognize the following categories of entities. For a detailed list of supported entities and languages, see the [Named entity types](../named-entity-types.md) article.
73
+
74
+
* General
75
+
* Personal Information
76
+
77
+
### Request endpoints
78
+
79
+
Named Entity Recognition v3 uses separate endpoints for NER and entity linking requests. Use a URL format below based on your request:
80
+
81
+
NER
82
+
* General entities - `https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v3.0-preview.1/entities/recognition/general`
83
+
84
+
* Personal information entities - `https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v3.0-preview.1/entities/recognition/pii`
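
As a rough illustration of the URL formats above, the following Python sketch assembles a v3 preview NER endpoint from a custom subdomain. The function name `ner_v3_url` and the `kind` argument are hypothetical, and passing `model-version` as a query-string parameter is an assumption; the text above only states that the parameter is optional.

```python
from urllib.parse import urlencode

# Path suffixes taken from the two URL formats listed above.
_NER_PATHS = {
    "general": "entities/recognition/general",
    "pii": "entities/recognition/pii",
}


def ner_v3_url(subdomain, kind="general", model_version=None):
    """Build a v3.0-preview.1 NER endpoint URL (hypothetical helper)."""
    if kind not in _NER_PATHS:
        raise ValueError(f"unknown NER request kind: {kind!r}")
    url = (
        f"https://{subdomain}.cognitiveservices.azure.com"
        f"/text/analytics/v3.0-preview.1/{_NER_PATHS[kind]}"
    )
    # Assumption: the optional model-version parameter travels in the query string.
    if model_version is not None:
        url += "?" + urlencode({"model-version": model_version})
    return url
```

Calling `ner_v3_url("contoso", "pii")` would produce the personal information endpoint for a resource whose custom subdomain is `contoso` (a placeholder name).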
## Supported Types for Named Entity Recognition v2
94
+
95
+
> [!NOTE]
96
+
> The following entities are supported by Named Entity Recognition (NER) version 2. [NER v3](#named-entity-recognition-v3-public-preview) is in public preview, and greatly expands the number and depth of the entities recognized in text.
37
97
38
98
| Type | SubType | Example |
39
99
|:----------- |:------------- |:---------|
@@ -57,9 +117,11 @@ Using entity linking in various languages requires using a corresponding knowled
\* Depending on the input and extracted entities, certain entities may omit the `SubType`. All the supported entity types listed are available only for the English, Chinese-Simplified, French, German and Spanish languages.
120
+
\* Depending on the input and extracted entities, certain entities may omit the `SubType`. All the supported entity types listed are available only for the English, Chinese-Simplified, French, German, and Spanish languages.
61
121
122
+
### Language support
62
123
124
+
Using entity linking in various languages requires using a corresponding knowledge base in each language. For entity linking in Text Analytics, this means each language that is supported by the `entities` endpoint will link to the corresponding Wikipedia corpus in that language. Since the size of corpora varies between languages, it is expected that the entity linking functionality's recall will also vary. See the [language support](../language-support.md#sentiment-analysis-key-phrase-extraction-and-named-entity-recognition) article for more information.
@@ -29,104 +29,6 @@ Text Analytics uses a machine learning classification algorithm to generate a se
29
29
30
30
Sentiment analysis is performed on the entire document, as opposed to extracting sentiment for a particular entity in the text. In practice, there's a tendency for scoring accuracy to improve when documents contain one or two sentences rather than a large block of text. During an objectivity assessment phase, the model determines whether a document as a whole is objective or contains sentiment. A document that's mostly objective doesn't progress to the sentiment detection phase, which results in a 0.50 score, with no further processing. For documents that continue in the pipeline, the next phase generates a score above or below 0.50. The score depends on the degree of sentiment detected in the document.
31
31
32
-
## Preparation
33
-
34
-
Sentiment analysis produces a higher-quality result when you give it smaller chunks of text to work on. This is opposite from key phrase extraction, which performs better on larger blocks of text. To get the best results from both operations, consider restructuring the inputs accordingly.
35
-
36
-
You must have JSON documents in this format: ID, text, and language.
37
-
38
-
Document size must be under 5,120 characters per document. You can have up to 1,000 items (IDs) per collection. The collection is submitted in the body of the request. The following sample is an example of content you might submit for sentiment analysis:
39
-
40
-
```json
41
-
{
42
-
"documents": [
43
-
{
44
-
"language": "en",
45
-
"id": "1",
46
-
"text": "We love this trail and make the trip every year. The views are breathtaking and well worth the hike!"
47
-
},
48
-
{
49
-
"language": "en",
50
-
"id": "2",
51
-
"text": "Poorly marked trails! I thought we were goners. Worst hike ever."
52
-
},
53
-
{
54
-
"language": "en",
55
-
"id": "3",
56
-
"text": "Everyone in my family liked the trail but thought it was too challenging for the less athletic among us. Not necessarily recommended for small children."
57
-
},
58
-
{
59
-
"language": "en",
60
-
"id": "4",
61
-
"text": "It was foggy so we missed the spectacular views, but the trail was ok. Worth checking out if you are in the area."
62
-
},
63
-
{
64
-
"language": "en",
65
-
"id": "5",
66
-
"text": "This is my favorite trail. It has beautiful views and many places to stop and rest"
67
-
}
68
-
]
69
-
}
70
-
```
71
-
72
-
## Step 1: Structure the request
73
-
74
-
For more information on request definition, see [Call the Text Analytics API](text-analytics-how-to-call-api.md). The following points are restated for convenience:
75
-
76
-
+ Create a POST request. To review the API documentation for this request, see the [Sentiment Analysis API](https://westcentralus.dev.cognitive.microsoft.com/docs/services/TextAnalytics-v2-1/operations/56f30ceeeda5650db055a3c9).
77
-
78
-
+ Set the HTTP endpoint for sentiment analysis by using either a Text Analytics resource on Azure or an instantiated [Text Analytics container](text-analytics-how-to-install-containers.md). You must include `/text/analytics/v2.1/sentiment` in the URL. For example: `https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v2.1/sentiment`.
79
-
80
-
+ Set a request header to include the [access key](../../cognitive-services-apis-create-account.md#get-the-keys-for-your-resource) for Text Analytics operations.
81
-
82
-
+ In the request body, provide the JSON documents collection you prepared for this analysis.
83
-
84
-
> [!Tip]
85
-
> Use [Postman](text-analytics-how-to-call-api.md) or open the **API testing console** in the [documentation](https://westcentralus.dev.cognitive.microsoft.com/docs/services/TextAnalytics-v2-1/operations/56f30ceeeda5650db055a3c9) to structure the request and post it to the service.
86
-
87
-
## Step 2: Post the request
88
-
89
-
Analysis is performed upon receipt of the request. For information on the size and number of requests you can send per minute and second, see the [data limits](../overview.md#data-limits) section in the overview.
90
-
91
-
Recall that the service is stateless. No data is stored in your account. Results are returned immediately in the response.
92
-
93
-
94
-
## Step 3: View the results
95
-
96
-
The sentiment analyzer classifies text as predominantly positive or negative. It assigns a score in the range of 0 to 1. Values close to 0.5 are neutral or indeterminate. A score of 0.5 indicates neutrality. When a string can't be analyzed for sentiment or has no sentiment, the score is always 0.5 exactly. For example, if you pass in a Spanish string with an English language code, the score is 0.5.
97
-
98
-
Output is returned immediately. You can stream the results to an application that accepts JSON or save the output to a file on the local system. Then, import the output into an application that you can use to sort, search, and manipulate the data.
99
-
100
-
The following example shows the response for the document collection in this article:
101
-
102
-
```json
103
-
{
104
-
"documents": [
105
-
{
106
-
"score": 0.9999237060546875,
107
-
"id": "1"
108
-
},
109
-
{
110
-
"score": 0.0000540316104888916,
111
-
"id": "2"
112
-
},
113
-
{
114
-
"score": 0.99990355968475342,
115
-
"id": "3"
116
-
},
117
-
{
118
-
"score": 0.980544924736023,
119
-
"id": "4"
120
-
},
121
-
{
122
-
"score": 0.99996328353881836,
123
-
"id": "5"
124
-
}
125
-
],
126
-
"errors": []
127
-
}
128
-
```
129
-
130
32
## Sentiment Analysis v3 public preview
131
33
132
34
The [next version of Sentiment Analysis](https://cognitiveusw2ppe.portal.azure-api.net/docs/services/TextAnalytics-v3-0-Preview-1/operations/56f30ceeeda5650db055a3c9) is now available for public preview. It provides significant improvements in the accuracy and detail of the API's text categorization and scoring.
@@ -158,20 +60,10 @@ Sentiment Analysis v3 can return scores and labels at a sentence and document le
158
60
159
61
### Model versioning
160
62
161
-
Starting in version 3.0, the Text Analytics API lets you choose the Text Analytics model used on your data. Use the optional `model-version` parameter to select a version of the model in your requests. If this parameter isn't specified the API will default to `latest`, the latest stable model version.
162
-
163
-
Available model versions:
164
-
*`2019-10-01` (`latest`)
63
+
> [!NOTE]
64
+
> Model versioning for sentiment analysis is available starting in version `v3.0-preview.1`.
165
65
166
-
Each response from the v3 endpoints includes a `model-version` field specifying the model version that was used.
@@ -272,6 +164,104 @@ While the request format is the same as the previous version, the response forma
272
164
273
165
You can find an example C# application that calls this version of Sentiment Analysis on [GitHub](https://github.com/Azure-Samples/cognitive-services-REST-api-samples/tree/master/dotnet/Language/SentimentV3.cs).
274
166
167
+
## Preparation
168
+
169
+
Sentiment analysis produces a higher-quality result when you give it smaller chunks of text to work on. This is opposite from key phrase extraction, which performs better on larger blocks of text. To get the best results from both operations, consider restructuring the inputs accordingly.
170
+
171
+
You must have JSON documents in this format: ID, text, and language.
172
+
173
+
Document size must be under 5,120 characters per document. You can have up to 1,000 items (IDs) per collection. The collection is submitted in the body of the request. The following sample is an example of content you might submit for sentiment analysis:
174
+
175
+
```json
176
+
{
177
+
"documents": [
178
+
{
179
+
"language": "en",
180
+
"id": "1",
181
+
"text": "We love this trail and make the trip every year. The views are breathtaking and well worth the hike!"
182
+
},
183
+
{
184
+
"language": "en",
185
+
"id": "2",
186
+
"text": "Poorly marked trails! I thought we were goners. Worst hike ever."
187
+
},
188
+
{
189
+
"language": "en",
190
+
"id": "3",
191
+
"text": "Everyone in my family liked the trail but thought it was too challenging for the less athletic among us. Not necessarily recommended for small children."
192
+
},
193
+
{
194
+
"language": "en",
195
+
"id": "4",
196
+
"text": "It was foggy so we missed the spectacular views, but the trail was ok. Worth checking out if you are in the area."
197
+
},
198
+
{
199
+
"language": "en",
200
+
"id": "5",
201
+
"text": "This is my favorite trail. It has beautiful views and many places to stop and rest"
202
+
}
203
+
]
204
+
}
205
+
```
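
The limits described above can be checked on the client before a collection is submitted. The following Python sketch is one possible pre-flight check, not part of the service: `validate_documents` is a hypothetical helper, it treats the 5,120-character limit as strict (following the wording above), and it measures length with Python's `len`, which may not match how the service counts characters.

```python
MAX_CHARS_PER_DOCUMENT = 5120   # "under 5,120 characters per document"
MAX_DOCUMENTS = 1000            # "up to 1,000 items (IDs) per collection"
REQUIRED_FIELDS = ("id", "text", "language")


def validate_documents(payload):
    """Return a list of problems found in a request body; empty if none."""
    problems = []
    documents = payload.get("documents", [])

    if len(documents) > MAX_DOCUMENTS:
        problems.append(
            f"collection has {len(documents)} documents; limit is {MAX_DOCUMENTS}"
        )

    for index, doc in enumerate(documents):
        missing = [field for field in REQUIRED_FIELDS if field not in doc]
        if missing:
            problems.append(f"document {index} is missing: {', '.join(missing)}")
        # Assumption: len() approximates the service's character count.
        if len(doc.get("text", "")) >= MAX_CHARS_PER_DOCUMENT:
            problems.append(f"document {index} is not under {MAX_CHARS_PER_DOCUMENT} characters")

    return problems
```

An empty list means the collection passes these local checks; the service may still reject a request for other reasons.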
206
+
207
+
## Step 1: Structure the request
208
+
209
+
For more information on request definition, see [Call the Text Analytics API](text-analytics-how-to-call-api.md). The following points are restated for convenience:
210
+
211
+
+ Create a POST request. To review the API documentation for this request, see the [Sentiment Analysis API](https://westcentralus.dev.cognitive.microsoft.com/docs/services/TextAnalytics-v2-1/operations/56f30ceeeda5650db055a3c9).
212
+
213
+
+ Set the HTTP endpoint for sentiment analysis by using either a Text Analytics resource on Azure or an instantiated [Text Analytics container](text-analytics-how-to-install-containers.md). You must include `/text/analytics/v2.1/sentiment` in the URL. For example: `https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v2.1/sentiment`.
214
+
215
+
+ Set a request header to include the [access key](../../cognitive-services-apis-create-account.md#get-the-keys-for-your-resource) for Text Analytics operations.
216
+
217
+
+ In the request body, provide the JSON documents collection you prepared for this analysis.
218
+
219
+
> [!Tip]
220
+
> Use [Postman](text-analytics-how-to-call-api.md) or open the **API testing console** in the [documentation](https://westcentralus.dev.cognitive.microsoft.com/docs/services/TextAnalytics-v2-1/operations/56f30ceeeda5650db055a3c9) to structure the request and post it to the service.
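
The points in this step can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: `build_sentiment_request` is a hypothetical helper, the subdomain and key are placeholders, and the `Ocp-Apim-Subscription-Key` header name is not given in this article, so confirm it against the API reference before relying on it.

```python
import json
import urllib.request


def build_sentiment_request(subdomain, access_key, documents):
    """Build (but do not send) a POST request for v2.1 sentiment analysis."""
    url = (
        f"https://{subdomain}.cognitiveservices.azure.com"
        "/text/analytics/v2.1/sentiment"
    )
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: header name used to pass the Text Analytics access key.
            "Ocp-Apim-Subscription-Key": access_key,
            "Content-Type": "application/json",
        },
    )


# To send the request and parse the JSON response:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any call reaches the service.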
221
+
222
+
## Step 2: Post the request
223
+
224
+
Analysis is performed upon receipt of the request. For information on the size and number of requests you can send per minute and second, see the [data limits](../overview.md#data-limits) section in the overview.
225
+
226
+
Recall that the service is stateless. No data is stored in your account. Results are returned immediately in the response.
227
+
228
+
229
+
## Step 3: View the results
230
+
231
+
The sentiment analyzer classifies text as predominantly positive or negative. It assigns a score in the range of 0 to 1. Values close to 0.5 are neutral or indeterminate. A score of 0.5 indicates neutrality. When a string can't be analyzed for sentiment or has no sentiment, the score is always 0.5 exactly. For example, if you pass in a Spanish string with an English language code, the score is 0.5.
232
+
233
+
Output is returned immediately. You can stream the results to an application that accepts JSON or save the output to a file on the local system. Then, import the output into an application that you can use to sort, search, and manipulate the data.
234
+
235
+
The following example shows the response for the document collection in this article:
236
+
237
+
```json
238
+
{
239
+
"documents": [
240
+
{
241
+
"score": 0.9999237060546875,
242
+
"id": "1"
243
+
},
244
+
{
245
+
"score": 0.0000540316104888916,
246
+
"id": "2"
247
+
},
248
+
{
249
+
"score": 0.99990355968475342,
250
+
"id": "3"
251
+
},
252
+
{
253
+
"score": 0.980544924736023,
254
+
"id": "4"
255
+
},
256
+
{
257
+
"score": 0.99996328353881836,
258
+
"id": "5"
259
+
}
260
+
],
261
+
"errors": []
262
+
}
263
+
```
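
A response in the shape shown above can be reduced to coarse labels. The following Python sketch is illustrative only: `label_scores` is a hypothetical helper, and the `margin` of 0.1 around 0.5 is an arbitrary assumption, since the article says only that values close to 0.5 are neutral or indeterminate.

```python
def label_scores(response, margin=0.1):
    """Map each document ID to 'positive', 'negative', or 'neutral'."""
    labels = {}
    for doc in response.get("documents", []):
        score = doc["score"]
        # Assumption: scores within `margin` of 0.5 are treated as neutral.
        if abs(score - 0.5) <= margin:
            labels[doc["id"]] = "neutral"
        elif score > 0.5:
            labels[doc["id"]] = "positive"
        else:
            labels[doc["id"]] = "negative"
    return labels
```

Tune `margin` to your own data; a wider band moves more documents into the neutral label.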
264
+
275
265
## Summary
276
266
277
267
In this article, you learned concepts and workflow for sentiment analysis by using Text Analytics in Azure Cognitive Services. In summary: