Commit dcd3a8a

conflict fix
2 parents a6df12e + d646f1d commit dcd3a8a

22 files changed, +667 -1 lines changed

articles/cognitive-services/language-service/custom-text-analytics-for-health/concepts/evaluation-metrics.md

Lines changed: 1 addition & 1 deletion
@@ -132,7 +132,7 @@ You can use the Confusion matrix to identify entities that are too close to each
The highlighted diagonal in the image below is the correctly predicted entities, where the predicted tag is the same as the actual tag.

- :::image type="content" source="../media/confusion.png" alt-text="A screenshot that shows an example confusion matrix." lightbox="../media/confusion.png":::
+ :::image type="content" source="../../media/custom/confusion.png" alt-text="A screenshot that shows an example confusion matrix." lightbox="../../media/custom/confusion.png":::


You can calculate the entity-level and model-level evaluation metrics from the confusion matrix:
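The calculation can be illustrated with a short sketch. The matrix layout (rows are actual tags, columns are predicted tags) follows the description above; the entity names and counts are made up for illustration.

```python
# Illustrative sketch: deriving entity-level metrics from a confusion matrix.
# Rows are actual tags, columns are predicted tags; names are hypothetical.
confusion = {
    "Medication": {"Medication": 40, "Dosage": 3, "None": 2},
    "Dosage":     {"Medication": 1,  "Dosage": 30, "None": 4},
    "None":       {"Medication": 2,  "Dosage": 1,  "None": 0},
}

def entity_metrics(matrix, entity):
    tp = matrix[entity][entity]                                         # diagonal cell
    fn = sum(v for k, v in matrix[entity].items() if k != entity)       # rest of the row
    fp = sum(row[entity] for k, row in matrix.items() if k != entity)   # rest of the column
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f1 = entity_metrics(confusion, "Medication")
```

Model-level metrics then aggregate these per-entity values, as described in the rest of this article.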

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
---
title: Send a custom Text Analytics for health request to your custom model
description: Learn how to send a request for custom text analytics for health.
titleSuffix: Azure Cognitive Services
services: cognitive-services
author: aahill
manager: nitinme
ms.service: cognitive-services
ms.subservice: language-service
ms.topic: how-to
ms.date: 06/03/2022
ms.author: aahi
ms.devlang: REST API
ms.custom: language-service-custom-ta4h
---

# Send queries to your custom Text Analytics for health model

After the deployment is added successfully, you can query the deployment to extract entities from your text, based on the model you assigned to the deployment.
You can query the deployment programmatically using the [Prediction API](https://aka.ms/ct-runtime-api).
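As an illustrative sketch only (not the authoritative request shape — the Prediction API reference linked above is the source of truth), a prediction request could be assembled along the lines of the Language service's asynchronous jobs pattern. The task kind `CustomHealthcare` and the field names below are assumptions.

```python
# Hedged sketch: building a prediction request for a deployed custom model.
# The task kind and field names are assumptions; confirm them against the
# Prediction API reference before use.
import json

def build_job_payload(project_name, deployment_name, documents):
    return {
        "displayName": "Custom TA4H extraction",
        "analysisInput": {
            "documents": [
                {"id": str(i + 1), "language": "en", "text": text}
                for i, text in enumerate(documents)
            ]
        },
        "tasks": [
            {
                "kind": "CustomHealthcare",  # assumed task kind
                "parameters": {
                    "projectName": project_name,
                    "deploymentName": deployment_name,
                },
            }
        ],
    }

payload = build_job_payload("myProject", "production",
                            ["Patient was prescribed 100 mg ibuprofen."])
# POST json.dumps(payload) to {ENDPOINT}/language/analyze-text/jobs?api-version=...
body = json.dumps(payload)
```

The project and deployment names (`myProject`, `production`) are placeholders for your own values.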

## Test deployed model

You can use Language Studio to submit the custom Text Analytics for health task and visualize the results.

[!INCLUDE [Test model](../includes/language-studio/test-model.md)]

## Send a custom text analytics for health request to your model

# [Language Studio](#tab/language-studio)

[!INCLUDE [Get prediction URL](../../includes/custom/get-prediction-url.md)]

# [REST API](#tab/rest-api)

First, you'll need to get your resource key and endpoint:

[!INCLUDE [Get keys and endpoint Azure Portal](../includes/get-keys-endpoint-azure.md)]

### Submit a custom Text Analytics for health task

[!INCLUDE [submit a custom Text Analytics for health task using the REST API](../includes/rest-api/submit-task.md)]

### Get task results

[!INCLUDE [get custom Text Analytics for health task results](../includes/rest-api/get-results.md)]

---

## Next steps

* [Custom text analytics for health](../overview.md)
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
---
title: Preparing data and designing a schema for custom Text Analytics for health
titleSuffix: Azure Cognitive Services
description: Learn about how to select and prepare data, to be successful in creating custom TA4H projects.
services: cognitive-services
author: aahill
manager: nitinme
ms.service: cognitive-services
ms.subservice: language-service
ms.topic: how-to
ms.date: 05/09/2022
ms.author: aahi
ms.custom: language-service-custom-ta4h
---

# How to prepare data and define a schema for custom Text Analytics for health

In order to create a custom TA4H model, you will need quality data to train it. This article covers how you should select and prepare your data, along with how to define a schema. Defining the schema is the first step in the [project development lifecycle](../overview.md#project-development-lifecycle), and it entails defining the entity types or categories that you need your model to extract from text at runtime.

## Schema design

Custom Text Analytics for health allows you to extend and customize the Text Analytics for health entity map. The first step of the process is building your schema, which lets you define the new entity types or categories that you need your model to extract from text at runtime, in addition to the existing Text Analytics for health entities.

* Review documents in your dataset to become familiar with their format and structure.

* Identify the entities you want to extract from the data.

    For example, if you are extracting entities from support emails, you might need to extract "Customer name", "Product name", "Request date", and "Contact information".

* Avoid ambiguity between entity types.

    **Ambiguity** happens when the entity types you select are similar to each other. The more ambiguous your schema, the more labeled data you will need to differentiate between entity types.

    For example, if you are extracting data from a legal contract, extracting "Name of first party" and "Name of second party" requires more examples to overcome ambiguity, since the names of both parties look similar. Avoiding ambiguity saves time and effort, and yields better results.

* Avoid complex entities. Complex entities can be difficult to pick out precisely from text; consider breaking them down into multiple entities.

    For example, extracting "Address" would be challenging if it's not broken down into smaller entities. There are so many variations in how addresses appear that it would take a large number of labeled entities to teach the model to extract an address as a whole. However, if you replace "Address" with "Street Name", "PO Box", "City", "State", and "Zip", the model will require fewer labels per entity.

## Add entities

To add entities to your project:

1. Move to the **Entities** pivot from the top of the page.

2. [Text Analytics for health entities](../../text-analytics-for-health/concepts/health-entity-categories.md) are automatically loaded into your project. To add additional entity categories, select **Add** from the top menu. You will be prompted to enter a name to finish creating the entity.

3. After creating an entity, you'll be routed to the entity details page, where you can define the composition settings for this entity.

4. Entities are defined by [entity components](../concepts/entity-components.md): learned, list, or prebuilt. Text Analytics for health entities are populated with the prebuilt component by default and cannot have learned components. Your newly defined entities can be populated with the learned component once you add labels for them in your data, but cannot be populated with the prebuilt component.

5. You can add a [list](../concepts/entity-components.md#list-component) component to any of your entities.

### Add list component

To add a **list** component, select **Add new list**. You can add multiple lists to each entity.

1. To create a new list, in the *Enter value* text box enter the normalized value that will be returned when any of the synonym values is extracted.

2. For multilingual projects, from the *language* drop-down menu, select the language of the synonyms list and start typing in your synonyms, hitting Enter after each one. It is recommended to have synonym lists in multiple languages.

<!--:::image type="content" source="../media/add-list-component.png" alt-text="A screenshot showing a list component in Language Studio." lightbox="../media/add-list-component.png":::-->

### Define entity options

Change to the **Entity options** pivot in the entity details page. When multiple components are defined for an entity, their predictions may overlap. When an overlap occurs, each entity's final prediction is determined based on the [entity option](../concepts/entity-components.md#entity-options) you select in this step. Select the one that you want to apply to this entity, and select the **Save** button at the top.

<!--:::image type="content" source="../media/entity-options.png" alt-text="A screenshot showing an entity option in Language Studio." lightbox="../media/entity-options.png":::-->

After you create your entities, you can come back and edit them. You can **Edit entity components** or **delete** them by selecting the relevant option from the top menu.

## Data selection

The quality of the data you train your model with greatly affects model performance.

* Use real-life data that reflects your domain's problem space to effectively train your model. You can use synthetic data to accelerate the initial model training process, but it will likely differ from your real-life data and make your model less effective when used.

* Balance your data distribution as much as possible without deviating far from the real-life distribution. For example, if you are training your model to extract entities from legal documents that may come in many different formats and languages, you should provide examples that exemplify the diversity you would expect to see in real life.

* Use diverse data whenever possible to avoid overfitting your model. Less diversity in training data may lead to your model learning spurious correlations that may not exist in real-life data.

* Avoid duplicate documents in your data. Duplicate data has a negative effect on the training process, model metrics, and model performance.

* Consider where your data comes from. If you are collecting data from one person, department, or part of your scenario, you are likely missing diversity that may be important for your model to learn about.

> [!NOTE]
> If your documents are in multiple languages, select the **enable multi-lingual** option during [project creation](../quickstart.md) and set the **language** option to the language of the majority of your documents.

## Data preparation

As a prerequisite for creating a project, your training data needs to be uploaded to a blob container in your storage account. You can create and upload training documents from Azure directly, or by using the Azure Storage Explorer tool. Using the Azure Storage Explorer tool allows you to upload more data quickly.

* [Create and upload documents from Azure](../../../../storage/blobs/storage-quickstart-blobs-portal.md#create-a-container)
* [Create and upload documents using Azure Storage Explorer](../../../../vs-azure-tools-storage-explorer-blobs.md)

You can only use `.txt` documents. If your data is in another format, you can use the [CLUtils parse command](https://github.com/microsoft/CognitiveServicesLanguageUtilities/blob/main/CustomTextAnalytics.CLUtils/Solution/CogSLanguageUtilities.ViewLayer.CliCommands/Commands/ParseCommand/README.md) to change your document format.

You can upload an annotated dataset, or you can upload an unannotated one and [label your data](../how-to/label-data.md) in Language Studio.
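CLUtils handles conversion from richer formats. For files that are already plain text but carry a different extension (for example `.md` or `.csv`), a small hypothetical helper like the following — not part of the service or CLUtils — can copy them to `.txt` before upload:

```python
# Illustrative helper (not part of the service or CLUtils): copy plain-text
# files with other extensions to .txt so they can be uploaded to the blob
# container used by your project.
from pathlib import Path

def copy_as_txt(source_dir, target_dir):
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    converted = []
    for path in Path(source_dir).glob("*"):
        if path.is_file() and path.suffix != ".txt":
            out = target / (path.stem + ".txt")
            # Re-write the content; assumes the file is UTF-8 text already.
            out.write_text(path.read_text(encoding="utf-8"), encoding="utf-8")
            converted.append(out.name)
    return sorted(converted)
```

Binary formats such as PDF or Word documents need a real parser (such as CLUtils); this sketch only renames text-like files.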

## Test set

When defining the testing set, make sure to include example documents that are not present in the training set. Defining the testing set is an important step in calculating the [model performance](view-model-evaluation.md#model-details). Also, make sure that the testing set includes documents that represent all entities used in your project.

## Next steps

If you haven't already, create a custom Text Analytics for health project. If it's your first time using custom Text Analytics for health, consider following the [quickstart](../quickstart.md) to create an example project. You can also see the [how-to article](../how-to/create-project.md) for more details on what you need to create a project.
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
---
title: Evaluate a Custom Text Analytics for health model
titleSuffix: Azure Cognitive Services
description: Learn how to evaluate and score your Custom Text Analytics for health model
services: cognitive-services
author: aahill
manager: nitinme
ms.service: cognitive-services
ms.subservice: language-service
ms.topic: how-to
ms.date: 02/28/2023
ms.author: aahi
ms.custom: language-service-custom-ta4h
---

# View a custom text analytics for health model's evaluation and details

After your model has finished training, you can view the model performance and see the extracted entities for the documents in the test set.

> [!NOTE]
> Using the **Automatically split the testing set from training data** option may result in different model evaluation results every time you train a new model, as the test set is selected randomly from the data. To make sure that the evaluation is calculated on the same test set every time you train a model, make sure to use the **Use a manual split of training and testing data** option when starting a training job, and define your **Test** documents when [labeling data](label-data.md).

## Prerequisites

Before viewing model evaluation, you need:

* A successfully [created project](create-project.md) with a configured Azure blob storage account.
* Text data that [has been uploaded](design-schema.md#data-preparation) to your storage account.
* [Labeled data](label-data.md)
<!--* A [successfully trained model](train-model.md)-->

## Model details

There are several metrics you can use to evaluate your model. See the [performance metrics](../concepts/evaluation-metrics.md) article for more information on the model details described in this article.

### [Language studio](#tab/language-studio)

[!INCLUDE [View model evaluation using Language Studio](../../includes/custom/model-evaluation-language-studio.md)]

### [REST APIs](#tab/rest-api)

[!INCLUDE [Model evaluation](../includes/rest-api/model-evaluation.md)]

---

## Load or export model data

### [Language studio](#tab/language-studio)

[!INCLUDE [Load export model](../../includes/custom/load-export-model-language-studio.md)]

### [REST APIs](#tab/rest-api)

[!INCLUDE [Load export model](../../includes/custom/load-export-model-rest-api.md)]

---

## Delete model

### [Language studio](#tab/language-studio)

[!INCLUDE [Delete model](../../includes/custom/delete-model-language-studio.md)]

### [REST APIs](#tab/rest-api)

[!INCLUDE [Delete model](../../includes/custom/delete-model-rest-api.md)]

---

## Next steps

<!--* [Deploy your model](deploy-model.md)-->
* Learn about the [metrics used in evaluation](../concepts/evaluation-metrics.md).
Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
---
services: cognitive-services
author: aahill
manager: nitinme
ms.service: cognitive-services
ms.subservice: language-service
ms.topic: include
ms.date: 01/27/2022
ms.author: aahi
ms.custom: ignite-fall-2021, event-tier1-build-2022
---

Submit a **GET** request using the following URL and headers to get the trained model's evaluation summary.

### Request URL

```rest
{ENDPOINT}/language/authoring/analyze-text/projects/{PROJECT-NAME}/models/{trainedModelLabel}/evaluation/summary-result?api-version={API-VERSION}
```

|Placeholder |Value | Example |
|---------|---------|---------|
|`{ENDPOINT}` | The endpoint for authenticating your API request. | `https://<your-custom-subdomain>.cognitiveservices.azure.com` |
|`{PROJECT-NAME}` | The name for your project. This value is case-sensitive. | `myProject` |
|`{trainedModelLabel}` | The name for your trained model. This value is case-sensitive. | `Model1` |
|`{API-VERSION}` | The version of the API you're calling. See [Model lifecycle](../../../concepts/model-lifecycle.md#choose-the-model-version-used-on-your-data) to learn more about other available API versions. | `2022-05-01` |

### Headers

Use the following header to authenticate your request.

|Key|Value|
|--|--|
|`Ocp-Apim-Subscription-Key`| The key to your resource. Used for authenticating your API requests.|
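Putting the URL and header together, a minimal sketch using only Python's standard library might look like the following. The endpoint, key, project, and model values are placeholders to substitute with your own.

```python
# Sketch: calling the evaluation summary endpoint with the standard library.
# Placeholder values are illustrative; substitute your own resource details.
import json
import urllib.request

def build_summary_url(endpoint, project_name, model_label, api_version):
    # Mirrors the request URL documented above.
    return (f"{endpoint}/language/authoring/analyze-text/projects/{project_name}"
            f"/models/{model_label}/evaluation/summary-result?api-version={api_version}")

def get_evaluation_summary(endpoint, key, project_name, model_label,
                           api_version="2022-05-01"):
    req = urllib.request.Request(
        build_summary_url(endpoint, project_name, model_label, api_version),
        headers={"Ocp-Apim-Subscription-Key": key},  # resource key authenticates the call
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `get_evaluation_summary("https://<your-custom-subdomain>.cognitiveservices.azure.com", key, "myProject", "Model1")` returns the JSON shown in the response body section.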

### Response body

Once you send the request, you'll get the following response.

```json
{
  "projectKind": "CustomHealthcare",
  "customEntityRecognitionEvaluation": {
    "confusionMatrix": {
      "additionalProp1": {
        "additionalProp1": {
          "normalizedValue": 0,
          "rawValue": 0
        },
        "additionalProp2": {
          "normalizedValue": 0,
          "rawValue": 0
        },
        "additionalProp3": {
          "normalizedValue": 0,
          "rawValue": 0
        }
      },
      "additionalProp2": {
        "additionalProp1": {
          "normalizedValue": 0,
          "rawValue": 0
        },
        "additionalProp2": {
          "normalizedValue": 0,
          "rawValue": 0
        },
        "additionalProp3": {
          "normalizedValue": 0,
          "rawValue": 0
        }
      },
      "additionalProp3": {
        "additionalProp1": {
          "normalizedValue": 0,
          "rawValue": 0
        },
        "additionalProp2": {
          "normalizedValue": 0,
          "rawValue": 0
        },
        "additionalProp3": {
          "normalizedValue": 0,
          "rawValue": 0
        }
      }
    },
    "entities": {
      "additionalProp1": {
        "f1": 0,
        "precision": 0,
        "recall": 0,
        "truePositivesCount": 0,
        "trueNegativesCount": 0,
        "falsePositivesCount": 0,
        "falseNegativesCount": 0
      },
      "additionalProp2": {
        "f1": 0,
        "precision": 0,
        "recall": 0,
        "truePositivesCount": 0,
        "trueNegativesCount": 0,
        "falsePositivesCount": 0,
        "falseNegativesCount": 0
      },
      "additionalProp3": {
        "f1": 0,
        "precision": 0,
        "recall": 0,
        "truePositivesCount": 0,
        "trueNegativesCount": 0,
        "falsePositivesCount": 0,
        "falseNegativesCount": 0
      }
    },
    "microF1": 0,
    "microPrecision": 0,
    "microRecall": 0,
    "macroF1": 0,
    "macroPrecision": 0,
    "macroRecall": 0
  },
  "evaluationOptions": {
    "kind": "percentage",
    "trainingSplitPercentage": 0,
    "testingSplitPercentage": 0
  }
}
```
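The `micro*` and `macro*` fields in this response follow the standard aggregation definitions: micro-averaging pools the true-positive, false-positive, and false-negative counts across entities before computing a single score, while macro-averaging computes a score per entity and then averages them. A sketch with illustrative counts:

```python
# Sketch: how micro and macro aggregates relate to the per-entity counts in
# the "entities" object above. Entity names and counts are illustrative.
entities = {
    "Medication": {"truePositivesCount": 40, "falsePositivesCount": 3, "falseNegativesCount": 5},
    "Dosage":     {"truePositivesCount": 30, "falsePositivesCount": 5, "falseNegativesCount": 4},
}

def prf(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Micro: pool the counts across entities, then compute once.
tp = sum(e["truePositivesCount"] for e in entities.values())
fp = sum(e["falsePositivesCount"] for e in entities.values())
fn = sum(e["falseNegativesCount"] for e in entities.values())
micro_precision, micro_recall, micro_f1 = prf(tp, fp, fn)

# Macro: compute per entity, then average the scores.
per_entity = [prf(e["truePositivesCount"], e["falsePositivesCount"], e["falseNegativesCount"])
              for e in entities.values()]
macro_precision = sum(p for p, _, _ in per_entity) / len(per_entity)
```

See the [evaluation metrics](../../../concepts/evaluation-metrics.md) concepts article for the service's own description of these metrics.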
