
Commit 7c3ebe8

Merge pull request #234295 from aahill/ta4h-rest-api
how-to articles
2 parents 2fcebdc + 1fcaed6 commit 7c3ebe8

17 files changed

+542
-5
lines changed
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
---
2+
title: Using Azure resources in custom Text Analytics for health
3+
titleSuffix: Azure Cognitive Services
4+
description: Learn about the steps for using Azure resources with custom text analytics for health.
5+
services: cognitive-services
6+
author: aahill
7+
manager: nitinme
8+
ms.service: cognitive-services
9+
ms.subservice: language-service
10+
ms.topic: how-to
11+
ms.date: 06/03/2022
12+
ms.author: aahi
13+
ms.custom: language-service-custom-ta4h, references_regions, ignite-fall-2021, event-tier1-build-2022
14+
---
15+
16+
# How to create a custom Text Analytics for health project
17+
18+
Use this article to learn how to set up the requirements for starting with custom text analytics for health and create a project.
19+
20+
## Prerequisites
21+
22+
Before you start using custom text analytics for health, you need:
23+
24+
* An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services).
25+
26+
## Create a Language resource
27+
28+
Before you start using custom text analytics for health, you'll need an Azure Language resource. We recommend creating your Language resource and connecting a storage account to it in the Azure portal. Creating a resource in the Azure portal lets you create an Azure storage account at the same time, with all of the required permissions preconfigured. You can also read further in the article to learn how to use a pre-existing resource and configure it to work with custom text analytics for health.
29+
30+
You will also need an Azure storage account where you upload the `.txt` documents that are used to train a model to extract entities.
31+
32+
> [!NOTE]
33+
> * You need to have an **owner** role assigned on the resource group to create a Language resource.
34+
> * If you plan to connect a pre-existing storage account, you should have an **owner** role assigned to it.
35+
36+
## Create Language resource and connect storage account
37+
38+
You can create a resource in the following ways:
39+
40+
* The Azure portal
41+
* Language Studio
42+
* PowerShell
43+
44+
> [!Note]
45+
> You shouldn't move the storage account to a different resource group or subscription once it's linked with the Language resource.
46+
47+
[!INCLUDE [create a new resource from the Azure portal](../../includes/custom/resource-creation-azure-portal.md)]
48+
49+
[!INCLUDE [create a new resource from Language Studio](../../includes/custom/resource-creation-language-studio.md)]
50+
51+
[!INCLUDE [create a new resource with Azure PowerShell](../../includes/custom/resource-creation-powershell.md)]
52+
53+
54+
> [!NOTE]
55+
> * Connecting a storage account to your Language resource is irreversible; it can't be disconnected later.
56+
> * You can only connect your Language resource to one storage account.
57+
58+
## Using a pre-existing Language resource
59+
60+
[!INCLUDE [use an existing resource](../includes/use-pre-existing-resource.md)]
61+
62+
## Create a custom Text Analytics for health project
63+
64+
Once your resource and storage container are configured, create a new custom text analytics for health project. A project is a work area for building your custom AI models based on your data. Your project can only be accessed by you and others who have access to the Azure resource being used. If you have labeled data, you can use it to get started by [importing a project](#import-project).
65+
66+
### [Language Studio](#tab/language-studio)
67+
68+
[!INCLUDE [Language Studio project creation](../includes/language-studio/create-project.md)]
69+
70+
### [REST APIs](#tab/rest-api)
71+
72+
[!INCLUDE [REST APIs project creation](../includes/rest-api/create-project.md)]
73+
74+
---
75+
76+
## Import project
77+
78+
If you have already labeled data, you can use it to get started with the service. Make sure that your labeled data follows the [accepted data formats](../concepts/data-formats.md).
79+
80+
### [Language Studio](#tab/language-studio)
81+
82+
[!INCLUDE [Import project](../includes/language-studio/import-project.md)]
83+
84+
### [REST APIs](#tab/rest-api)
85+
86+
[!INCLUDE [Import project](../includes/rest-api/import-project.md)]
87+
88+
---
89+
90+
## Get project details
91+
92+
### [Language Studio](#tab/language-studio)
93+
94+
[!INCLUDE [Language Studio project details](../../includes/custom/project-details.md)]
95+
96+
### [REST APIs](#tab/rest-api)
97+
98+
[!INCLUDE [REST APIs project details](../includes/rest-api/project-details.md)]
99+
100+
---
101+
102+
## Delete project
103+
104+
### [Language Studio](#tab/language-studio)
105+
106+
[!INCLUDE [Delete project using Language studio](../includes/language-studio/delete-project.md)]
107+
108+
### [REST APIs](#tab/rest-api)
109+
110+
[!INCLUDE [Delete project using the REST API](../includes/rest-api/delete-project.md)]
111+
112+
---
113+
114+
## Next steps
115+
116+
<!--* You should have an idea of the [project schema](design-schema.md) you will use to label your data.-->
117+
118+
* After you define your schema, you can start [labeling your data](label-data.md), which will be used for model training, evaluation, and finally making predictions.
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
1+
---
2+
title: How to label your data for custom Text Analytics for health
3+
titleSuffix: Azure Cognitive Services
4+
description: Learn how to label your data for use with custom Text Analytics for health.
5+
services: cognitive-services
6+
author: aahill
7+
manager: nitinme
8+
ms.service: cognitive-services
9+
ms.subservice: language-service
10+
ms.topic: how-to
11+
ms.date: 05/24/2022
12+
ms.author: aahi
13+
ms.custom: language-service-custom-ner, ignite-fall-2021, event-tier1-build-2022
14+
---
15+
16+
# Label your data using the Language Studio
17+
18+
Data labeling is a crucial step in the development lifecycle. In this step, you label your documents with the new entities you defined in your schema to populate their learned components. This data will be used in the next step when training your model, so that your model can learn from the labeled data which entities to extract. If you already have labeled data, you can directly [import](create-project.md#import-project) it into your project, but you need to make sure that your data follows the [accepted data format](../concepts/data-formats.md). See [create project](create-project.md#import-project) to learn more about importing labeled data into your project. If your data isn't labeled already, you can label it in the [Language Studio](https://aka.ms/languageStudio).
19+
20+
## Prerequisites
21+
22+
Before you can label your data, you need:
23+
24+
* A successfully [created project](create-project.md) with a configured Azure blob storage account
25+
<!--* Text data that [has been uploaded](design-schema.md#data-preparation) to your storage account.-->
26+
27+
See the [project development lifecycle](../overview.md#project-development-lifecycle) for more information.
28+
29+
## Data labeling guidelines
30+
31+
After preparing your data, designing your schema and creating your project, you will need to label your data. Labeling your data is important so your model knows which words will be associated with the entity types you need to extract. When you label your data in [Language Studio](https://aka.ms/languageStudio) (or import labeled data), these labels are stored in the JSON document in your storage container that you have connected to this project.
32+
33+
As you label your data, keep in mind:
34+
35+
* You can't add labels for Text Analytics for health entities as they're pretrained prebuilt entities. You can only add labels to new entity categories that you defined during schema definition.
36+
37+
<!--If you want to improve the recall for a prebuilt entity, you can extend it by adding a list component while you are [defining your schema](design-schema.md).-->
38+
39+
* In general, more labeled data leads to better results, provided the data is labeled accurately.
40+
41+
* The precision, consistency, and completeness of your labeled data are key factors in determining model performance.
42+
43+
* **Label precisely**: Always label each entity with its correct type. Include only what you want extracted, and avoid unnecessary data in your labels.
44+
* **Label consistently**: The same entity should have the same label across all the documents.
45+
* **Label completely**: Label all the instances of the entity in all your documents.
46+
47+
> [!NOTE]
48+
> There is no fixed number of labels that can guarantee your model will perform the best. Model performance is dependent on possible ambiguity in your schema, and the quality of your labeled data. Nevertheless, we recommend having around 50 labeled instances per entity type.
49+
50+
## Label your data
51+
52+
Use the following steps to label your data:
53+
54+
1. Go to your project page in [Language Studio](https://aka.ms/languageStudio).
55+
56+
2. From the left side menu, select **Data labeling**. You can find a list of all documents in your storage container.
57+
58+
<!--:::image type="content" source="../media/tagging-files-view.png" alt-text="A screenshot showing the Language Studio screen for labeling data." lightbox="../media/tagging-files-view.png":::-->
59+
60+
>[!TIP]
61+
> You can use the filters in the top menu to view the unlabeled documents so that you can start labeling them.
62+
> You can also use the filters to view the documents that are labeled with a specific entity type.
63+
64+
3. Change to a single document view from the left side in the top menu or select a specific document to start labeling. You can find a list of all `.txt` documents available in your project to the left. You can use the **Back** and **Next** buttons at the bottom of the page to navigate through your documents.
65+
66+
> [!NOTE]
67+
> If you enabled multiple languages for your project, you will find a **Language** dropdown in the top menu, which lets you select the language of each document. Hebrew is not supported with multi-lingual projects.
68+
69+
4. In the right side pane, you can use the **Add entity type** button to add additional entities to your project that you missed during schema definition.
70+
71+
<!--:::image type="content" source="../media/tag-1.png" alt-text="A screenshot showing complete data labeling." lightbox="../media/tag-1.png":::-->
72+
73+
5. You have two options to label your document:
74+
75+
|Option |Description |
76+
|---------|---------|
77+
|Label using a brush | Select the brush icon next to an entity type in the right pane, then highlight the text in the document you want to annotate with this entity type. |
78+
|Label using a menu | Highlight the word you want to label as an entity, and a menu will appear. Select the entity type you want to assign for this entity. |
79+
80+
The below screenshot shows labeling using a brush.
81+
82+
:::image type="content" source="../media/tag-options.png" alt-text="A screenshot showing the labeling options offered in Custom NER." lightbox="../media/tag-options.png":::
83+
84+
6. In the right side pane under the **Labels** pivot, you can find all the entity types in your project and the count of labeled instances for each. The prebuilt entities are shown for reference, but you can't label them because they're pretrained.
85+
86+
7. In the bottom section of the right side pane you can add the current document you are viewing to the training set or the testing set. By default all the documents are added to your training set. <!--Learn more about [training and testing sets](train-model.md#data-splitting) and how they are used for model training and evaluation.-->
87+
88+
> [!TIP]
89+
> If you are planning on using **Automatic** data splitting, use the default option of assigning all the documents into your training set.
90+
91+
8. Under the **Distribution** pivot, you can view the distribution across training and testing sets. You have two options for viewing:
92+
* *Total instances*, where you can view the count of all labeled instances of a specific entity type.
93+
* *Documents with at least one label*, where each document is counted if it contains at least one labeled instance of this entity.
94+
95+
9. While you're labeling, your changes are synced periodically. If they haven't been saved yet, you'll find a warning at the top of your page. To save manually, select the **Save labels** button at the bottom of the page.
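
The two views under the **Distribution** pivot can be illustrated with a short Python sketch. It assumes a simplified, hypothetical in-memory structure (document name mapped to a list of entity type and text pairs); this is not the JSON labels format stored in your container, and the entity types `Facility` and `Insurer` are invented for the example.

```python
from collections import Counter

# Hypothetical labels: document name -> list of (entity_type, labeled_text) pairs.
labels = {
    "doc1.txt": [("Facility", "ward 3"), ("Facility", "outpatient clinic")],
    "doc2.txt": [("Facility", "ICU"), ("Insurer", "Contoso Health")],
    "doc3.txt": [],
}


def label_distribution(labels):
    """Return (total instances per type, documents with at least one label per type)."""
    total_instances = Counter()
    documents_with_label = Counter()
    for entities in labels.values():
        types = [entity_type for entity_type, _ in entities]
        total_instances.update(types)            # counts every labeled instance
        documents_with_label.update(set(types))  # counts each document once per type
    return total_instances, documents_with_label


total_instances, documents_with_label = label_distribution(labels)
print(dict(total_instances))
print(dict(documents_with_label))
```

A count like this could also be compared against the rough guideline of around 50 labeled instances per entity type mentioned earlier.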
96+
97+
## Remove labels
98+
99+
To remove a label:
100+
101+
1. Select the entity you want to remove a label from.
102+
2. Scroll through the menu that appears, and select **Remove label**.
103+
104+
## Delete entities
105+
106+
You cannot delete any of the Text Analytics for health pretrained entities because they have a prebuilt component. You are only permitted to delete newly defined entity categories. To delete an entity, select the delete icon next to the entity you want to remove. Deleting an entity removes all its labeled instances from your dataset.
107+
108+
## Next steps
109+
110+
[Custom text analytics for health overview](../overview.md)
111+
112+
<!--After you've labeled your data, you can begin [training a model](train-model.md) that will learn based on your data.-->

articles/cognitive-services/language-service/custom-text-analytics-for-health/includes/language-studio/delete-project.md

Lines changed: 4 additions & 1 deletion
@@ -10,4 +10,7 @@ ms.author: aahi
1010
ms.custom: language-service-custom-classification, event-tier1-build-2022
1111
---
1212

13-
When you don't need your project anymore, you can delete your project using [Language Studio](https://aka.ms/custom-extraction). Select **Custom named entity recognition (NER)** from the top, select the project you want to delete, and then select **Delete** from the top menu.
13+
When you don't need your project anymore, you can delete your project using [Language Studio](https://aka.ms/custom-extraction).
14+
1. Select the Language service feature you're using at the top of the page.
15+
1. Select the project you want to delete.
16+
1. Select **Delete** from the top menu.
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
---
2+
titleSuffix: Azure Cognitive Services
3+
services: cognitive-services
4+
author: aahill
5+
manager: nitinme
6+
ms.service: cognitive-services
7+
ms.subservice: language-service
8+
ms.custom: event-tier1-build-2022
9+
ms.topic: include
10+
ms.date: 05/06/2022
11+
ms.author: aahi
12+
---
13+
14+
1. Sign into the [Language Studio](https://aka.ms/languageStudio). A window will appear to let you select your subscription and Language resource. Select your Language resource.
15+
16+
2. Under the **Extract information** section of Language Studio, select **Custom named entity recognition**.
17+
18+
<!--:::image type="content" source="../../media/select-custom-ner.png" alt-text="A screenshot showing the location of the custom NER feature in the Language Studio landing page." lightbox="../../media/select-custom-ner.png":::-->
19+
20+
21+
3. Select **Create new project** from the top menu in your projects page. Creating a project will let you tag data, train, evaluate, improve, and deploy your models.
22+
23+
<!--:::image type="content" source="../../media/create-project.png" alt-text="A screenshot of the project creation page." lightbox="../../media/create-project.png":::-->
24+
25+
26+
4. After you select **Create new project**, a screen will appear to let you connect your storage account. If you can’t find your storage account, make sure you created a resource using the recommended steps. If you've already connected a storage account to your Language resource, you will see your storage account connected.
27+
28+
>[!NOTE]
29+
> * You only need to do this step once for each new Language resource you use.
30+
> * This process is irreversible. If you connect a storage account to your Language resource, you can't disconnect it later.
31+
> * You can only connect your Language resource to one storage account.
32+
33+
:::image type="content" source="../../media/connect-storage.png" alt-text="A screenshot of the storage connection screen for new projects." lightbox="../../media/connect-storage.png":::
34+
35+
5. Enter the project information, including a name, description, and the language of the files in your project. You won’t be able to change the name of your project later. Select **Next**.
36+
37+
>[!TIP]
38+
> Your dataset doesn't have to be entirely in the same language. You can have multiple documents, each in a different supported language. If your dataset contains documents in different languages, or if you expect text in different languages at runtime, select the **enable multi-lingual dataset** option when you enter the basic information for your project. This option can also be enabled later from the **Project settings** page.
39+
40+
6. Select the container where you have uploaded your dataset.
41+
42+
7. Click on **Yes, my files are already labeled and I have formatted JSON labels file** and select the labels file from the drop-down menu below to import your JSON labels file. Make sure it follows the [supported format](../../concepts/data-formats.md).
43+
44+
8. Click **Next**.
45+
46+
9. Review the data you entered and select **Create Project**.

articles/cognitive-services/language-service/custom-text-analytics-for-health/includes/rest-api/create-project.md

Lines changed: 2 additions & 2 deletions
@@ -57,9 +57,9 @@ Use the following JSON in your request. Replace the placeholder values below wit
5757
|Key |Placeholder|Value | Example |
5858
|---------|---------|---------|--|
5959
| projectName | `{PROJECT-NAME}` | The name of your project. This value is case-sensitive. | `myProject` |
60-
| language | `{LANGUAGE-CODE}` | A string specifying the language code for the documents used in your project. If your project is a multilingual project, choose the language code of the majority of the documents. See [language support](../../language-support.md) to learn more about supported language codes. |`en-us`|
60+
| language | `{LANGUAGE-CODE}` | A string specifying the language code for the documents used in your project. If your project is a multilingual project, choose the language code of the majority of the documents. <!--See [language support](../../language-support.md) to learn more about supported language codes.--> |`en-us`|
6161
| projectKind | `CustomHealthcare` | Your project kind. | `CustomHealthcare` |
62-
| multilingual | `true`| A boolean value that enables you to have documents in multiple languages in your dataset and when your model is deployed you can query the model in any supported language (not necessarily included in your training documents. See [language support](../../language-support.md) to learn more about multilingual support. | `true`|
62+
| multilingual | `true`| A boolean value that lets you have documents in multiple languages in your dataset. When your model is deployed, you can query it in any supported language (not necessarily one included in your training documents). <!--See [language support](../../language-support.md) to learn more about multilingual support.--> | `true`|
6363
| storageInputContainerName | `{CONTAINER-NAME}` | The name of your Azure storage container where you have uploaded your documents. | `myContainer` |
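
As a minimal sketch, the keys in the table above could be assembled into a request body like this in Python. The helper name `build_create_project_body` and the example values are assumptions for illustration; only the key names and the `CustomHealthcare` project kind come from the table, and the request URL, headers, and any additional body fields are defined elsewhere in the include file.

```python
import json


def build_create_project_body(project_name, language, container_name, multilingual=True):
    """Assemble the JSON body keys listed in the table (hypothetical helper)."""
    if not project_name:
        raise ValueError("projectName must not be empty")
    return {
        "projectName": project_name,  # case-sensitive
        "language": language,  # language code of the majority of the documents
        "projectKind": "CustomHealthcare",
        "multilingual": bool(multilingual),
        "storageInputContainerName": container_name,
    }


body = build_create_project_body("myProject", "en-us", "myContainer")
print(json.dumps(body, indent=2))
```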
6464

6565

articles/cognitive-services/language-service/custom-text-analytics-for-health/includes/rest-api/project-details.md

Lines changed: 2 additions & 2 deletions
@@ -53,7 +53,7 @@ Use the following header to authenticate your request.
5353
| `projectKind` | `CustomHealthcare` | Your project kind. | `CustomHealthcare` |
5454
| `storageInputContainerName` | `{CONTAINER-NAME}` | The name of your Azure storage container where you have uploaded your documents. | `myContainer` |
5555
| `projectName` | `{PROJECT-NAME}` | The name of your project. This value is case-sensitive. | `myProject` |
56-
| `multilingual` | `true`| A boolean value that enables you to have documents in multiple languages in your dataset and when your model is deployed you can query the model in any supported language (not necessarily included in your training documents. For more information about multilingual support, see [Language support](../../language-support.md). | `true`|
57-
| `language` | `{LANGUAGE-CODE}` | A string specifying the language code for the documents used in your project. If your project is a multilingual project, choose the [language code](../../language-support.md) of the majority of the documents. |`en`|
56+
| `multilingual` | `true`| A boolean value that lets you have documents in multiple languages in your dataset. When your model is deployed, you can query it in any supported language (not necessarily one included in your training documents). <!--For more information about multilingual support, see [Language support](../../language-support.md).--> | `true`|
57+
| `language` | `{LANGUAGE-CODE}` | A string specifying the language code for the documents used in your project. If your project is a multilingual project, choose the language code of the majority of the documents. |`en`|
5858

5959
Once you send your API request, you will receive a `200` response indicating success and a JSON response body with your project details.
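
Handling that response might look like the following Python sketch. It assumes you already have the status code and the raw response text from whichever HTTP client you use; the function name `parse_project_details` and the sample body are hypothetical, and the sample contains only the keys listed in the table above, so a real response may include more fields.

```python
import json


def parse_project_details(status_code, body_text):
    """Return selected project details, or raise if the call did not return 200."""
    if status_code != 200:
        raise RuntimeError(f"Unexpected status code: {status_code}")
    details = json.loads(body_text)
    keys = (
        "projectName",
        "projectKind",
        "language",
        "multilingual",
        "storageInputContainerName",
    )
    # Missing keys come back as None rather than raising.
    return {key: details.get(key) for key in keys}


# Hypothetical response body, built only from the keys in the table.
sample_body = json.dumps(
    {
        "projectName": "myProject",
        "projectKind": "CustomHealthcare",
        "language": "en",
        "multilingual": True,
        "storageInputContainerName": "myContainer",
    }
)

project = parse_project_details(200, sample_body)
print(project["projectName"], project["language"])
```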
