Commit d9d5ef5

Merge pull request #236293 from aahill/revert-236178-aml-rollback
Adding AML labelling articles
2 parents 102cda0 + b384b8e commit d9d5ef5

File tree

4 files changed

+234
-10
lines changed

articles/cognitive-services/.openpublishing.redirection.cognitive-services.json

Lines changed: 0 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,5 @@
11
{
22
"redirections": [
3-
{
4-
"source_path_from_root": "/articles/cognitive-services/language-service/custom-named-entity-recognition/how-to/use-autolabeling.md",
5-
"redirect_url": "/azure/cognitive-services/language-service/custom-named-entity-recognition/tag-data",
6-
"redirect_document_id": false
7-
},
8-
{
9-
"source_path_from_root": "/articles/cognitive-services/language-service/custom-text-classification/how-to/use-autolabeling.md",
10-
"redirect_url": "/azure/cognitive-services/language-service/custom-text-classification/tag-data",
11-
"redirect_document_id": false
12-
},
133
{
144
"source_path_from_root": "/articles/cognitive-services/language-service/custom-text-analytics-for-health/overview.md",
155
"redirect_url": "/azure/cognitive-services/language-service/overview",
Lines changed: 142 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,142 @@
1+
---
2+
title: How to use autolabeling in custom named entity recognition
3+
titleSuffix: Azure Cognitive Services
4+
description: Learn how to use autolabeling in custom named entity recognition.
5+
services: cognitive-services
6+
author: aahill
7+
manager: nitinme
8+
ms.service: cognitive-services
9+
ms.subservice: language-service
10+
ms.custom: event-tier1-build-2022
11+
ms.topic: how-to
12+
ms.date: 03/20/2023
13+
ms.author: aahi
14+
---
15+
16+
# How to use autolabeling for Custom Named Entity Recognition
17+
18+
The [labeling process](tag-data.md) is an important part of preparing your dataset. Because labeling takes both time and effort, you can use the autolabeling feature to label your entities automatically. You can start autolabeling jobs based on a model you've previously trained or by using GPT models. With a previously trained model, you label a few of your documents, train a model, and then create an autolabeling job that produces entity labels for the other documents based on that model. With GPT, you can trigger an autolabeling job immediately, without any prior model training. Either way, the feature can save you the time and effort of manually labeling your entities.
19+
20+
## Prerequisites
21+
22+
### [Autolabel based on a model you've trained](#tab/autolabel-model)
23+
24+
Before you can use autolabeling based on a model you've trained, you need:
25+
* A successfully [created project](create-project.md) with a configured Azure blob storage account.
26+
* Text data that [has been uploaded](design-schema.md#data-preparation) to your storage account.
27+
* [Labeled data](tag-data.md)
28+
* A [successfully trained model](train-model.md)
29+
30+
31+
### [Autolabel with GPT](#tab/autolabel-gpt)
32+
Before you can use autolabeling with GPT, you need:
33+
* A successfully [created project](create-project.md) with a configured Azure blob storage account.
34+
* Text data that [has been uploaded](design-schema.md#data-preparation) to your storage account.
35+
* Entity names that are meaningful. The GPT models label entities in your documents based on the name of the entity you've provided.
36+
* [Labeled data](tag-data.md) isn't required.
37+
* An Azure OpenAI [resource and deployment](../../../openai/how-to/create-resource.md).
38+
39+
---
40+
41+
## Trigger an autolabeling job
42+
43+
### [Autolabel based on a model you've trained](#tab/autolabel-model)
44+
45+
When you trigger an autolabeling job based on a model you've trained, there's a limit of 5,000 text records per month, per resource. This means the same limit applies across all projects within the same resource.
46+
47+
> [!TIP]
48+
> A text record is calculated as the ceiling of (Number of characters in a document / 1,000). For example, if a document has 8921 characters, the number of text records is:
49+
>
50+
> `ceil(8921/1000) = ceil(8.921)`, which is 9 text records.
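
The text record calculation in the tip can be sketched in Python, for example to estimate how much of the monthly limit a batch of documents would use before you trigger a job. This is a rough illustration only: `text_records` and `fits_monthly_limit` are hypothetical helper names, not part of any Azure SDK, and counting characters with `len()` is an assumption about how the service counts them.

```python
import math

CHARS_PER_TEXT_RECORD = 1_000  # from the tip: one text record per 1,000 characters
MONTHLY_RECORD_LIMIT = 5_000   # per-resource monthly limit stated in this article


def text_records(document: str) -> int:
    """Hypothetical helper: ceiling of (number of characters / 1,000)."""
    return math.ceil(len(document) / CHARS_PER_TEXT_RECORD)


def fits_monthly_limit(documents: list[str], already_used: int = 0) -> bool:
    """Hypothetical helper: True if the batch stays within the monthly limit.

    `already_used` is the number of text records consumed so far this month
    across all projects in the same resource (an assumed input).
    """
    total = sum(text_records(doc) for doc in documents)
    return already_used + total <= MONTHLY_RECORD_LIMIT


# The example from the tip: a document with 8921 characters.
print(text_records("a" * 8921))  # 9
```

Because the limit applies per resource, `already_used` should include text records consumed by every project that shares the resource.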
51+
52+
1. From the left navigation menu, select **Data labeling**.
53+
2. Select the **Autolabel** button under the Activity pane to the right of the page.
54+
55+
56+
:::image type="content" source="../media/trigger-autotag.png" alt-text="A screenshot showing how to trigger an autotag job." lightbox="../media/trigger-autotag.png":::
57+
58+
3. Choose **Autolabel based on a model you've trained** and select **Next**.
59+
60+
:::image type="content" source="../media/choose-models.png" alt-text="A screenshot showing model choice for auto labeling." lightbox="../media/choose-models.png":::
61+
62+
4. Choose a trained model. It's recommended to check the model performance before using it for autolabeling.
63+
64+
:::image type="content" source="../media/choose-model-trained.png" alt-text="A screenshot showing how to choose trained model for autotagging." lightbox="../media/choose-model-trained.png":::
65+
66+
5. Choose the entities you want to be included in the autolabeling job. By default, all entities are selected. You can see the total labels, precision and recall of each entity. It's recommended to include entities that perform well to ensure the quality of the automatically labeled entities.
67+
68+
:::image type="content" source="../media/choose-entities.png" alt-text="A screenshot showing which entities to be included in autotag job." lightbox="../media/choose-entities.png":::
69+
70+
6. Choose the documents you want to be automatically labeled. The number of text records in each document is displayed. When you select one or more documents, you should see the number of text records selected. It's recommended to choose the unlabeled documents from the filter.
71+
72+
> [!NOTE]
73+
> * If an entity was automatically labeled, but has a user defined label, only the user defined label is used and visible.
74+
> * You can view the documents by clicking on the document name.
75+
76+
:::image type="content" source="../media/choose-files.png" alt-text="A screenshot showing which documents to be included in the autotag job." lightbox="../media/choose-files.png":::
77+
78+
7. Select **Autolabel** to trigger the autolabeling job.
79+
You should see the model used, number of documents included in the autolabeling job, number of text records and entities to be automatically labeled. Autolabeling jobs can take anywhere from a few seconds to a few minutes, depending on the number of documents you included.
80+
81+
:::image type="content" source="../media/review-autotag.png" alt-text="A screenshot showing the review screen for an autotag job." lightbox="../media/review-autotag.png":::
82+
83+
### [Autolabel with GPT](#tab/autolabel-gpt)
84+
85+
When you trigger an autolabeling job with GPT, charges are billed to your Azure OpenAI resource based on your consumption. The charge is based on an estimate of the number of tokens in each document being autolabeled. Refer to the [Azure OpenAI pricing page](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for a detailed breakdown of pricing per token for different models.
86+
87+
1. From the left navigation menu, select **Data labeling**.
88+
2. Select the **Autolabel** button under the Activity pane to the right of the page.
89+
90+
:::image type="content" source="../media/trigger-autotag.png" alt-text="A screenshot showing how to trigger an autotag job from the activity pane." lightbox="../media/trigger-autotag.png":::
91+
92+
4. Choose **Autolabel with GPT** and select **Next**.
93+
94+
:::image type="content" source="../media/choose-models.png" alt-text="A screenshot showing model choice for auto labeling." lightbox="../media/choose-models.png":::
95+
96+
5. Choose your Azure OpenAI resource and deployment. You must [create an Azure OpenAI resource and deploy a model](../../../openai/how-to/create-resource.md) in order to proceed.
97+
98+
:::image type="content" source="../media/autotag-choose-open-ai.png" alt-text="A screenshot showing how to choose OpenAI resource and deployments" lightbox="../media/autotag-choose-open-ai.png":::
99+
100+
6. Choose the entities you want to be included in the autolabeling job. By default, all entities are selected. To get good-quality labeling with GPT, it's recommended to use descriptive label names and to include examples for each label.
101+
102+
:::image type="content" source="../media/choose-entities.png" alt-text="A screenshot showing which entities to be included in autotag job." lightbox="../media/choose-entities.png":::
103+
104+
7. Choose the documents you want to be automatically labeled. It's recommended to choose the unlabeled documents from the filter.
105+
106+
> [!NOTE]
107+
> * If an entity was automatically labeled, but has a user defined label, only the user defined label is used and visible.
108+
> * You can view the documents by clicking on the document name.
109+
110+
:::image type="content" source="../media/choose-files.png" alt-text="A screenshot showing which documents to be included in the autotag job." lightbox="../media/choose-files.png":::
111+
112+
8. Select **Start job** to trigger the autolabeling job.
113+
You should be directed to the autolabeling page displaying the autolabeling jobs initiated. Autolabeling jobs can take anywhere from a few seconds to a few minutes, depending on the number of documents you included.
114+
115+
:::image type="content" source="../media/review-autotag.png" alt-text="A screenshot showing the review screen for an autotag job." lightbox="../media/review-autotag.png":::
116+
117+
118+
---
119+
120+
## Review the auto labeled documents
121+
122+
When the autolabeling job is complete, you can see the output documents in the **Data labeling** page of Language Studio. Select **Review documents with autolabels** to view the documents with the **Auto labeled** filter applied.
123+
124+
:::image type="content" source="../media/open-autotag-files.png" alt-text="A screenshot showing the autolabeled documents" lightbox="../media/open-autotag-files.png":::
125+
126+
Entities that have been automatically labeled appear with a dotted line. These entities have two selectors (a checkmark and an "X") that allow you to accept or reject the automatic label.
127+
128+
Once an entity is accepted, the dotted line changes to a solid one, and the label becomes a user defined label that is included in any further model training.
129+
130+
Alternatively, you can accept or reject all automatically labeled entities within the document, using **Accept all** or **Reject all** in the top right corner of the screen.
131+
132+
After you accept or reject the labeled entities, select **Save labels** to apply the changes.
133+
134+
> [!NOTE]
135+
> * We recommend validating automatically labeled entities before accepting them.
136+
> * All labels that were not accepted are deleted when you train your model.
137+
138+
:::image type="content" source="../media/accept-reject-entities.png" alt-text="A screenshot showing how to accept and reject autolabeled entities." lightbox="../media/accept-reject-entities.png":::
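
The review rules in this section can be summarized in a short Python sketch. Treat it as a mental model rather than a description of how Language Studio works internally: the `Label` type, its fields, and `labels_used_for_training` are hypothetical names introduced for illustration, and the sketch assumes each automatic label is simply either accepted or not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical representation of one labeled entity."""
    entity: str
    text: str
    auto: bool = False      # True if the label came from an autolabeling job
    accepted: bool = False  # only meaningful for automatic labels


def labels_used_for_training(labels: list[Label]) -> list[Label]:
    """Apply the review rules described in this section (illustrative only)."""
    kept = []
    for label in labels:
        if not label.auto:
            # User defined labels are always kept.
            kept.append(label)
        elif label.accepted:
            # An accepted automatic label becomes a user defined label.
            kept.append(Label(label.entity, label.text))
        # Automatic labels that were not accepted are dropped at training time.
    return kept
```

For example, given one user defined label, one accepted automatic label, and one automatic label that was never reviewed, only the first two are returned, both as user defined labels.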
139+
140+
## Next steps
141+
142+
* Learn more about [labeling your data](tag-data.md).
Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,88 @@
1+
---
2+
title: How to use autolabeling in custom text classification
3+
titleSuffix: Azure Cognitive Services
4+
description: Learn how to use autolabeling in custom text classification.
5+
services: cognitive-services
6+
author: aahill
7+
manager: nitinme
8+
ms.service: cognitive-services
9+
ms.subservice: language-service
10+
ms.custom: event-tier1-build-2022
11+
ms.topic: how-to
12+
ms.date: 3/15/2023
13+
ms.author: aahi
14+
---
15+
16+
# How to use autolabeling for Custom Text Classification
17+
18+
The [labeling process](tag-data.md) is an important part of preparing your dataset. Because labeling takes a lot of time and effort, you can use the autolabeling feature to automatically label your documents with the classes you want to categorize them into. Currently, you can start autolabeling jobs that use GPT models, which lets you trigger an autolabeling job immediately, without any prior model training. This feature can save you the time and effort of manually labeling your documents.
19+
20+
## Prerequisites
21+
22+
Before you can use autolabeling with GPT, you need:
23+
* A successfully [created project](create-project.md) with a configured Azure blob storage account.
24+
* Text data that [has been uploaded](design-schema.md#data-preparation) to your storage account.
25+
* Class names that are meaningful. The GPT models label documents based on the names of the classes you've provided.
26+
* [Labeled data](tag-data.md) isn't required.
27+
* An Azure OpenAI [resource and deployment](../../../openai/how-to/create-resource.md).
28+
29+
---
30+
31+
## Trigger an autolabeling job
32+
33+
When you trigger an autolabeling job with GPT, charges are billed to your Azure OpenAI resource based on your consumption. The charge is based on an estimate of the number of tokens in each document being autolabeled. Refer to the [Azure OpenAI pricing page](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for a detailed breakdown of pricing per token for different models.
34+
35+
1. From the left navigation menu, select **Data labeling**.
36+
2. Select the **Autolabel** button under the Activity pane to the right of the page.
37+
38+
:::image type="content" source="../media/trigger-autotag.png" alt-text="A screenshot showing how to trigger an autotag job from the activity pane." lightbox="../media/trigger-autotag.png":::
39+
40+
4. Choose **Autolabel with GPT** and select **Next**.
41+
42+
:::image type="content" source="../media/choose-models.png" alt-text="A screenshot showing model choice for auto labeling." lightbox="../media/choose-models.png":::
43+
44+
5. Choose your Azure OpenAI resource and deployment. You must [create an Azure OpenAI resource and deploy a model](../../../openai/how-to/create-resource.md) in order to proceed.
45+
46+
:::image type="content" source="../media/autotag-choose-open-ai.png" alt-text="A screenshot showing how to choose OpenAI resource and deployments" lightbox="../media/autotag-choose-open-ai.png":::
47+
48+
6. Select the classes you want to be included in the autolabeling job. By default, all classes are selected. To get good-quality labeling with GPT, it's recommended to use descriptive class names and to include examples for each class.
49+
50+
:::image type="content" source="../media/choose-classes.png" alt-text="A screenshot showing which labels to be included in autotag job." lightbox="../media/choose-classes.png":::
51+
52+
7. Choose the documents you want to be automatically labeled. It's recommended to choose the unlabeled documents from the filter.
53+
54+
> [!NOTE]
55+
> * If a document was automatically labeled, but this label was already user defined, only the user defined label is used.
56+
> * You can view the documents by clicking on the document name.
57+
58+
:::image type="content" source="../media/choose-files.png" alt-text="A screenshot showing which documents to be included in the autotag job." lightbox="../media/choose-files.png":::
59+
60+
8. Select **Start job** to trigger the autolabeling job.
61+
You should be directed to the autolabeling page displaying the autolabeling jobs initiated. Autolabeling jobs can take anywhere from a few seconds to a few minutes, depending on the number of documents you included.
62+
63+
:::image type="content" source="../media/review-autotag.png" alt-text="A screenshot showing the review screen for an autotag job." lightbox="../media/review-autotag.png":::
64+
65+
66+
---
67+
68+
## Review the auto labeled documents
69+
70+
When the autolabeling job is complete, you can see the output documents in the **Data labeling** page of Language Studio. Select **Review documents with autolabels** to view the documents with the **Auto labeled** filter applied.
71+
72+
:::image type="content" source="../media/open-autotag-files.png" alt-text="A screenshot showing the autolabeled documents" lightbox="../media/open-autotag-files.png":::
73+
74+
Documents that have been automatically classified have suggested labels in the activity pane highlighted in purple. Each suggested label has two selectors (a checkmark and a cancel icon) that allow you to accept or reject the automatic label.
75+
76+
Once a label is accepted, the purple color changes to the default blue, and the label becomes a user defined label that is included in any further model training.
77+
78+
After you accept or reject the labels for the autolabeled documents, select **Save labels** to apply the changes.
79+
80+
> [!NOTE]
81+
> * We recommend validating automatically labeled documents before accepting them.
82+
> * All labels that were not accepted are deleted when you train your model.
83+
84+
:::image type="content" source="../media/accept-reject-labels.png" alt-text="A screenshot showing how to accept and reject autolabeled documents." lightbox="../media/accept-reject-labels.png":::
85+
86+
## Next steps
87+
88+
* Learn more about [labeling your data](tag-data.md).

articles/cognitive-services/language-service/toc.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -98,6 +98,8 @@ items:
9898
displayName: Best practices
9999
- name: Label data
100100
href: custom-text-classification/how-to/tag-data.md
101+
- name: Auto label your data (preview)
102+
href: custom-text-classification/how-to/use-autolabeling.md
101103
- name: Label data with Azure Machine Learning
102104
href: custom/azure-machine-learning-labeling.md
103105
- name: Train a model
@@ -194,6 +196,8 @@ items:
194196
href: custom-named-entity-recognition/how-to/design-schema.md
195197
- name: Label data
196198
href: custom-named-entity-recognition/how-to/tag-data.md
199+
- name: Auto label your data (preview)
200+
href: custom-named-entity-recognition/how-to/use-autolabeling.md
197201
- name: Label data with Azure Machine Learning
198202
href: custom/azure-machine-learning-labeling.md
199203
- name: Train a model
