Commit e8dadc5

Merge pull request #191944 from lgayhardt/azureml-ner-label-0322
Text labeling GA and NER
2 parents b75df3d + 3e6415c commit e8dadc5

7 files changed: +69 -46 lines changed

articles/machine-learning/how-to-create-text-labeling-projects.md

Lines changed: 28 additions & 24 deletions
@@ -7,26 +7,21 @@ ms.author: sgilley
 ms.service: machine-learning
 ms.subservice: mldata
 ms.topic: how-to
-ms.date: 10/21/2021
+ms.date: 03/18/2022
 ms.custom: data4ml, ignite-fall-2021
 ---
 
-# Create a text labeling project and export labels (preview)
+# Create a text labeling project and export labels
 
 Learn how to create and run data labeling projects to label text data in Azure Machine Learning. Specify either a single label or multiple labels to be applied to each text item.
 
 You can also use the data labeling tool to [create an image labeling project](how-to-create-image-labeling-projects.md).
 
-> [!IMPORTANT]
-> Text labeling is currently in public preview.
-> The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
-> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
-
 ## Text labeling capabilities
 
 Azure Machine Learning data labeling is a central place to create, manage, and monitor data labeling projects:
 
-- Coordinate data, labels, and team members to efficiently manage labeling tasks.
+- Coordinate data, labels, and team members to efficiently manage labeling tasks.
 - Tracks progress and maintains the queue of incomplete labeling tasks.
 - Start and stop the project and control the labeling progress.
 - Review the labeled data and export labeled as an Azure Machine Learning dataset.
@@ -47,22 +42,28 @@ Data formats available for text data:
 
 [!INCLUDE [start](../../includes/machine-learning-data-labeling-start.md)]
 
-1. To create a project, select **Add project**. Give the project an appropriate name. The project name cannot be reused, even if the project is deleted in future.
+1. To create a project, select **Add project**. Give the project an appropriate name. The project name can’t be reused, even if the project is deleted in future.
 
 1. Select **Text** to create a text labeling project.
 
-:::image type="content" source="media/how-to-create-labeling-projects/text-labeling-creation-wizard.png" alt-text="Labeling project creation for text labeling":::
+:::image type="content" source="media/how-to-create-text-labeling-projects/text-labeling-creation-wizard.png" alt-text="Labeling project creation for text labeling":::
+
+* Choose **Text Classification Multi-class** for projects when you want to apply only a *single label* from a set of labels to each piece of text.
+* Choose **Text Classification Multi-label** for projects when you want to apply *one or more* labels from a set of labels to each piece of text.
+* Choose **Text Named Entity Recognition (Preview)** for projects when you want to apply labels to individual or multiple words of text in each entry.
 
-* Choose **Text Classification Multi-class (Preview)** for projects when you want to apply only a *single label* from a set of labels to each piece of text.
-* Choose **Text Classification Multi-label (Preview)** for projects when you want to apply *one or more* labels from a set of labels to each piece of text.
+> [!IMPORTANT]
+> Text Named Entity Recognition is currently in public preview.
+> The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
+> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
 
 1. Select **Next** when you're ready to continue.
 
 ## Add workforce (optional)
 
 [!INCLUDE [outsource](../../includes/machine-learning-data-labeling-outsource.md)]
 
-## Specify the data to label
+## Select or create a dataset
 
 If you already created a dataset that contains your data, select it from the **Select an existing dataset** drop-down list. Or, select **Create a dataset** to use an existing Azure datastore or to upload local files.
 
@@ -78,7 +79,7 @@ To create a dataset from data that you've already stored in Azure Blob storage:
 1. Select **Create a dataset** > **From datastore**.
 1. Assign a **Name** to your dataset.
 1. Choose the **Dataset type**:
-* Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response.
+* Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response. Tabular isn't available for Text Named Entity Recognition projects.
 * Select **File** if you're using separate .txt files for each response.
 1. (Optional) Provide a description for your dataset.
 1. Select **Next**.
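
Note on the hunk above: because Tabular datasets aren't available for Text Named Entity Recognition projects, responses stored as rows in a .csv file would need to be split into separate .txt files before they can be used as a File dataset. The following is a minimal sketch of that conversion using only the Python standard library. The column name `text`, the `response_NNNNN.txt` naming scheme, and the helper name `csv_rows_to_txt_files` are illustrative assumptions, not anything the article or the labeling service prescribes.

```python
import csv
import tempfile
from pathlib import Path


def csv_rows_to_txt_files(csv_path, text_column, out_dir):
    """Write each non-empty value of `text_column` to its own UTF-8 .txt file.

    Hypothetical helper: the file naming scheme is an assumption.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            text = (row.get(text_column) or "").strip()
            if not text:
                continue  # skip rows that have no response text
            target = out_dir / f"response_{index:05d}.txt"
            target.write_text(text, encoding="utf-8")
            written.append(target)
    return written


# Small self-contained demo with made-up data (row 2 has an empty response).
demo_dir = Path(tempfile.mkdtemp())
demo_csv = demo_dir / "responses.csv"
demo_csv.write_text(
    "id,text\n1,Contoso opened an office in Seattle.\n2,\n3,Fabrikam hired Ana.\n",
    encoding="utf-8",
)
files = csv_rows_to_txt_files(demo_csv, "text", demo_dir / "txt")
```

The resulting folder of .txt files could then be uploaded through **Create a dataset** > **From local files** with the **File** dataset type.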
@@ -96,7 +97,7 @@ To directly upload your data:
 1. Select **Create a dataset** > **From local files**.
 1. Assign a **Name** to your dataset.
 1. Choose the **Dataset type**.
-* Select **Tabular** if you're using a .csv or .tsv file, where each row is a response.
+* Select **Tabular** if you're using a .csv or .tsv file, where each row is a response. Tabular isn't available for Text Named Entity Recognition projects.
 * Select **File** if you're using separate .txt files for each response.
 1. (Optional) Provide a description of your dataset.
 1. Select **Next**
@@ -131,7 +132,6 @@ To directly upload your data:
 ## Use ML-assisted data labeling
 
 The **ML-assisted labeling** page lets you trigger automatic machine learning models to accelerate labeling tasks. ML-assisted labeling is available for both file (.txt) and tabular (.csv) text data inputs.
-
 To use **ML-assisted labeling**:
 
 * Select **Enable ML assisted labeling**.
@@ -144,7 +144,7 @@ At the beginning of your labeling project, the items are shuffled into a random
 
 For training the text DNN model used by ML-assist, the input text per training example will be limited to approximately the first 128 words in the document. For tabular input, all text columns are first concatenated before applying this limit. This is a practical limit imposed to allow for the model training to complete in a timely manner. The actual text in a document (for file input) or set of text columns (for tabular input) can exceed 128 words. The limit only pertains to what is internally leveraged by the model during the training process.
 
-The exact number of labeled items necessary to start assisted labeling is not a fixed number. This can vary significantly from one labeling project to another, depending on many factors, including the number of labels classes and label distribution.
+The exact number of labeled items necessary to start assisted labeling isn't a fixed number. This can vary significantly from one labeling project to another, depending on many factors, including the number of labels classes and label distribution.
 
 Since the final labels still rely on input from the labeler, this technology is sometimes called *human in the loop* labeling.
 
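
Note on the hunk above: the context paragraph says that ML-assist trains on approximately the first 128 words of each item, and that tabular text columns are concatenated before the limit is applied. The sketch below only illustrates that described order of operations (concatenate, then truncate). Splitting on whitespace and joining columns with a single space are assumptions; the article doesn't say how the service actually tokenizes or joins text, and `training_text` is a hypothetical name.

```python
WORD_LIMIT = 128  # approximate limit stated in the article


def training_text(text_columns, limit=WORD_LIMIT):
    """Concatenate text columns, then keep roughly the first `limit` words.

    Whitespace tokenization is an assumption made for illustration only.
    """
    combined = " ".join(part for part in text_columns if part)
    words = combined.split()
    return " ".join(words[:limit])


# Two made-up columns of 100 words each: only the first 128 words survive,
# so the second column contributes just its first 28 words.
example = training_text(["title " * 100, "body " * 100])
```

A practical consequence under these assumptions: in a tabular dataset, text in later columns may never reach the model if the earlier columns are long.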
@@ -167,13 +167,12 @@ Once a machine learning model has been trained on your manually labeled data, th
 
 ### Dashboard
 
-
 The **Dashboard** tab shows the progress of the labeling task.
 
-:::image type="content" source="./media/how-to-create-labeling-projects/text-labeling-dashboard.png" alt-text="Text data labeling dashboard":::
+:::image type="content" source="./media/how-to-create-text-labeling-projects/text-labeling-dashboard.png" alt-text="Text data labeling dashboard":::
 
 
-The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of item in each section.
+The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
 
 The middle section shows the queue of tasks yet to be assigned. If ML-assisted labeling is on, you'll also see the number of pre-labeled items.
 
@@ -189,7 +188,7 @@ On the **Data** tab, you can see your dataset and review labeled data. Scroll th
 View and change details of your project. In this tab you can:
 
 * View project details and input datasets
-* Enable or disable **incremental refresh at regular intervals** or request an immediate refresh
+* Enable or disable **incremental refresh at regular intervals**, or request an immediate refresh.
 * View details of the storage container used to store labeled outputs in your project
 * Add labels to your project
 * Edit instructions you give to your labels
@@ -206,11 +205,17 @@ View and change details of your project. In this tab you can:
 
 Use the **Export** button on the **Project details** page of your labeling project. You can export the label data for Machine Learning experimentation at any time.
 
-You can export:
-
+For all project types other than **Text Named Entity Recognition**, you can export:
 * A CSV file. The CSV file is created in the default blob store of the Azure Machine Learning workspace in a folder within *Labeling/export/csv*.
 * An [Azure Machine Learning dataset with labels](how-to-use-labeled-dataset.md).
 
+
+For **Text Named Entity Recognition** projects, you can export:
+* An [Azure Machine Learning dataset with labels](how-to-use-labeled-dataset.md).
+* A CoNLL file. For this export, you'll also have to assign a compute resource. The export process runs offline and generates the file as part of an experiment run. When the file is ready to download, you'll see a notification on the top right. Select this to open the notification, which includes the link to the file.
+
+:::image type="content" source="media/how-to-create-text-labeling-projects/notification-bar.png" alt-text="Notification for file download.":::
+
 Access exported Azure Machine Learning datasets in the **Datasets** section of Machine Learning. The dataset details page also provides sample code to access your labels from Python.
 
 ![Exported dataset](./media/how-to-create-labeling-projects/exported-dataset.png)
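
Note on the hunk above: the added text says Text Named Entity Recognition projects can export a CoNLL file but doesn't describe its layout. The sketch below reads the common CoNLL convention of one token per line with whitespace-separated columns and a blank line between sentences. Treating the first column as the token and the last column as the tag, and the BIO-style tags in the sample, are assumptions about the export that should be checked against a real downloaded file; `parse_conll` is a hypothetical helper.

```python
def parse_conll(text):
    """Parse CoNLL-style text into a list of sentences of (token, tag) pairs.

    Assumes whitespace-separated columns with the token first and the tag last.
    """
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        # A blank line (or a -DOCSTART- marker) ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


# Made-up sample in the assumed layout, not real export output.
sample = "Contoso B-ORG\nopened O\nin O\nSeattle B-LOC\n\nAna B-PER\njoined O\n"
parsed = parse_conll(sample)
```

Each inner list corresponds to one sentence, which makes it straightforward to feed the pairs into a tagging model or to count labels per entity type.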
@@ -219,7 +224,6 @@ Access exported Azure Machine Learning datasets in the **Datasets** section of M
 
 [!INCLUDE [troubleshooting](../../includes/machine-learning-data-labeling-troubleshooting.md)]
 
-
 ## Next steps
 
 * [How to tag text](how-to-label-data.md#label-text)
