articles/machine-learning/how-to-create-text-labeling-projects.md
+28 -24 (28 additions & 24 deletions)
Original file line number
Diff line number
Diff line change
@@ -7,26 +7,21 @@ ms.author: sgilley
7
7
ms.service: machine-learning
8
8
ms.subservice: mldata
9
9
ms.topic: how-to
10
-
ms.date: 10/21/2021
10
+
ms.date: 03/18/2022
11
11
ms.custom: data4ml, ignite-fall-2021
12
12
---
13
13
14
-
# Create a text labeling project and export labels (preview)
14
+
# Create a text labeling project and export labels
15
15
16
16
Learn how to create and run data labeling projects to label text data in Azure Machine Learning. Specify either a single label or multiple labels to be applied to each text item.
17
17
18
18
You can also use the data labeling tool to [create an image labeling project](how-to-create-image-labeling-projects.md).
19
19
20
-
> [!IMPORTANT]
21
-
> Text labeling is currently in public preview.
22
-
> The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
23
-
> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
24
-
25
20
## Text labeling capabilities
26
21
27
22
Azure Machine Learning data labeling is a central place to create, manage, and monitor data labeling projects:
28
23
29
-
- Coordinate data, labels, and team members to efficiently manage labeling tasks.
24
+
- Coordinate data, labels, and team members to efficiently manage labeling tasks.
30
25
- Tracks progress and maintains the queue of incomplete labeling tasks.
31
26
- Start and stop the project and control the labeling progress.
32
27
- Review the labeled data and export labeled as an Azure Machine Learning dataset.
@@ -47,22 +42,28 @@ Data formats available for text data:
1. To create a project, select **Add project**. Give the project an appropriate name. The project name cannot be reused, even if the project is deleted in future.
45
+
1. To create a project, select **Add project**. Give the project an appropriate name. The project name can’t be reused, even if the project is deleted in future.
51
46
52
47
1. Select **Text** to create a text labeling project.
53
48
54
-
:::image type="content" source="media/how-to-create-labeling-projects/text-labeling-creation-wizard.png" alt-text="Labeling project creation for text labeling":::
49
+
:::image type="content" source="media/how-to-create-text-labeling-projects/text-labeling-creation-wizard.png" alt-text="Labeling project creation for text labeling":::
50
+
51
+
* Choose **Text Classification Multi-class** for projects when you want to apply only a *single label* from a set of labels to each piece of text.
52
+
* Choose **Text Classification Multi-label** for projects when you want to apply *one or more* labels from a set of labels to each piece of text.
53
+
* Choose **Text Named Entity Recognition (Preview)** for projects when you want to apply labels to individual or multiple words of text in each entry.
55
54
56
-
* Choose **Text Classification Multi-class (Preview)** for projects when you want to apply only a *single label* from a set of labels to each piece of text.
57
-
* Choose **Text Classification Multi-label (Preview)** for projects when you want to apply *one or more* labels from a set of labels to each piece of text.
55
+
> [!IMPORTANT]
56
+
> Text Named Entity Recognition is currently in public preview.
57
+
> The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
58
+
> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
If you already created a dataset that contains your data, select it from the **Select an existing dataset** drop-down list. Or, select **Create a dataset** to use an existing Azure datastore or to upload local files.
68
69
@@ -78,7 +79,7 @@ To create a dataset from data that you've already stored in Azure Blob storage:
78
79
1. Select **Create a dataset** > **From datastore**.
79
80
1. Assign a **Name** to your dataset.
80
81
1. Choose the **Dataset type**:
81
-
* Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response.
82
+
* Select **Tabular** if you're using a .csv or .tsv file, where each row contains a response. Tabular isn't available for Text Named Entity Recognition projects.
82
83
* Select **File** if you're using separate .txt files for each response.
83
84
1. (Optional) Provide a description for your dataset.
84
85
1. Select **Next**.
@@ -96,7 +97,7 @@ To directly upload your data:
96
97
1. Select **Create a dataset** > **From local files**.
97
98
1. Assign a **Name** to your dataset.
98
99
1. Choose the **Dataset type**.
99
-
* Select **Tabular** if you're using a .csv or .tsv file, where each row is a response.
100
+
* Select **Tabular** if you're using a .csv or .tsv file, where each row is a response. Tabular isn't available for Text Named Entity Recognition projects.
100
101
* Select **File** if you're using separate .txt files for each response.
101
102
1. (Optional) Provide a description of your dataset.
102
103
1. Select **Next**
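The two dataset types above map to two on-disk layouts: one row per response in a .csv file (Tabular), or one .txt file per response (File). The following is a minimal sketch of preparing either layout locally before upload. The column name `text`, the file naming scheme, and both function names are hypothetical choices for illustration, not requirements stated in this article.

```python
import csv
from pathlib import Path


def write_tabular(responses, csv_path):
    """Write one response per row, suitable for a Tabular dataset."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text"])  # hypothetical column name
        for response in responses:
            writer.writerow([response])


def write_files(responses, folder):
    """Write one .txt file per response, suitable for a File dataset."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, response in enumerate(responses):
        # Hypothetical naming scheme: response_0000.txt, response_0001.txt, ...
        path = folder / f"response_{index:04d}.txt"
        path.write_text(response, encoding="utf-8")
        paths.append(path)
    return paths
```

Because Tabular isn't available for Text Named Entity Recognition projects, the `write_files` layout is the one to use for that project type.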
@@ -131,7 +132,6 @@ To directly upload your data:
131
132
## Use ML-assisted data labeling
132
133
133
134
The **ML-assisted labeling** page lets you trigger automatic machine learning models to accelerate labeling tasks. ML-assisted labeling is available for both file (.txt) and tabular (.csv) text data inputs.
134
-
135
135
To use **ML-assisted labeling**:
136
136
137
137
* Select **Enable ML assisted labeling**.
@@ -144,7 +144,7 @@ At the beginning of your labeling project, the items are shuffled into a random
144
144
145
145
For training the text DNN model used by ML-assist, the input text per training example will be limited to approximately the first 128 words in the document. For tabular input, all text columns are first concatenated before applying this limit. This is a practical limit imposed to allow for the model training to complete in a timely manner. The actual text in a document (for file input) or set of text columns (for tabular input) can exceed 128 words. The limit only pertains to what is internally leveraged by the model during the training process.
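The limit described above can be approximated locally to preview what the model is likely to see for a given item. This is only a sketch: it assumes words are separated by whitespace and that columns are joined with a single space, neither of which is confirmed by this article, and `training_text` is a hypothetical helper name.

```python
WORD_LIMIT = 128  # approximate limit described in the text


def training_text(columns, limit=WORD_LIMIT):
    """Concatenate text columns, then keep roughly the first `limit` words.

    For file input, pass a single-element list containing the document text.
    Whitespace tokenization is an assumption made for illustration.
    """
    joined = " ".join(column for column in columns if column)
    words = joined.split()
    return " ".join(words[:limit])
```

For example, `training_text([title, body])` returns the concatenated columns cut to the first 128 whitespace-separated words, while the stored item itself is left unchanged.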
146
146
147
-
The exact number of labeled items necessary to start assisted labeling is not a fixed number. This can vary significantly from one labeling project to another, depending on many factors, including the number of labels classes and label distribution.
147
+
The exact number of labeled items necessary to start assisted labeling isn't a fixed number. This can vary significantly from one labeling project to another, depending on many factors, including the number of labels classes and label distribution.
148
148
149
149
Since the final labels still rely on input from the labeler, this technology is sometimes called *human in the loop* labeling.
150
150
@@ -167,13 +167,12 @@ Once a machine learning model has been trained on your manually labeled data, th
167
167
168
168
### Dashboard
169
169
170
-
171
170
The **Dashboard** tab shows the progress of the labeling task.
172
171
173
-
:::image type="content" source="./media/how-to-create-labeling-projects/text-labeling-dashboard.png" alt-text="Text data labeling dashboard":::
172
+
:::image type="content" source="./media/how-to-create-text-labeling-projects/text-labeling-dashboard.png" alt-text="Text data labeling dashboard":::
174
173
175
174
176
-
The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of item in each section.
175
+
The progress chart shows how many items have been labeled, skipped, in need of review, or not yet done. Hover over the chart to see the number of items in each section.
177
176
178
177
The middle section shows the queue of tasks yet to be assigned. If ML-assisted labeling is on, you'll also see the number of pre-labeled items.
179
178
@@ -189,7 +188,7 @@ On the **Data** tab, you can see your dataset and review labeled data. Scroll th
189
188
View and change details of your project. In this tab you can:
190
189
191
190
* View project details and input datasets
192
-
* Enable or disable **incremental refresh at regular intervals** or request an immediate refresh
191
+
* Enable or disable **incremental refresh at regular intervals**, or request an immediate refresh.
193
192
* View details of the storage container used to store labeled outputs in your project
194
193
* Add labels to your project
195
194
* Edit instructions you give to your labels
@@ -206,11 +205,17 @@ View and change details of your project. In this tab you can:
206
205
207
206
Use the **Export** button on the **Project details** page of your labeling project. You can export the label data for Machine Learning experimentation at any time.
208
207
209
-
You can export:
210
-
208
+
For all project types other than **Text Named Entity Recognition**, you can export:
211
209
* A CSV file. The CSV file is created in the default blob store of the Azure Machine Learning workspace in a folder within *Labeling/export/csv*.
212
210
* An [Azure Machine Learning dataset with labels](how-to-use-labeled-dataset.md).
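Once an exported CSV file has been downloaded from the blob store, a quick check of the label distribution can be done with the standard library. This sketch assumes the file has a header row; the column name passed as `label_column` is hypothetical and should be replaced with whatever header the exported file actually contains.

```python
import csv
from collections import Counter


def label_counts(csv_path, label_column):
    """Count how often each value appears in `label_column`."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return Counter(row[label_column] for row in reader)
```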
213
211
212
+
213
+
For **Text Named Entity Recognition** projects, you can export:
214
+
* An [Azure Machine Learning dataset with labels](how-to-use-labeled-dataset.md).
215
+
* A CoNLL file. For this export, you'll also have to assign a compute resource. The export process runs offline and generates the file as part of an experiment run. When the file is ready to download, you'll see a notification on the top right. Select this to open the notification, which includes the link to the file.
216
+
217
+
:::image type="content" source="media/how-to-create-text-labeling-projects/notification-bar.png" alt-text="Notification for file download.":::
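After downloading the CoNLL file, it can be read with a few lines of Python. The sketch below assumes the common CoNLL layout: one token per line with the tag in the last whitespace-separated column, blank lines between sentences, and `B-`/`I-`/`O` tag prefixes. The exact layout of the exported file isn't described in this article, so check a sample before relying on this; both function names are hypothetical.

```python
def parse_conll(text):
    """Return a list of sentences, each a list of (token, tag) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and -DOCSTART- markers end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        current.append((parts[0], parts[-1]))  # first column: token, last: tag
    if current:
        sentences.append(current)
    return sentences


def entity_spans(sentence):
    """Group B-/I- tagged tokens into (label, text) entities."""
    spans, label, tokens = [], None, []
    for token, tag in sentence:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            if tokens:
                spans.append((label, " ".join(tokens)))
            label, tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            tokens.append(token)
        else:  # an O tag (or anything else) closes any open entity
            if tokens:
                spans.append((label, " ".join(tokens)))
            label, tokens = None, []
    if tokens:
        spans.append((label, " ".join(tokens)))
    return spans
```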
218
+
214
219
Access exported Azure Machine Learning datasets in the **Datasets** section of Machine Learning. The dataset details page also provides sample code to access your labels from Python.