Commit b81b814

edits per Sheri
1 parent 8f90f00 commit b81b814

3 files changed

+41
-41
lines changed

articles/machine-learning/how-to-create-image-labeling-projects.md

Lines changed: 21 additions & 21 deletions
@@ -1,7 +1,7 @@
11
---
22
title: Set up an image labeling project
33
titleSuffix: Azure Machine Learning
4-
description: Learn how to create a project to label images by using the data labeling tool. Enable machine learning-assisted labeling or human-in-the-loop labeling to help with the task.
4+
description: Learn how to create a project and use the data labeling tool to label images in the project. Enable machine learning-assisted labeling or human-in-the-loop labeling to help with the task.
55
author: kvijaykannan
66
ms.author: vkann
77
ms.reviewer: sgilley
@@ -14,11 +14,11 @@ ms.custom: data4ml, ignite-fall-2021, ignite-2022
1414

1515
# Set up an image labeling project and export labels
1616

17-
Learn how to create and run data labeling projects to label images in Azure Machine Learning. Use machine learning (ML)-assisted data labeling or human-in-the-loop labeling to help with the task.
17+
Learn how to create and run data labeling projects to label images in Azure Machine Learning. Use machine learning-assisted data labeling or human-in-the-loop labeling to help with the task.
1818

1919
Set up labels for classification, object detection (bounding box), or instance segmentation (polygon).
2020

21-
You can also use the data labeling tool in Machine Learning to [create a text labeling project](how-to-create-text-labeling-projects.md).
21+
You can also use the data labeling tool in Azure Machine Learning to [create a text labeling project](how-to-create-text-labeling-projects.md).
2222

2323
## Image labeling capabilities
2424

@@ -30,7 +30,7 @@ Azure Machine Learning data labeling is a tool you can use to create, manage, an
3030
- Review and export the labeled data as an Azure Machine Learning dataset.
3131

3232
> [!IMPORTANT]
33-
> The data images you work with in the Machine Learning data labeling tool must be available in an Azure Blob Storage datastore. If you don't have an existing datastore, you can upload your data files to a new datastore when you create a project.
33+
> The data images you work with in the Azure Machine Learning data labeling tool must be available in an Azure Blob Storage datastore. If you don't have an existing datastore, you can upload your data files to a new datastore when you create a project.
3434
3535
Image data can be any file that has one of these file extensions:
3636

@@ -49,7 +49,7 @@ Each file is an item to be labeled.
4949

5050
## Prerequisites
5151

52-
You use these items to set up image labeling in Machine Learning:
52+
You use these items to set up image labeling in Azure Machine Learning:
5353

5454
[!INCLUDE [prerequisites](../../includes/machine-learning-data-labeling-prerequisites.md)]
5555

@@ -157,41 +157,41 @@ For bounding boxes, important questions include:
157157
> [!NOTE]
158158
> **Instance Segmentation** projects can't use consensus labeling.
159159
160-
## Use ML-assisted data labeling
160+
## Use machine learning-assisted data labeling
161161

162-
To accelerate labeling tasks, on the **ML assisted labeling** page, you can trigger automatic ML models. Medical images (files that have a *.dcm* extension) aren't included in assisted labeling.
162+
To accelerate labeling tasks, on the **ML assisted labeling** page, you can trigger automatic machine learning models. Medical images (files that have a *.dcm* extension) aren't included in assisted labeling.
163163

164164
At the start of your labeling project, the items are shuffled into a random order to reduce potential bias. However, the trained model reflects any biases that are present in the dataset. For example, if 80 percent of your items are of a single class, then approximately 80 percent of the data used to train the model lands in that class.
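The point about shuffling and bias can be illustrated with a small sketch. The class names and counts below are hypothetical, not taken from the article: shuffling changes the order in which items are presented, but the class proportions that a model would train on stay the same.

```python
import random
from collections import Counter

# Hypothetical dataset: 80 percent of the items belong to a single class.
labels = ["cat"] * 80 + ["dog"] * 15 + ["bird"] * 5

# Shuffle a copy with a fixed seed so the example is repeatable.
rng = random.Random(0)
shuffled = labels[:]
rng.shuffle(shuffled)


def class_shares(items):
    """Return each class's share of the items as a fraction of the total."""
    counts = Counter(items)
    return {name: count / len(items) for name, count in counts.items()}


# The order differs after shuffling, but the class mix is unchanged.
shares_before = class_shares(labels)
shares_after = class_shares(shuffled)
```

In this sketch, `shares_before` and `shares_after` are identical, which is why random ordering alone doesn't remove class imbalance from the training data.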
165165

166166
To enable assisted labeling, select **Enable ML assisted labeling** and specify a GPU. If you don't have a GPU in your workspace, a GPU cluster is created for you and added to your workspace. The cluster is created with a minimum of zero nodes, which means it costs nothing when not in use.
167167

168-
ML-assisted labeling consists of two phases:
168+
Machine learning-assisted labeling consists of two phases:
169169

170170
* Clustering
171171
* Pre-labeling
172172

173-
The labeled data item count that's required to start assisted labeling isn't a fixed number. This number can vary significantly from one labeling project to another. For some projects, it's sometimes possible to see pre-label or cluster tasks after 300 items have been manually labeled. ML-assisted labeling uses a technique called *transfer learning*. Transfer learning uses a pre-trained model to jump-start the training process. If the classes of your dataset resemble the classes in the pre-trained model, pre-labels might become available after only a few hundred manually labeled items. If your dataset significantly differs from the data that's used to pre-train the model, the process might take more time.
173+
The labeled data item count that's required to start assisted labeling isn't a fixed number. This number can vary significantly from one labeling project to another. For some projects, it's sometimes possible to see pre-label or cluster tasks after 300 items have been manually labeled. Machine learning-assisted labeling uses a technique called *transfer learning*. Transfer learning uses a pre-trained model to jump-start the training process. If the classes of your dataset resemble the classes in the pre-trained model, pre-labels might become available after only a few hundred manually labeled items. If your dataset significantly differs from the data that's used to pre-train the model, the process might take more time.
174174

175175
When you use consensus labeling, the consensus label is used for training.
176176

177177
Because the final labels still rely on input from the labeler, this technology is sometimes called *human-in-the-loop* labeling.
178178

179179
> [!NOTE]
180-
> ML-assisted data labeling doesn't support default storage accounts that are secured behind a [virtual network](how-to-network-security-overview.md). You must use a non-default storage account for ML-assisted data labeling. The non-default storage account can be secured behind the virtual network.
180+
> Machine learning-assisted data labeling doesn't support default storage accounts that are secured behind a [virtual network](how-to-network-security-overview.md). You must use a non-default storage account for machine learning-assisted data labeling. The non-default storage account can be secured behind the virtual network.
181181
182182
### Clustering
183183

184-
After you submit some labels, the classification ML model starts to group together similar items. These similar images are presented to labelers on the same page to help make manual tagging more efficient. Clustering is especially useful when a labeler views a grid of four, six, or nine images.
184+
After you submit some labels, the classification model starts to group together similar items. These similar images are presented to labelers on the same page to help make manual tagging more efficient. Clustering is especially useful when a labeler views a grid of four, six, or nine images.
185185

186-
After an ML model is trained on your manually labeled data, the model is truncated to its last fully connected layer. Unlabeled images are then passed through the truncated model in a process called *embedding* or *featurization*. This process embeds each image in a high-dimensional space that the model layer defines. Other images in the space that are nearest the image are used for clustering tasks.
186+
After a machine learning model is trained on your manually labeled data, the model is truncated to its last fully connected layer. Unlabeled images are then passed through the truncated model in a process called *embedding* or *featurization*. This process embeds each image in a high-dimensional space that the model layer defines. Other images in the space that are nearest the image are used for clustering tasks.
187187

188188
The clustering phase doesn't appear for object detection models or text classification.
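As a rough illustration of the nearest-neighbor idea described above, the following sketch groups images by distance in an embedding space. It isn't the service's implementation: the embeddings, image names, and the `nearest_neighbors` helper are hypothetical, and Euclidean distance is an assumed choice of metric.

```python
import math


def nearest_neighbors(embeddings, query_id, k):
    """Return the ids of the k items whose embeddings are closest to query_id."""
    query = embeddings[query_id]
    distances = [
        (math.dist(query, vector), item_id)
        for item_id, vector in embeddings.items()
        if item_id != query_id
    ]
    distances.sort()
    return [item_id for _, item_id in distances[:k]]


# Hypothetical 3-dimensional embeddings. Real embeddings come from a
# truncated model and have many more dimensions.
embeddings = {
    "img_a": (0.9, 0.1, 0.0),
    "img_b": (0.8, 0.2, 0.1),
    "img_c": (0.0, 0.9, 0.8),
    "img_d": (0.1, 1.0, 0.7),
    "img_e": (0.85, 0.15, 0.05),
}

# Build one page for a grid of four images: the query plus its 3 nearest items.
page = ["img_a"] + nearest_neighbors(embeddings, "img_a", k=3)
```

Here `page` starts with `img_a` followed by its closest neighbors, so visually similar images would appear together on the same labeling page.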
189189

190190
### Pre-labeling
191191

192192
After you submit enough labels for training, either a classification model predicts tags or an object detection model predicts bounding boxes. The labeler now sees pages that contain predicted labels already present on each item. For object detection, predicted boxes are also shown. The task involves reviewing these predictions and correcting any incorrectly labeled images before page submission.
193193

194-
After an ML model is trained on your manually labeled data, the model is evaluated on a test set of manually labeled items. The evaluation helps determine the model's accuracy at different confidence thresholds. The evaluation process sets a confidence threshold beyond which the model is accurate enough to show pre-labels. The model is then evaluated against unlabeled data. Items with predictions that are more confident than the threshold are used for pre-labeling.
194+
After a machine learning model is trained on your manually labeled data, the model is evaluated on a test set of manually labeled items. The evaluation helps determine the model's accuracy at different confidence thresholds. The evaluation process sets a confidence threshold beyond which the model is accurate enough to show pre-labels. The model is then evaluated against unlabeled data. Items with predictions that are more confident than the threshold are used for pre-labeling.
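The threshold step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's actual logic: the `pick_threshold` and `select_prelabels` helpers, the target accuracy of 0.9, and the sample data are all hypothetical, and the sketch treats a prediction at exactly the cutoff as confident enough.

```python
def pick_threshold(validation, target_accuracy):
    """Return the lowest confidence cutoff whose retained predictions meet the target.

    validation is a list of (confidence, is_correct) pairs from a labeled test set.
    Returns None when no cutoff reaches the target accuracy.
    """
    for cutoff in sorted({confidence for confidence, _ in validation}):
        kept = [ok for confidence, ok in validation if confidence >= cutoff]
        if sum(kept) / len(kept) >= target_accuracy:
            return cutoff
    return None


def select_prelabels(predictions, cutoff):
    """Keep only the unlabeled items predicted at or above the cutoff confidence."""
    return {
        item: label
        for item, (label, confidence) in predictions.items()
        if confidence >= cutoff
    }


# Hypothetical evaluation results on manually labeled test items.
validation = [(0.55, False), (0.60, True), (0.70, False),
              (0.80, True), (0.90, True), (0.95, True)]
cutoff = pick_threshold(validation, target_accuracy=0.9)

# Hypothetical model predictions on unlabeled items: item -> (label, confidence).
unlabeled = {"img_1": ("cat", 0.97), "img_2": ("dog", 0.65), "img_3": ("cat", 0.80)}
prelabels = select_prelabels(unlabeled, cutoff)
```

With this sample data, only the items predicted at or above the chosen cutoff are offered as pre-labels. The remaining items would stay in the manual labeling queue.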
195195

196196
## Initialize the image labeling project
197197

@@ -213,14 +213,14 @@ A distribution of the labels for completed tasks is shown below the chart. In so
213213

214214
A distribution of labelers and how many items they've labeled also are shown.
215215

216-
The middle section shows a table that has a queue of unassigned tasks. When ML-assisted labeling is off, this section shows the number of manual tasks that are awaiting assignment.
216+
The middle section shows a table that has a queue of unassigned tasks. When machine learning-assisted labeling is off, this section shows the number of manual tasks that are awaiting assignment.
217217

218-
When ML-assisted labeling is on, this section also shows:
218+
When machine learning-assisted labeling is on, this section also shows:
219219

220220
* Tasks that contain clustered items in the queue.
221221
* Tasks that contain pre-labeled items in the queue.
222222

223-
Additionally, when ML-assisted labeling is enabled, you can scroll down to see the ML-assisted labeling status. The **Jobs** sections give links for each of the ML runs.
223+
Additionally, when machine learning-assisted labeling is enabled, you can scroll down to see the machine learning-assisted labeling status. The **Jobs** sections give links for each of the machine learning runs.
224224

225225
* **Training**: Trains a model to predict the labels.
226226
* **Validation**: Determines whether item pre-labeling uses the prediction of this model.
@@ -235,7 +235,7 @@ If your project uses consensus labeling, review images that have no consensus:
235235

236236
1. Select the **Data** tab.
237237
1. On the left menu, select **Review labels**.
238-
1. On the project command bar, select **All filters**.
238+
1. On the command bar above **Review labels**, select **All filters**.
239239

240240
:::image type="content" source="media/how-to-create-labeling-projects/select-filters.png" alt-text="Screenshot that shows how to select filters to review consensus label problems." lightbox="media/how-to-create-labeling-projects/select-filters.png":::
241241

@@ -247,7 +247,7 @@ If your project uses consensus labeling, review images that have no consensus:
247247

248248
:::image type="content" source="media/how-to-create-labeling-projects/consensus-dropdown.png" alt-text="Screenshot that shows the Select Consensus label dropdown to review conflicting labels." lightbox="media/how-to-create-labeling-projects/consensus-dropdown.png":::
249249

250-
1. Although you can see an image's labels when you select the image, to update or reject the labels, you must use the **Consensus label (preview)** option.
250+
1. Although you can select an individual labeler to see their labels, to update or reject the labels, you must use the **Consensus label (preview)** option.
251251

252252
### Details tab
253253

@@ -258,7 +258,7 @@ View and change details of your project. On this tab, you can:
258258
* View details of the storage container that's used to store labeled outputs in your project.
259259
* Add labels to your project.
260260
* Edit instructions you give to your labelers.
261-
* Change settings for ML-assisted labeling and kick off a labeling task.
261+
* Change settings for machine learning-assisted labeling and kick off a labeling task.
262262

263263
### Access for labelers
264264

@@ -268,7 +268,7 @@ View and change details of your project. On this tab, you can:
268268

269269
[!INCLUDE [add-label](../../includes/machine-learning-data-labeling-add-label.md)]
270270

271-
## Start an ML-assisted labeling task
271+
## Start a machine learning-assisted labeling task
272272

273273
[!INCLUDE [start-ml-assist](../../includes/machine-learning-data-labeling-start-ml-assist.md)]
274274

@@ -295,7 +295,7 @@ After you export your labeled data to an Azure Machine Learning dataset, you can
295295

296296
|Issue |Resolution |
297297
|---------|---------|
298-
|To create a zero-size label, select the Esc key when you label for object detection. Label submission in this state fails.|To delete the label, select the **X** delete icon next to the label.|
298+
|If you select the Esc key when you label for object detection, a zero-size label is created and label submission fails.|To delete the label, select the **X** delete icon next to the label.|
299299

300300
## Next steps
301301
