Commit c4173e8

Freshness.
1 parent 463f39e commit c4173e8

1 file changed: +17 -19 lines changed

articles/machine-learning/how-to-prepare-datasets-for-automl-images.md

Lines changed: 17 additions & 19 deletions
@@ -9,7 +9,7 @@ ms.subservice: automl
 ms.topic: how-to
 ms.custom: template-how-to, update-code, sdkv2
 ms.reviewer: rvadthyavath
-ms.date: 09/03/2024
+ms.date: 09/06/2024
 #customer intent: As a data scientist, I want to prepare image data for training computer vision models.
 ---

@@ -20,31 +20,29 @@ ms.date: 09/03/2024
 > [!IMPORTANT]
 > Support for training computer vision models with automated ML in Azure Machine Learning is an experimental public preview feature. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).

-In this article, you learn how to prepare image data for training computer vision models with [automated machine learning in Azure Machine Learning](concept-automated-ml.md).
+In this article, you learn how to prepare image data for training computer vision models with [automated machine learning in Azure Machine Learning](concept-automated-ml.md). To generate models for computer vision tasks with automated machine learning, you need to bring labeled image data as input for model training in the form of an `MLTable`.

-To generate models for computer vision tasks with automated machine learning, you need to bring labeled image data as input for model training in the form of an `MLTable`.
-
-You can create an `MLTable` from labeled training data in JSONL format. If your labeled training data is in a different format (like, Pascal Visual Object Classes (VOC) or COCO), you can use a [conversion script](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to first convert it to JSONL, and then create an `MLTable`. Alternatively, you can use Azure Machine Learning's data labeling tool to manually label images, and export the labeled data to use for training your AutoML model.
+You can create an `MLTable` from labeled training data in JSONL format. If your labeled training data is in a different format, like Pascal Visual Object Classes (VOC) or COCO, you can use a [conversion script](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert it to JSONL, and then create an `MLTable`. Alternatively, you can use Azure Machine Learning's data labeling tool to manually label images. Then export the labeled data to use for training your AutoML model.

 ## Prerequisites

 - Familiarize yourself with the accepted [schemas for JSONL files for AutoML computer vision experiments](reference-automl-images-schema.md).

 ## Get labeled data

-In order to train computer vision models using AutoML, you need to first get labeled training data. The images need to be uploaded to the cloud and label annotations need to be in JSONL format. You can either use the Azure Machine Learning Data Labeling tool to label your data or you could start with prelabeled image data.
+In order to train computer vision models using AutoML, you need to get labeled training data. The images need to be uploaded to the cloud. Label annotations need to be in JSONL format. You can either use the Azure Machine Learning Data Labeling tool to label your data or you could start with prelabeled image data.

 ### Use Azure Machine Learning Data Labeling tool to label your training data

 If you don't have prelabeled data, you can use Azure Machine Learning's data labeling tool to manually label images. This tool automatically generates the data required for training in the accepted format. For more information, see [Set up an image labeling project](how-to-create-image-labeling-projects.md).

-It helps to create, manage, and monitor data labeling tasks for:
+The tool helps to create, manage, and monitor data labeling tasks for:

 - Image classification (multi-class and multi-label)
 - Object detection (bounding box)
 - Instance segmentation (polygon)

-If you already labeled data you want to use, you can export your labeled data as an Azure Machine Learning Dataset and access the dataset under the **Datasets** tab in Azure Machine Learning studio. This exported dataset can then be passed as an input using `azureml:<tabulardataset_name>:<version>` format. For more information, see [Export the labels](how-to-manage-labeling-projects.md#export-the-labels).
+If you already labeled data you want to use, you can export your labeled data as an Azure Machine Learning Dataset and access the dataset under the **Datasets** tab in Azure Machine Learning studio. You can pass this exported dataset as an input using `azureml:<tabulardataset_name>:<version>` format. For more information, see [Export the labels](how-to-manage-labeling-projects.md#export-the-labels).

 Here's an example of how to pass existing dataset as input for training computer vision models.

@@ -82,11 +80,11 @@ Refer to CLI/SDK tabs for reference.

 ### Use prelabeled training data from local machine

-If you labeled data that you would like to use to train your model, you need to upload the images to Azure. You can upload your images to the default Azure Blob Storage of your Azure Machine Learning Workspace and register it as a *data asset*. For more information, see [Create and manage data assets](how-to-create-data-assets.md).
+If you labeled data that you want to use to train your model, you need to upload the images to Azure. You can upload your images to the default Azure Blob Storage of your Azure Machine Learning Workspace. Register it as a *data asset*. For more information, see [Create and manage data assets](how-to-create-data-assets.md).

 The following script uploads the image data on your local machine at path *./data/odFridgeObjects* to datastore in Azure Blob Storage. It then creates a new data asset with the name `fridge-items-images-object-detection` in your Azure Machine Learning Workspace.

-If there already exists a data asset with the name `fridge-items-images-object-detection` in your Azure Machine Learning Workspace, it updates the version number of the data asset and points it to the new location where the image data uploaded.
+If there already exists a data asset with the name `fridge-items-images-object-detection` in your Azure Machine Learning Workspace, the code updates the version number of the data asset and points it to the new location where the image data uploaded.

 # [Azure CLI](#tab/cli)
 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]
@@ -119,7 +117,7 @@ az ml data create -f [PATH_TO_YML_FILE] --workspace-name [YOUR_AZURE_WORKSPACE]

 ---

-If you already have your data present in an existing datastore and want to create a data asset out of it, you provide the path to the data in the datastore, instead of the path of your local machine. Update [the prededing code](#using-prelabeled-training-data-from-local-machine) with the following snippet.
+If you already have your data in an existing datastore, to create a data asset out of it, provide the path to the data in the datastore, instead of the path of your local machine. Update [the preceding code](#using-prelabeled-training-data-from-local-machine) with the following snippet.

 # [Azure CLI](#tab/cli)
 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]
@@ -151,17 +149,17 @@ my_data = Data(

 ---

-Next, you need to get the label annotations in JSONL format. The schema of labeled data depends on the computer vision task at hand. To learn more about the required JSONL schema for each task type, see [Data schemas to train computer vision models with automated machine learning](reference-automl-images-schema.md).
+Next, get the label annotations in JSONL format. The schema of labeled data depends on the computer vision task at hand. To learn more about the required JSONL schema for each task type, see [Data schemas to train computer vision models with automated machine learning](reference-automl-images-schema.md).

-If your training data is in a different format (like, pascal VOC or COCO), [helper scripts](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert the data to JSONL are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs).
+If your training data is in a different format, like pascal VOC or COCO, [helper scripts](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) can convert the data to JSONL. The scripts are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs).

-After you create the *.jsonl* file following the preceding steps, you can register it as a data asset using UI. Make sure you select `stream` type in schema section as shown in this animation.
+After you create the *.jsonl* file, you can register it as a data asset using UI. Make sure that you select `stream` type in schema section as shown in this animation.

 :::image type="content" source="media\how-to-prepare-datasets-for-automl-images\ui-dataset-jsnol.gif" alt-text="Animation showing how to register a data asset from the jsonl files.":::

-### Using prelabeled training data from Azure Blob storage
+### Use prelabeled training data from Azure Blob storage

-If your labeled training data is present in a container in Azure Blob storage, you can access it directly from there by [creating a datastore referring to that container](how-to-datastore.md#create-an-azure-blob-datastore).
+If your labeled training data is present in a container in Azure Blob storage, you can access it directly. Create a datastore to that container. For more information, see [Create and manage data assets](how-to-datastore.md#create-an-azure-blob-datastore).

 ## Create MLTable
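
Reviewer note: the body of the "Create MLTable" section is outside the hunks shown in this commit. As a rough illustration only, the sketch below writes an `MLTable` definition file next to a JSONL annotation file. The YAML keys (`read_json_lines`, `convert_column_types`, `stream_info`) and the `image_url` column name are assumptions based on the Azure Machine Learning documentation, not on anything in this diff, and `write_mltable` is a hypothetical helper name. Check the published article before relying on it.

```python
from pathlib import Path

# Assumed MLTable layout for JSONL image annotations (not shown in this diff).
# The image_url column is converted to a stream so images can be read lazily.
MLTABLE_TEMPLATE = """\
paths:
  - file: ./{jsonl_name}
transformations:
  - read_json_lines:
      encoding: utf8
      invalid_lines: error
      include_path_column: false
  - convert_column_types:
      - columns: image_url
        column_type: stream_info
"""


def write_mltable(folder, jsonl_name):
    """Write a file named 'MLTable' into `folder` that points at `jsonl_name`.

    Hypothetical helper: returns the path of the file it wrote.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    mltable_path = folder / "MLTable"
    mltable_path.write_text(
        MLTABLE_TEMPLATE.format(jsonl_name=jsonl_name), encoding="utf-8"
    )
    return mltable_path


if __name__ == "__main__":
    # Example: create ./training-mltable-folder/MLTable for train_annotations.jsonl
    print(write_mltable("training-mltable-folder", "train_annotations.jsonl"))
```

The folder that contains both the `MLTable` file and the JSONL file is what would be passed as the data input, as the next hunk header in this diff suggests.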
@@ -184,6 +182,6 @@ You can then pass in the `MLTable` as a data input for your AutoML training job.

 ## Related content

-- [Train computer vision models with automated machine learning](how-to-auto-train-image-models.md).
-- [Train a small object detection model with automated machine learning](how-to-use-automl-small-object-detect.md).
-- [Tutorial: Train an object detection model (preview) with AutoML and Python](tutorial-auto-train-image-models.md).
+- [Set up AutoML to train computer vision models](how-to-auto-train-image-models.md).
+- [Train a small object detection model with AutoML](how-to-use-automl-small-object-detect.md).
+- [Tutorial: Train an object detection model with AutoML and Python](tutorial-auto-train-image-models.md).
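
Reviewer note: the changed article tells readers to convert COCO or Pascal VOC annotations to JSONL but this diff does not show what that conversion involves. The sketch below is one possible reading for object detection, not the linked `coco2jsonl.py` script. The JSONL field names (`image_url`, `image_details`, `label`, `topX`, `topY`, `bottomX`, `bottomY`, `isCrowd`) are assumed from the AutoML images schema reference, and `image_url_prefix` is a hypothetical parameter for wherever the images were uploaded.

```python
import json


def coco_bbox_to_label(bbox, width, height, label, is_crowd=0):
    """Convert a COCO [x, y, w, h] pixel box to normalized corner coordinates."""
    x, y, w, h = bbox
    return {
        "label": label,
        "topX": x / width,
        "topY": y / height,
        "bottomX": (x + w) / width,
        "bottomY": (y + h) / height,
        "isCrowd": is_crowd,
    }


def coco_to_jsonl_lines(coco, image_url_prefix):
    """Return one JSON string per image, in the assumed AutoML JSONL layout."""
    categories = {c["id"]: c["name"] for c in coco["categories"]}

    # Group annotations by the image they belong to.
    annotations_by_image = {}
    for ann in coco["annotations"]:
        annotations_by_image.setdefault(ann["image_id"], []).append(ann)

    lines = []
    for img in coco["images"]:
        labels = [
            coco_bbox_to_label(
                ann["bbox"],
                img["width"],
                img["height"],
                categories[ann["category_id"]],
                ann.get("iscrowd", 0),
            )
            for ann in annotations_by_image.get(img["id"], [])
        ]
        record = {
            "image_url": image_url_prefix + img["file_name"],
            "image_details": {
                "format": img["file_name"].rsplit(".", 1)[-1],
                "width": f"{img['width']}px",
                "height": f"{img['height']}px",
            },
            "label": labels,
        }
        lines.append(json.dumps(record))
    return lines


if __name__ == "__main__":
    sample = {
        "images": [{"id": 1, "file_name": "1.jpg", "width": 200, "height": 100}],
        "categories": [{"id": 7, "name": "can"}],
        "annotations": [{"image_id": 1, "category_id": 7, "bbox": [20, 10, 100, 50]}],
    }
    for line in coco_to_jsonl_lines(sample, "azureml://example/images/"):
        print(line)
```

Each returned string is one line of the *.jsonl* file. For real datasets, the schema reference and the linked helper script remain the authoritative sources.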
