Commit 3582085

Merge pull request #156 from TimShererWithAquent/us302403a
AI Freshness - Machine Learning how-to
2 parents 236f1b1 + d08f94e commit 3582085

1 file changed: 44 additions, 43 deletions

@@ -1,49 +1,50 @@
 ---
 title: Prepare data for computer vision tasks
 titleSuffix: Azure Machine Learning
-description: Image data preparation for Azure Machine Learning automated ML to train computer vision models on classification, object detection, and segmentation
+description: Learn about image data preparation for Azure Machine Learning to train computer vision models on classification, object detection, and segmentation.
 author: ssalgadodev
 ms.author: ssalgado
 ms.service: azure-machine-learning
 ms.subservice: automl
 ms.topic: how-to
-ms.custom: template-how-to, update-code, sdkv2,
+ms.custom: template-how-to, update-code, sdkv2
 ms.reviewer: rvadthyavath
-ms.date: 03/26/2024
+ms.date: 09/06/2024
+#customer intent: As a data scientist, I want to prepare image data for training computer vision models.
 ---
 
 # Prepare data for computer vision tasks with automated machine learning
 
 [!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]
 
-
 > [!IMPORTANT]
 > Support for training computer vision models with automated ML in Azure Machine Learning is an experimental public preview feature. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
 
-In this article, you learn how to prepare image data for training computer vision models with [automated machine learning in Azure Machine Learning](concept-automated-ml.md).
-
-To generate models for computer vision tasks with automated machine learning, you need to bring labeled image data as input for model training in the form of an `MLTable`.
+In this article, you learn how to prepare image data for training computer vision models with [automated machine learning in Azure Machine Learning](concept-automated-ml.md). To generate models for computer vision tasks with automated machine learning, you need to bring labeled image data as input for model training in the form of an `MLTable`.
 
-You can create an `MLTable` from labeled training data in JSONL format.
-If your labeled training data is in a different format (like, pascal VOC or COCO), you can use a [conversion script](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to first convert it to JSONL, and then create an `MLTable`. Alternatively, you can use Azure Machine Learning's [data labeling tool](how-to-create-image-labeling-projects.md) to manually label images, and export the labeled data to use for training your AutoML model.
+You can create an `MLTable` from labeled training data in JSONL format. If your labeled training data is in a different format, like Pascal Visual Object Classes (VOC) or COCO, you can use a [conversion script](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert it to JSONL, and then create an `MLTable`. Alternatively, you can use Azure Machine Learning's data labeling tool to manually label images. Then export the labeled data to use for training your AutoML model.
 
 ## Prerequisites
 
-* Familiarize yourself with the accepted [schemas for JSONL files for AutoML computer vision experiments](reference-automl-images-schema.md).
+- Familiarize yourself with the accepted [schemas for JSONL files for AutoML computer vision experiments](reference-automl-images-schema.md).
+
+## Get labeled data
+
+In order to train computer vision models using AutoML, you need to get labeled training data. The images need to be uploaded to the cloud. Label annotations need to be in JSONL format. You can either use the Azure Machine Learning Data Labeling tool to label your data or you could start with prelabeled image data.
 
-## Get labeled data
-In order to train computer vision models using AutoML, you need to first get labeled training data. The images need to be uploaded to the cloud and label annotations need to be in JSONL format. You can either use the Azure Machine Learning Data Labeling tool to label your data or you could start with prelabeled image data.
+### Use Azure Machine Learning Data Labeling tool to label your training data
 
-### Using Azure Machine Learning Data Labeling tool to label your training data
-If you don't have prelabeled data, you can use Azure Machine Learning's [data labeling tool](how-to-create-image-labeling-projects.md) to manually label images. This tool automatically generates the data required for training in the accepted format.
+If you don't have prelabeled data, you can use Azure Machine Learning's data labeling tool to manually label images. This tool automatically generates the data required for training in the accepted format. For more information, see [Set up an image labeling project](how-to-create-image-labeling-projects.md).
 
-It helps to create, manage, and monitor data labeling tasks for
+The tool helps to create, manage, and monitor data labeling tasks for:
 
-+ Image classification (multi-class and multi-label)
-+ Object detection (bounding box)
-+ Instance segmentation (polygon)
+- Image classification (multi-class and multi-label)
+- Object detection (bounding box)
+- Instance segmentation (polygon)
 
-If you already have labeled data you want to use, you can [export your labeled data as an Azure Machine Learning Dataset](how-to-manage-labeling-projects.md#export-the-labels) and then access the dataset under 'Datasets' tab in Azure Machine Learning studio. This exported dataset can then be passed as an input using `azureml:<tabulardataset_name>:<version>` format. Here's an example of how to pass existing dataset as input for training computer vision models.
+If you already have labeled data to use, export that labeled data as an Azure Machine Learning Dataset and access the dataset under the **Datasets** tab in Azure Machine Learning studio. You can pass this exported dataset as an input using `azureml:<tabulardataset_name>:<version>` format. For more information, see [Export the labels](how-to-manage-labeling-projects.md#export-the-labels).
+
+Here's an example of how to pass existing dataset as input for training computer vision models.
 
 # [Azure CLI](#tab/cli)
 
@@ -77,18 +78,18 @@ Refer to CLI/SDK tabs for reference.
 
 ---
 
-### Using prelabeled training data from local machine
-If you have labeled data that you would like to use to train your model, you need to upload the images to Azure. You can upload the your images to the default Azure Blob Storage of your Azure Machine Learning Workspace and register it as a [data asset](how-to-create-data-assets.md).
+### Use prelabeled training data from local machine
 
-The following script uploads the image data on your local machine at path "./data/odFridgeObjects" to datastore in Azure Blob Storage. It then creates a new data asset with the name "fridge-items-images-object-detection" in your Azure Machine Learning Workspace.
+If you have labeled data that you want to use to train your model, upload the images to Azure. You can upload your images to the default Azure Blob Storage of your Azure Machine Learning Workspace. Register it as a *data asset*. For more information, see [Create and manage data assets](how-to-create-data-assets.md).
 
+The following script uploads the image data on your local machine at path *./data/odFridgeObjects* to datastore in Azure Blob Storage. It then creates a new data asset with the name `fridge-items-images-object-detection` in your Azure Machine Learning Workspace.
 
-If there already exists a data asset with the name "fridge-items-images-object-detection" in your Azure Machine Learning Workspace, it updates the version number of the data asset and points it to the new location where the image data uploaded.
+If there already exists a data asset with the name `fridge-items-images-object-detection` in your Azure Machine Learning Workspace, the code updates the version number of the data asset and points it to the new location where the image data uploaded.
 
 # [Azure CLI](#tab/cli)
 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]
 
-Create an .yml file with the following configuration.
+Create an *.yml* file with the following configuration.
 
 ```yml
 $schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
@@ -98,30 +99,30 @@ path: ./data/odFridgeObjects
 type: uri_folder
 ```
 
-To upload the images as a data asset, you run the following CLI v2 command with the path to your .yml file, workspace name, resource group, and subscription ID.
+To upload the images as a data asset, run the following CLI v2 command with the path to your *.yml* file, workspace name, resource group, and subscription ID.
 
 ```azurecli
 az ml data create -f [PATH_TO_YML_FILE] --workspace-name [YOUR_AZURE_WORKSPACE] --resource-group [YOUR_AZURE_RESOURCE_GROUP] --subscription [YOUR_AZURE_SUBSCRIPTION]
 ```
 
 # [Python SDK](#tab/python)
 
-[!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]
+[!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]
 
 [!Notebook-python[] (~/azureml-examples-main/sdk/python/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=upload-data)]
 
 # [Studio](#tab/Studio)
 
-![Animation showing how to register a dataset from local files](media\how-to-prepare-datasets-for-automl-images\ui-dataset-local.gif)
+:::image type="content" source="media\how-to-prepare-datasets-for-automl-images\ui-dataset-local.gif" alt-text="Animation showing how to register a dataset from local files.":::
 
 ---
 
-If you already have your data present in an existing datastore and want to create a data asset out of it, you can do so by providing the path to the data in the datastore, instead of providing the path of your local machine. Update the code [above](#using-prelabeled-training-data-from-local-machine) with the following snippet.
+If you already have your data in an existing datastore, you can create a data asset out of it. Provide the path to the data in the datastore instead of the path of your local machine. Update [the preceding code](#use-prelabeled-training-data-from-local-machine) with the following snippet.
 
 # [Azure CLI](#tab/cli)
 [!INCLUDE [cli v2](includes/machine-learning-cli-v2.md)]
 
-Create an .yml file with the following configuration.
+Create a *.yml* file with the following configuration.
 
 ```yml
 $schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
@@ -133,7 +134,6 @@ type: uri_folder
 
 # [Python SDK](#tab/python)
 
-
 ```Python
 my_data = Data(
 path="azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image_data_folder>",
@@ -145,24 +145,25 @@ my_data = Data(
 
 # [Studio](#tab/Studio)
 
-![Animation showing how to register a dataset from data already present in datastore](media\how-to-prepare-datasets-for-automl-images\ui-dataset-datastore.gif)
+:::image type="content" source="media\how-to-prepare-datasets-for-automl-images\ui-dataset-datastore.gif" alt-text="Animation showing how to register a dataset from data already present in datastore.":::
 
 ---
 
-Next, you need to get the label annotations in JSONL format. The schema of labeled data depends on the computer vision task at hand. Refer to [schemas for JSONL files for AutoML computer vision experiments](reference-automl-images-schema.md) to learn more about the required JSONL schema for each task type.
+Next, get the label annotations in JSONL format. The schema of labeled data depends on the computer vision task at hand. To learn more about the required JSONL schema for each task type, see [Data schemas to train computer vision models with automated machine learning](reference-automl-images-schema.md).
+
+If your training data is in a different format, like pascal VOC or COCO, [helper scripts](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) can convert the data to JSONL. The scripts are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs).
 
-If your training data is in a different format (like, pascal VOC or COCO), [helper scripts](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert the data to JSONL are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs).
+After you create the *.jsonl* file, you can register it as a data asset using the UI. Make sure that you select `stream` type in schema section as shown in this animation.
 
-Once you created jsonl file following the above steps, you can register it as a data asset using UI. Make sure you select `stream` type in schema section as shown in this animation.
+:::image type="content" source="media\how-to-prepare-datasets-for-automl-images\ui-dataset-jsnol.gif" alt-text="Animation showing how to register a data asset from the jsonl files.":::
 
-![Animation showing how to register a data asset from the jsonl files](media\how-to-prepare-datasets-for-automl-images\ui-dataset-jsnol.gif)
+### Use prelabeled training data from Azure Blob storage
 
-### Using prelabeled training data from Azure Blob storage
-If you have your labeled training data present in a container in Azure Blob storage, then you can access it directly from there by [creating a datastore referring to that container](how-to-datastore.md#create-an-azure-blob-datastore).
+If your labeled training data is present in a container in Azure Blob storage, you can access it directly. Create a datastore to that container. For more information, see [Create and manage data assets](how-to-datastore.md#create-an-azure-blob-datastore).
 
 ## Create MLTable
 
-Once you have your labeled data in JSONL format, you can use it to create `MLTable` as shown in this yaml snippet. MLtable packages your data into a consumable object for training.
+After your labeled data is in JSONL format, you can use it to create `MLTable` as shown in this yaml snippet. MLtable packages your data into a consumable object for training.
 
 ```yaml
 paths:
@@ -177,10 +178,10 @@ transformations:
 column_type: stream_info
 ```
 
-You can then pass in the `MLTable` as a [data input for your AutoML training job](./how-to-auto-train-image-models.md#consume-data).
+You can then pass in the `MLTable` as a data input for your AutoML training job. For more information, see [Set up AutoML to train computer vision models](./how-to-auto-train-image-models.md#consume-data).
 
-## Next steps
+## Related content
 
-* [Train computer vision models with automated machine learning](how-to-auto-train-image-models.md).
-* [Train a small object detection model with automated machine learning](how-to-use-automl-small-object-detect.md).
-* [Tutorial: Train an object detection model (preview) with AutoML and Python](tutorial-auto-train-image-models.md).
+- [Set up AutoML to train computer vision models](how-to-auto-train-image-models.md).
+- [Train a small object detection model with AutoML](how-to-use-automl-small-object-detect.md).
+- [Tutorial: Train an object detection model with AutoML and Python](tutorial-auto-train-image-models.md).
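
Reviewer note on the COCO-to-JSONL step: the article links to a `coco2jsonl.py` helper but the diff never shows what the conversion does. The sketch below is not that script. It is a minimal illustration, assuming COCO bounding boxes are absolute `[x, y, width, height]` values and that the target object detection schema uses normalized `topX`, `topY`, `bottomX`, `bottomY` fields plus an `image_details` block, as I read the schema reference the article points to. The function name `coco_to_jsonl_lines` and the sample data are hypothetical; check field names against `reference-automl-images-schema.md` before relying on them.

```python
import json


def coco_to_jsonl_lines(coco, image_url_prefix):
    """Convert a COCO-style dict into one JSON string per image.

    Assumed output fields (verify against the schema reference):
    image_url, image_details, label[topX, topY, bottomX, bottomY, isCrowd].
    """
    categories = {c["id"]: c["name"] for c in coco["categories"]}

    # Group annotations by the image they belong to.
    annotations_by_image = {}
    for ann in coco["annotations"]:
        annotations_by_image.setdefault(ann["image_id"], []).append(ann)

    lines = []
    for image in coco["images"]:
        width, height = image["width"], image["height"]
        labels = []
        for ann in annotations_by_image.get(image["id"], []):
            x, y, w, h = ann["bbox"]  # COCO: absolute pixels, top-left + size
            labels.append({
                "label": categories[ann["category_id"]],
                # Normalize corner coordinates to the 0..1 range.
                "topX": x / width,
                "topY": y / height,
                "bottomX": (x + w) / width,
                "bottomY": (y + h) / height,
                "isCrowd": int(ann.get("iscrowd", 0)),
            })
        record = {
            "image_url": image_url_prefix + image["file_name"],
            "image_details": {
                "format": image["file_name"].rsplit(".", 1)[-1],
                "width": f"{width}px",
                "height": f"{height}px",
            },
            "label": labels,
        }
        lines.append(json.dumps(record))
    return lines


# Hypothetical one-image dataset, for illustration only.
sample_coco = {
    "images": [{"id": 1, "file_name": "1.jpg", "width": 400, "height": 200}],
    "categories": [{"id": 7, "name": "can"}],
    "annotations": [
        {"image_id": 1, "category_id": 7, "bbox": [100, 50, 200, 100], "iscrowd": 0}
    ],
}

# The prefix reuses the placeholder datastore path shown in the diff.
prefix = (
    "azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>"
    "/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image_data_folder>/"
)
for line in coco_to_jsonl_lines(sample_coco, prefix):
    print(line)
```

Writing each returned string on its own line of a text file yields the *.jsonl* file that the article then registers as a data asset.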

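
Reviewer note on the "Label annotations need to be in JSONL format" requirement: a quick pre-flight check can catch malformed files before they are registered. The following is a sketch only, using the Python standard library. The helper name `find_jsonl_problems` is hypothetical, and the required keys `image_url` and `label` are an assumption based on the schema reference linked in the article; adjust them for the task type in use.

```python
import json

# Assumed top-level keys; confirm against the JSONL schema reference.
REQUIRED_KEYS = ("image_url", "label")


def find_jsonl_problems(text):
    """Return a list of (line_number, message) for lines that look invalid."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            if key not in record:
                problems.append((number, f"missing key: {key}"))
    return problems


# Hypothetical content: one valid line, one broken line, one incomplete line.
sample_text = "\n".join([
    '{"image_url": "a.jpg", "label": "can"}',
    "{not json}",
    '{"image_url": "b.jpg"}',
])
for line_number, message in find_jsonl_problems(sample_text):
    print(f"line {line_number}: {message}")
```

An empty result means every line parsed as a JSON object with the assumed keys; it does not prove the file satisfies the full schema for a given task.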