
Commit 8409532

Use existing Blob storage as datastore
Provide a description of how to use existing Azure Blob storage as a datastore in an AML workspace
1 parent 15a9f56 commit 8409532


articles/machine-learning/how-to-prepare-datasets-for-automl-images.md

Lines changed: 10 additions & 4 deletions
@@ -47,12 +47,12 @@ It helps to create, manage, and monitor data labeling tasks for

If you already have a data labeling project and you want to use that data, you can [export your labeled data as an Azure ML Dataset](how-to-create-image-labeling-projects.md#export-the-labels). You can then access the exported dataset under the 'Datasets' tab in Azure ML Studio, and download the underlying JSONL file from the Dataset details page under Data sources. The downloaded JSONL file can then be used to create an `MLTable` that can be used by automated ML for training computer vision models.

-### Using pre-labeled training data
+### Using pre-labeled training data from local machine
If you have previously labeled data that you would like to use to train your model, you will first need to upload the images to the default Azure Blob Storage of your Azure ML workspace and register them as a data asset.

The following script uploads the image data on your local machine at path "./data/odFridgeObjects" to a datastore in Azure Blob Storage. It then creates a new data asset named "fridge-items-images-object-detection" in your Azure ML workspace.

-If there already exists a data asset with name "fridge-items-images-object-detection" in your Azure ML Workspace, then it'll update its version number of data asset and make it point to new datastore in Azure Blob Storage where we uploaded the image data.
+If a data asset named "fridge-items-images-object-detection" already exists in your Azure ML workspace, the script creates a new version of it that points to the location in the Azure Blob Storage datastore where the image data was uploaded.

# [Azure CLI](#tab/cli)
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
@@ -80,7 +80,7 @@ az ml data create -f [PATH_TO_YML_FILE] --workspace-name [YOUR_AZURE_WORKSPACE]
[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=upload-data)]
---

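For orientation, here is a minimal Python SDK v2 sketch of the upload-and-register step performed in the tabs above. It assumes the `azure-ai-ml` and `azure-identity` packages are installed; the subscription, resource group, and workspace values are placeholders, and the linked notebook remains the authoritative example.

```python
# Minimal sketch: upload a local image folder and register it as a data asset.
# Placeholder values below must be replaced with your own workspace details.
from azure.ai.ml import MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<my-subscription-id>",
    resource_group_name="<my-resource-group>",
    workspace_name="<my-workspace>",
)

# Uploads the local folder to the workspace's default blob datastore and
# registers (or versions) the data asset pointing at the uploaded location.
my_data = Data(
    path="./data/odFridgeObjects",
    type=AssetTypes.URI_FOLDER,
    name="fridge-items-images-object-detection",
    description="Fridge-items images Object detection",
)
uri_folder_data_asset = ml_client.data.create_or_update(my_data)
print(uri_folder_data_asset.path)
```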
-If you already have your data present in Azure Blob Storage and want to create data asset out of it, you can do so by providing path to the location in Azure Blob Storage as shown below.
+If your data is already present in an existing datastore and you want to create a data asset from it, you can do so by providing the path to the data in the datastore, as shown below, instead of a path on your local machine.

# [Azure CLI](#tab/cli)
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
@@ -91,7 +91,7 @@ Create a .yml file with the following configuration.
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: fridge-items-images-object-detection
description: Fridge-items images Object detection
path: azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image_data_folder>
type: uri_folder
```
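For the Python SDK, a comparable sketch (placeholder URI segments, reusing `ml_client` from the earlier sketch) registers the data asset from a path in the existing datastore instead of uploading a local folder.

```python
# Sketch only: the azureml:// URI segments are placeholders to fill in.
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data

datastore_path = (
    "azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>"
    "/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image_data_folder>"
)

my_data = Data(
    path=datastore_path,  # points at data already in the datastore; nothing is uploaded
    type=AssetTypes.URI_FOLDER,
    name="fridge-items-images-object-detection",
    description="Fridge-items images Object detection",
)
ml_client.data.create_or_update(my_data)
```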
@@ -112,6 +112,12 @@ Next, you will need to get the label annotations in JSONL format. The schema of

If your training data is in a different format (like Pascal VOC or COCO), [helper scripts](https://github.com/Azure/azureml-examples/blob/main/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert the data to JSONL are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/sdk-preview/sdk/jobs/automl-standalone-jobs).

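To illustrate the target format, the hypothetical helper below (not the linked `coco2jsonl.py` script) sketches how a COCO-style pixel bounding box maps onto an object-detection JSONL record with normalized `topX`/`topY`/`bottomX`/`bottomY` coordinates; the URL and values are illustrative only.

```python
# Illustrative sketch of one object-detection JSONL record; use the linked
# helper scripts for real conversions.
import json

def coco_box_to_jsonl_record(image_url, img_w, img_h, annotations, category_names):
    """annotations: COCO-style dicts with 'bbox' = [x, y, width, height] in pixels."""
    labels = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        labels.append({
            "label": category_names[ann["category_id"]],
            "topX": x / img_w,           # normalized left edge
            "topY": y / img_h,           # normalized top edge
            "bottomX": (x + w) / img_w,  # normalized right edge
            "bottomY": (y + h) / img_h,  # normalized bottom edge
            "isCrowd": ann.get("iscrowd", 0),
        })
    return {
        "image_url": image_url,
        "image_details": {"format": "jpg", "width": f"{img_w}px", "height": f"{img_h}px"},
        "label": labels,
    }

record = coco_box_to_jsonl_record(
    image_url="azureml://datastores/workspaceblobstore/paths/odFridgeObjects/images/1.jpg",
    img_w=499,
    img_h=666,
    annotations=[{"bbox": [128, 137, 86, 149], "category_id": 1, "iscrowd": 0}],
    category_names={1: "can"},
)
print(json.dumps(record))  # one such line per image in the .jsonl file
```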
+
+### Using pre-labeled training data from Azure Blob storage
+If your labeled training data is already present in a container in Azure Blob storage, you can access it directly from there by [creating a datastore referring to that container](how-to-prepare-datasets-for-automl-images.md#create-an-azure-blob-datastore). Once you have created a datastore in your AML workspace linked to an existing blob container, you'll have to update the authentication details for that datastore: select the subscription ID and resource group, and provide either an account key or a SAS token.
+
+![Update Authentication for Datastore.](media/how-to-prepare-datasets-for-automl-images/update-datastore-authentication.png)
+
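As a programmatic alternative to the studio authentication dialog shown above, a minimal Python SDK v2 sketch of registering an existing blob container as a datastore follows; the datastore name, storage account, container, and key are placeholders, and `SasTokenConfiguration` can be used in place of an account key.

```python
# Sketch only: registers an existing blob container as a workspace datastore.
# Account, container, and key values are placeholders; ml_client as in the earlier sketch.
from azure.ai.ml.entities import AccountKeyConfiguration, AzureBlobDatastore

blob_datastore = AzureBlobDatastore(
    name="fridge_items_datastore",  # hypothetical datastore name
    description="Datastore pointing to an existing blob container with labeled images",
    account_name="<my-storage-account>",
    container_name="<my-container>",
    credentials=AccountKeyConfiguration(account_key="<my-account-key>"),
)
ml_client.create_or_update(blob_datastore)
```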
## Create MLTable

Once you have your labeled data in JSONL format, you can use it to create an `MLTable` as shown below. MLTable packages your data into a consumable object for training.
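As a quick sanity check, the resulting MLTable can also be loaded locally with the `mltable` package; this is a sketch, and the folder path is a placeholder for the directory containing the MLTable file.

```python
# Sketch: load the MLTable definition and preview the JSONL annotations it yields.
import mltable

tbl = mltable.load("<folder_containing_MLTable_file>")  # placeholder path
df = tbl.to_pandas_dataframe()
print(df.head())
```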
