Commit 0efd0f7

data asset using jsnol
2 parents 55a95cd + aa2bd0e commit 0efd0f7

3 files changed

+36
-8
lines changed

articles/machine-learning/how-to-prepare-datasets-for-automl-images.md

Lines changed: 36 additions & 8 deletions
@@ -47,12 +47,39 @@ It helps to create, manage, and monitor data labeling tasks for
4747

4848
If you already have a data labeling project and you want to use that data, you can [export your labeled data as an Azure ML Dataset](how-to-create-image-labeling-projects.md#export-the-labels) and then access the dataset under the 'Datasets' tab in Azure ML Studio. This exported dataset can then be passed as an input using the `azureml:<tabulardataset_name>:<version>` format. Here is an example of how to pass an existing dataset as input for training computer vision models.
4949

50+
# [Azure CLI](#tab/cli)
51+
52+
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
53+
54+
```yaml
55+
training_data:
56+
  path: azureml:odFridgeObjectsTrainingDataset:1
57+
  type: mltable
58+
  mode: direct
59+
```
60+
61+
# [Python SDK](#tab/python)
62+
63+
[!INCLUDE [sdk v2](../../includes/machine-learning-sdk-v2.md)]
64+
65+
```python
66+
from azure.ai.ml.constants import AssetTypes, InputOutputModes
67+
from azure.ai.ml import Input
68+
69+
# Training MLTable with v1 TabularDataset
70+
my_training_data_input = Input(
71+
    type=AssetTypes.MLTABLE, path="azureml:odFridgeObjectsTrainingDataset:1",
72+
    mode=InputOutputModes.DIRECT
73+
)
74+
```
75+
---
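As a small illustration of the `azureml:<tabulardataset_name>:<version>` format used in both tabs above, the reference string can be assembled from its two parts. This is only a sketch: `asset_reference` is a hypothetical helper, not part of the Azure ML SDK or CLI, and it assumes the name itself contains no colon.

```python
from typing import Union


def asset_reference(name: str, version: Union[int, str]) -> str:
    """Format a dataset name and version as 'azureml:<name>:<version>'.

    Hypothetical helper for illustration only; it is not an SDK function.
    """
    if not name or ":" in name:
        raise ValueError("name must be non-empty and must not contain ':'")
    return f"azureml:{name}:{version}"


# Reproduces the path value used in the examples above.
print(asset_reference("odFridgeObjectsTrainingDataset", 1))
```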
76+
5077
### Using pre-labeled training data from local machine
51-
If you have previously labeled data that you would like to use to train your model, you will first need to upload the images to the default Azure Blob Storage of your Azure ML Workspace and register it as a data asset.
78+
If you have previously labeled data that you would like to use to train your model, you will first need to upload the images to the default Azure Blob Storage of your Azure ML Workspace and register it as a [data asset](how-to-create-data-assets.md).
5279

53-
Below scripts uploads the image data on your local machine at path "./data/odFridgeObjects" to datastore in Azure Blob Storage. Thereafter, it creates a new data asset with the name "fridge-items-images-object-detection" in your Azure ML Workspace.
80+
The following script uploads the image data on your local machine at path "./data/odFridgeObjects" to a datastore in Azure Blob Storage. It then creates a new data asset with the name "fridge-items-images-object-detection" in your Azure ML Workspace.
5481

55-
If there already exists a data asset with name "fridge-items-images-object-detection" in your Azure ML Workspace, then it'll update its version number of data asset and make it point to new location in datastore in Azure Blob Storage where we uploaded the image data.
82+
If a data asset with the name "fridge-items-images-object-detection" already exists in your Azure ML Workspace, the script updates the version number of the data asset and points it to the new location where the image data was uploaded.
5683

5784
# [Azure CLI](#tab/cli)
5885
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
@@ -85,7 +112,7 @@ az ml data create -f [PATH_TO_YML_FILE] --workspace-name [YOUR_AZURE_WORKSPACE]
85112

86113
---
87114

88-
If you already have your data present in an existing datastore and want to create data asset out of it, you can do so by providing path to the data in datastore as shown below, instead of providing path on your local machine.
115+
If you already have your data present in an existing datastore and want to create a data asset out of it, you can do so by providing the path to the data in the datastore, instead of providing a path on your local machine. Update the code [above](how-to-prepare-datasets-for-automl-images.md#using-pre-labeled-training-data-from-local-machine) with the following snippet.
89116

90117
# [Azure CLI](#tab/cli)
91118
[!INCLUDE [cli v2](../../includes/machine-learning-cli-v2.md)]
@@ -96,7 +123,7 @@ Create a .yml file with the following configuration.
96123
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
97124
name: fridge-items-images-object-detection
98125
description: Fridge-items images Object detection
99-
path: azureml://subscriptions/<fmy-subscription-id>/resourcegroups/<my-resource-group>/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image_data_folder>
126+
path: azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image_data_folder>
100127
type: uri_folder
101128
```
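The long-form datastore path in the YAML above is easy to mistype, as the `<fmy-subscription-id>` fix in this change shows. A minimal sketch of assembling it from its components follows. `datastore_uri` and its parameter names are hypothetical and only mirror the placeholders in the YAML; the function is not an SDK call.

```python
def datastore_uri(subscription_id: str, resource_group: str, workspace: str,
                  datastore: str, relative_path: str) -> str:
    """Build an azureml:// path in the long form shown in the YAML above.

    Hypothetical helper for illustration only; segment names are copied from
    the placeholders in the YAML configuration.
    """
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/datastores/{datastore}"
        # Drop any leading slash so the result has no empty path segment.
        f"/paths/{relative_path.lstrip('/')}"
    )


print(datastore_uri("<my-subscription-id>", "<my-resource-group>",
                    "<my-workspace>", "<my-datastore>",
                    "<path_to_image_data_folder>"))
```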
102129
@@ -122,11 +149,12 @@ Next, you will need to get the label annotations in JSONL format. The schema of
122149

123150
If your training data is in a different format (like, pascal VOC or COCO), [helper scripts](https://github.com/Azure/azureml-examples/blob/v2samplesreorg/v1/python-sdk/tutorials/automl-with-azureml/image-object-detection/coco2jsonl.py) to convert the data to JSONL are available in [notebook examples](https://github.com/Azure/azureml-examples/blob/v2samplesreorg/sdk/python/jobs/automl-standalone-jobs).
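For annotations that are already in memory, a single JSONL line can be produced with the standard library alone. The sketch below assumes an object-detection schema with an `image_url` field and a `label` list whose entries hold `label`, `topX`, `topY`, `bottomX`, and `bottomY` as coordinates normalized by image width and height. These field names are an assumption here and should be checked against the schema document referenced above; `to_jsonl_line` is a hypothetical helper, not one of the linked conversion scripts.

```python
import json


def to_jsonl_line(image_url, width, height, boxes):
    """Serialize one image's annotations as a single JSON line.

    boxes: iterable of (label, x_min, y_min, x_max, y_max) in pixels.
    Field names are assumed, not taken from this document; verify them
    against the JSONL schema before use.
    """
    labels = [
        {
            "label": name,
            "topX": x_min / width,      # normalize to the range [0, 1]
            "topY": y_min / height,
            "bottomX": x_max / width,
            "bottomY": y_max / height,
        }
        for name, x_min, y_min, x_max, y_max in boxes
    ]
    return json.dumps({"image_url": image_url, "label": labels})


# One line per image; write each line followed by "\n" to build the file.
line = to_jsonl_line(
    "azureml://<path_to_image_data_folder>/images/1.jpg",
    width=400,
    height=200,
    boxes=[("can", 100, 50, 300, 150)],
)
print(line)
```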
124151

152+
Once you have created the JSONL file by following the above steps, you can register it as a data asset using the UI. Make sure you select the `stream` type in the schema section, as shown in the animation below.
125153

126-
### Using pre-labeled training data from Azure Blob storage
127-
If you have your labelled training data present in a container in Azure Blob storage, then you can access it directly from there by [creating a datastore referring to that container](how-to-datastore.md#create-an-azure-blob-datastore). Once you have created a datastore in AML workspace, linked to a existing container in blob, you'll have to update authentication details for that datastore. You'll have to select subscription id, resource group and provide either Account Key or SAS token.
154+
![Animation showing how to register a data asset from the JSONL files](media/how-to-prepare-datasets-for-automl-images/ui-dataset-jsnol.gif)
128155

129-
![Update Authentication for Datastore.](media/how-to-prepare-datasets-for-automl-images/update-datastore-authentication.png)
156+
### Using pre-labeled training data from Azure Blob storage
157+
If you have your labeled training data present in a container in Azure Blob storage, then you can access it directly from there by [creating a datastore referring to that container](how-to-datastore.md#create-an-azure-blob-datastore).
130158

131159
## Create MLTable
132160
