Commit de72375

committed
PM edits
1 parent 86306bd commit de72375

1 file changed

+11
-44
lines changed

articles/machine-learning/how-to-use-labeled-dataset.md

Lines changed: 11 additions & 44 deletions
@@ -8,14 +8,14 @@ ms.service: machine-learning
88
ms.subservice: mldata
99
ms.topic: how-to
1010
ms.custom: data4ml
11-
ms.date: 10/21/2021
11+
ms.date: 02/15/2022
1212

1313
# Customer intent: As an experienced Python developer, I need to export my data labels and use them for machine learning tasks.
1414
---
1515

1616
# Create and explore Azure Machine Learning dataset with labels
1717

18-
In this article, you'll learn how to export the data labels from an Azure Machine Learning data labeling project and load them into popular formats such as, a pandas dataframe for data exploration or a Torchvision dataset for image transformation.
18+
In this article, you'll learn how to export the data labels from an Azure Machine Learning data labeling project and load them into popular formats, such as a pandas dataframe for data exploration.
1919

2020
## What are datasets with labels
2121

@@ -25,7 +25,6 @@ We refer to Azure Machine Learning datasets with labels as labeled datasets. The
2525

2626
* An Azure subscription. If you don’t have an Azure subscription, create a [free account](https://azure.microsoft.com/free/) before you begin.
2727
* The [Azure Machine Learning SDK for Python](/python/api/overview/azure/ml/intro), or access to [Azure Machine Learning studio](https://ml.azure.com/).
28-
* Install the [azure-contrib-dataset](/python/api/azureml-contrib-dataset/) package
2928
* A Machine Learning workspace. See [Create an Azure Machine Learning workspace](how-to-manage-workspace.md).
3029
* Access to an Azure Machine Learning data labeling project. If you don't have a labeling project, first create one for [image labeling](how-to-create-image-labeling-projects.md) or [text labeling](how-to-create-text-labeling-projects.md).
3130

@@ -50,34 +49,31 @@ Once you have exported your labeled data to an Azure Machine Learning dataset, y
5049

5150
## Explore labeled datasets
5251

53-
Load your labeled datasets into a pandas dataframe or Torchvision dataset to leverage popular open-source libraries for data exploration, as well as PyTorch provided libraries for image transformation and training.
52+
Load your labeled datasets into a pandas dataframe to leverage popular open-source libraries for data exploration.
5453

5554
### Pandas dataframe
5655

57-
You can load labeled datasets into a pandas dataframe with the [`to_pandas_dataframe()`](/python/api/azureml-core/azureml.data.tabulardataset#to-pandas-dataframe-on-error--null---out-of-range-datetime--null--) method from the `azureml-contrib-dataset` class. Install the class with the following shell command:
56+
You can load labeled datasets into a pandas dataframe with the [`to_pandas_dataframe()`](/python/api/azureml-core/azureml.data.tabulardataset#to-pandas-dataframe-on-error--null---out-of-range-datetime--null--) method from the `azureml-dataprep` package. Install the package with the following shell command:
5857

5958
```shell
60-
pip install azureml-contrib-dataset
59+
pip install azureml-dataprep
6160
```
6261

63-
>[!NOTE]
64-
>The azureml.contrib namespace changes frequently, as we work to improve the service. As such, anything in this namespace should be considered as a preview, and not fully supported by Microsoft.
65-
66-
Azure Machine Learning offers the following file handling options for file streams when converting to a pandas dataframe.
67-
* Download: Download your data files to a local path.
68-
* Mount: Mount your data files to a mount point. Mount only works for Linux-based compute, including Azure Machine Learning notebook VM and Azure Machine Learning Compute.
62+
The exported dataset is a TabularDataset. If you plan to use the `download()` or `mount()` methods, be sure to set the parameter `stream_column='<image_url>'`.
6963

7064
In the following code, the `animal_labels` dataset is the output from a labeling project previously saved to the workspace.
7165

7266
```Python
7367
import azureml.core
74-
import azureml.contrib.dataset
7568
from azureml.core import Dataset, Workspace
76-
from azureml.contrib.dataset import FileHandlingOption
69+
7770

7871
# get animal_labels dataset from the workspace
7972
animal_labels = Dataset.get_by_name(workspace, 'animal_labels')
80-
animal_pd = animal_labels.to_pandas_dataframe(file_handling_option=FileHandlingOption.DOWNLOAD, target_path='./download/', overwrite_download=True)
73+
animal_pd = animal_labels.to_pandas_dataframe()
74+
75+
# download the images to a local path
76+
animal_labels.download('<image_url>')
8177

8278
import matplotlib.pyplot as plt
8379
import matplotlib.image as mpimg
@@ -87,33 +83,4 @@ img = mpimg.imread(animal_pd.loc[0,'image_url'])
8783
imgplot = plt.imshow(img)
8884
```
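
Once the labels are in a pandas dataframe, ordinary pandas operations are enough for a first look at the data. The following is a minimal sketch, not part of the original article: it builds a small stand-in dataframe instead of calling `to_pandas_dataframe()`, so it runs without a workspace. The `label` column name, the file paths, and the helper functions `label_counts` and `rows_with_label` are assumptions made for illustration; only the `image_url` column appears in the code above.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by animal_labels.to_pandas_dataframe().
# The 'label' column name and the file paths are assumptions for illustration.
animal_pd = pd.DataFrame(
    {
        "image_url": ["./download/cat1.jpg", "./download/dog1.jpg", "./download/cat2.jpg"],
        "label": ["cat", "dog", "cat"],
    }
)


def label_counts(df, label_column="label"):
    """Return the number of rows per label as a plain dict."""
    return df[label_column].value_counts().to_dict()


def rows_with_label(df, label, label_column="label"):
    """Return only the rows that carry the given label, re-indexed from 0."""
    return df[df[label_column] == label].reset_index(drop=True)


counts = label_counts(animal_pd)
cats = rows_with_label(animal_pd, "cat")

print(counts)
print(cats["image_url"].tolist())
```

Checking the per-label counts first is a quick way to spot class imbalance before you use the labeled data for training.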
8985

90-
### Torchvision datasets
91-
92-
You can load labeled datasets into Torchvision dataset with the [to_torchvision()](/python/api/azureml-contrib-dataset/azureml.contrib.dataset.tabulardataset#to-torchvision--) method also from the `azureml-contrib-dataset` class. To use this method, you need to have [PyTorch](https://pytorch.org/) installed.
93-
94-
In the following code, the `animal_labels` dataset is the output from a labeling project previously saved to the workspace.
95-
96-
```python
97-
import azureml.core
98-
import azureml.contrib.dataset
99-
from azureml.core import Dataset, Workspace
100-
from azureml.contrib.dataset import FileHandlingOption
101-
102-
from torchvision.transforms import functional as F
103-
104-
# get animal_labels dataset from the workspace
105-
animal_labels = Dataset.get_by_name(workspace, 'animal_labels')
106-
107-
# load animal_labels dataset into torchvision dataset
108-
pytorch_dataset = animal_labels.to_torchvision()
109-
img = pytorch_dataset[0][0]
110-
print(type(img))
111-
112-
# use methods from torchvision to transform the img into grayscale
113-
pil_image = F.to_pil_image(img)
114-
gray_image = F.to_grayscale(pil_image, num_output_channels=3)
115-
116-
imgplot = plt.imshow(gray_image)
117-
```
118-
11986
## Next steps
