Commit 432bb9f

performance + benefits
1 parent 6a751a1 commit 432bb9f

1 file changed: +9 -3 lines changed

articles/machine-learning/concept-data.md

Lines changed: 9 additions & 3 deletions
@@ -64,13 +64,19 @@ Supported cloud-based storage services in Azure that can be registered as datast
 
 ## Datasets
 
-Azure Machine Learning datasets are references that point to the data in your storage service. They aren't copies of your data, so no extra storage cost is incurred and the integrity of your original data sources aren't at risk.
+Azure Machine Learning datasets are references that point to the data in your storage service. They aren't copies of your data. By creating an Azure Machine Learning dataset, you create a reference to the data source location, along with a copy of its metadata.
 
-To interact with your data in storage, [create a dataset](how-to-create-register-datasets.md) to package your data into a consumable object for machine learning tasks. Register the dataset to your workspace to share and reuse it across different experiments without data ingestion complexities.
+Because datasets are lazily evaluated, and the data remains in its existing location, you
+
+* Incur no extra storage cost.
+* Don't risk unintentionally changing your original data sources.
+* Improve ML workflow performance speeds.
+
+To interact with your data in storage, [create a dataset](how-to-create-register-datasets.md) to package your data into a consumable object for machine learning tasks. Register the dataset to your workspace to share and reuse it across different experiments without data ingestion complexities.
 
 Datasets can be created from local files, public urls, [Azure Open Datasets](https://azure.microsoft.com/services/open-datasets/), or Azure storage services via datastores.
 
-We support 2 types of datasets:
+There are 2 types of datasets:
 
 + A [FileDataset](https://docs.microsoft.com/python/api/azureml-core/azureml.data.file_dataset.filedataset?view=azure-ml-py) references single or multiple files in your datastores or public URLs. If your data is already cleansed and ready to use in training experiments, you can [download or mount files](how-to-train-with-datasets.md#mount-files-to-remote-compute-targets) referenced by FileDatasets to your compute target.
 
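The added text describes a dataset as a lazily evaluated reference: a pointer to the source location plus a copy of its metadata, with the data left where it is. A minimal sketch of that idea in plain Python might look like the following. It does not use the Azure Machine Learning SDK; `LazyFileReference`, its `load` method, and the `demo` helper are hypothetical names introduced only for illustration, and a local temporary file stands in for a storage service.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class LazyFileReference:
    """Hypothetical stand-in for a dataset: a pointer plus metadata, no data copy."""

    path: str
    metadata: dict = field(init=False)
    reads: int = field(default=0, init=False)

    def __post_init__(self):
        # Creating the reference records metadata only; file contents are not read.
        self.metadata = {"size_bytes": os.path.getsize(self.path)}

    def load(self) -> bytes:
        # Data is read from its existing location only when it is requested.
        self.reads += 1
        with open(self.path, "rb") as f:
            return f.read()


def demo():
    """Create a source file, reference it lazily, then load it on demand."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "data.csv")
        with open(source, "wb") as f:
            f.write(b"a,b\n1,2\n")

        ref = LazyFileReference(source)
        files_before = sorted(os.listdir(tmp))  # still just the one source file
        reads_before = ref.reads                # nothing has been read yet
        content = ref.load()                    # first actual read of the data
        return ref.metadata, files_before, reads_before, ref.reads, content
```

Creating the reference leaves the directory with only the original file, which mirrors the "no extra storage cost" point, and the source is opened read-only, so it is never modified through the reference.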