Commit 816cd06

peer feedback
1 parent 14a5f38 commit 816cd06

1 file changed (+17, -7 lines changed)

articles/machine-learning/service/concept-data.md

Lines changed: 17 additions & 7 deletions
@@ -28,31 +28,41 @@ When you're ready to use the data in your storage, we recommend you

 1. Consuming it directly in Azure Machine Learning solutions like automated machine learning (automated ML) experiment runs, machine learning pipelines, and the designer.
 4. Create dataset monitors for your model input and output datasets to detect data drift.
-5. If data drift is detected, retrain your model accordingly.
+5. If data drift is detected, update your dataset and retrain your model accordingly.

 The following diagram provides a visual demonstration of this recommended data access workflow.

 ![Data-concept-diagram](media/concept-data/data-concept-diagram.svg)

 ## Access data in storage

-To access your data in your storage account, Azure Machine Learning offers datastores and datasets. Datastores provide a layer of abstraction over your storage service, this aids in security and ease of access to your storage, since connection information is kept in the datastore and not exposed in scripts. Datasets point to the specific file or files in your underlying storage that you want to use for your machine learning experiment. Together these offer a secure, scalable, and reproducible data delivery workflow for your machine learning tasks.
+To access your data in your storage account, Azure Machine Learning offers datastores and datasets. Datastores provide a layer of abstraction over your storage service. This improves security and ease of access, because connection information is kept in the datastore and not exposed in scripts. Datasets point to the specific file or files in your underlying storage that you want to use for your machine learning experiment. Together, datastores and datasets offer a secure, scalable, and reproducible data delivery workflow for your machine learning tasks.

 ### Datastores

-An Azure Machine Learning datastore is a storage abstraction over an Azure storage services account. [Register and create a datastore](how-to-access-data.md) to easily connect to your Azure storage account, and access the data in your underlying Azure storage services.
+An Azure Machine Learning datastore is a storage abstraction over your Azure storage services. [Register and create a datastore](how-to-access-data.md) to easily connect to your Azure storage account and access the data in your underlying Azure storage services.
+
+Supported Azure storage services that can be registered as datastores:
++ Azure Blob Container
++ Azure File Share
++ Azure Data Lake
++ Azure Data Lake Gen2
++ Azure SQL Database
++ Azure Database for PostgreSQL
++ Databricks File System
++ Azure Database for MySQL

 ### Datasets

-[Create an Azure Machine Learning dataset](how-to-create-register-datasets.md) to interact with data in your datastores or to package your data into a consumable object for machine learning tasks. Register the dataset to your workspace to share and reuse it across different experiments without data ingestion complexities.
+[Create an Azure Machine Learning dataset](how-to-create-register-datasets.md) to interact with data in your datastores and package your data into a consumable object for machine learning tasks. Register the dataset to your workspace to share and reuse it across different experiments without data ingestion complexities.

 Datasets can be created from local files, public URLs, [Azure Open Datasets](#open), or specific file(s) in your datastores. They aren't copies of your data, but are references that point to the data in your storage service, so no extra storage cost is incurred.

 The following diagram shows that if you don't have an Azure storage service, you can create a dataset directly from local files, public URLs, or an Azure Open Dataset. Doing so connects your dataset to the default datastore automatically created with your experiment's [Azure Machine Learning workspace](concept-workspace.md).

 ![Data-concept-diagram](media/concept-data/dataset-workflow.svg)

-Additional datasets capabilities can be found in the following articles.
+Additional dataset capabilities can be found in the following documentation:

 + [Version and track](how-to-version-track-datasets.md) dataset lineage.
 + [Monitor your dataset](how-to-monitor-datasets.md) to help with data drift detection.
@@ -68,8 +78,8 @@ With datasets, you can accomplish a number of machine learning tasks through sea
 + [Train machine learning models](how-to-train-with-datasets.md).
 + Consume datasets in
     + [automated ML experiments](how-to-create-portal-experiments.md)
-    + [machine learning pipelines](how-to-create-your-first-pipeline.md)
-    + the [designer](tutorial-designer-automobile-price-train-score.md#import-data)
+    + the [designer](tutorial-designer-automobile-price-train-score.md#import-data)
++ Access datasets for scoring with batch inference in [machine learning pipelines](how-to-create-your-first-pipeline.md).
 + Create a [data labeling project](#label).
 + Set up a dataset monitor for [data drift](#drift) detection.
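
The division of roles described in the changed text (a datastore keeps the connection information, a dataset is only a set of references into it, and registration makes the dataset reusable) can be illustrated with a small standard-library sketch. This is not the Azure Machine Learning SDK: `ToyDatastore`, `ToyDataset`, `ToyWorkspace`, the account and container names, and the `example.invalid` URL scheme are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ToyDatastore:
    """Hypothetical stand-in for a datastore: it owns the connection details."""
    name: str
    account: str
    container: str
    # The secret lives on the datastore, not in training scripts.
    # repr=False keeps it out of logs and printed objects.
    credential: str = field(default="", repr=False)

    def uri(self, relative_path: str) -> str:
        # Placeholder URL scheme (an assumption), not a real storage endpoint.
        return f"https://{self.account}.example.invalid/{self.container}/{relative_path}"


@dataclass(frozen=True)
class ToyDataset:
    """Hypothetical stand-in for a dataset: references to files, never copies."""
    datastore: ToyDatastore
    paths: Tuple[str, ...]

    def resolve(self) -> List[str]:
        # Turn the stored references into full locations only when needed.
        return [self.datastore.uri(p) for p in self.paths]


class ToyWorkspace:
    """Hypothetical registry that lets experiments share datasets by name."""

    def __init__(self) -> None:
        self._datasets: Dict[str, List[ToyDataset]] = {}

    def register(self, name: str, dataset: ToyDataset) -> int:
        # Each registration under the same name becomes a new version (1-based).
        versions = self._datasets.setdefault(name, [])
        versions.append(dataset)
        return len(versions)

    def get(self, name: str, version: Optional[int] = None) -> ToyDataset:
        versions = self._datasets[name]
        return versions[-1] if version is None else versions[version - 1]


# Usage: the script refers to the dataset by name and never handles the credential.
store = ToyDatastore("training_blob", "myaccount", "raw", credential="not-a-real-key")
ds = ToyDataset(store, ("sales/2020-01.csv", "sales/2020-02.csv"))
ws = ToyWorkspace()
version = ws.register("sales", ds)
resolved = ws.get("sales").resolve()
```

Because the dataset stores only relative paths plus a pointer to the datastore, registering it duplicates no data, which mirrors the "references, not copies" point in the changed text.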
