
Commit 44b6e57

Merge pull request #284973 from Blackmist/208346-fresh
freshness
2 parents: a5f3172 + a1523e7


articles/machine-learning/v1/how-to-data-ingest-adf.md

Lines changed: 5 additions & 5 deletions
@@ -9,7 +9,7 @@ ms.author: larryfr
 author: Blackmist
 manager: davete
 ms.reviewer: iefedore
-ms.date: 08/17/2022
+ms.date: 08/19/2024
 ms.topic: how-to
 ms.custom: UpdateFrequency5, devx-track-python, data4ml, sdkv1
 #Customer intent: As an experienced data engineer, I need to create a production data ingestion pipeline for the data used to train my models.
@@ -18,7 +18,7 @@ ms.custom: UpdateFrequency5, devx-track-python, data4ml, sdkv1
 
 # Data ingestion with Azure Data Factory
 
-In this article, you learn about the available options for building a data ingestion pipeline with [Azure Data Factory](../../data-factory/introduction.md). This Azure Data Factory pipeline is used to ingest data for use with [Azure Machine Learning](../overview-what-is-azure-machine-learning.md). Data Factory allows you to easily extract, transform, and load (ETL) data. Once the data has been transformed and loaded into storage, it can be used to train your machine learning models in Azure Machine Learning.
+In this article, you learn about the available options for building a data ingestion pipeline with [Azure Data Factory](../../data-factory/introduction.md). This Azure Data Factory pipeline is used to ingest data for use with [Azure Machine Learning](../overview-what-is-azure-machine-learning.md). Data Factory allows you to easily extract, transform, and load (ETL) data. Once the data is transformed and loaded into storage, it can be used to train your machine learning models in Azure Machine Learning.
 
 Simple data transformation can be handled with native Data Factory activities and instruments such as [data flow](../../data-factory/control-flow-execute-data-flow-activity.md). When it comes to more complicated scenarios, the data can be processed with some custom code. For example, Python or R code.
 
@@ -43,7 +43,7 @@ The function is invoked with the [Azure Data Factory Azure Function activity](..
 
 * Advantages:
     * The data is processed on a serverless compute with a relatively low latency
-    * Data Factory pipeline can invoke a [Durable Azure Function](../../azure-functions/durable/durable-functions-overview.md) that may implement a sophisticated data transformation flow
+    * Data Factory pipeline can invoke a [Durable Azure Function](../../azure-functions/durable/durable-functions-overview.md) that can implement a sophisticated data transformation flow
     * The details of the data transformation are abstracted away by the Azure Function that can be reused and invoked from other places
 * Disadvantages:
     * The Azure Functions must be created before use with ADF
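
For context on the Azure Function option covered by this hunk, here is a minimal sketch of an HTTP-triggered Azure Function (Python v1 programming model, assuming an HTTP trigger is configured in function.json) that a Data Factory Azure Function activity could call for a custom transformation step. The request-body key and the transformation logic are illustrative assumptions, not content from the article:

```python
import json

import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    # The Azure Function activity sends a JSON body; the "input_path" key
    # used here is an assumed convention, not one prescribed by Data Factory.
    body = req.get_json()
    input_path = body.get("input_path", "")

    # ... run custom Python transformation logic against input_path here ...

    # The Azure Function activity expects a JSON response, which downstream
    # pipeline activities can then consume.
    return func.HttpResponse(
        json.dumps({"status": "succeeded", "processed_path": input_path}),
        mimetype="application/json",
    )
```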
@@ -95,7 +95,7 @@ This method is recommended for [Machine Learning Operations (MLOps) workflows](c
 Each time the Data Factory pipeline runs,
 
 1. The data is saved to a different location in storage.
-1. To pass the location to Azure Machine Learning, the Data Factory pipeline calls an [Azure Machine Learning pipeline](../concept-ml-pipelines.md). When calling the ML pipeline, the data location and job ID are sent as parameters.
+1. To pass the location to Azure Machine Learning, the Data Factory pipeline calls an [Azure Machine Learning pipeline](../concept-ml-pipelines.md). When the Data Factory pipeline calls the Azure Machine Learning pipeline, the data location and job ID are sent as parameters.
 1. The ML pipeline can then create an Azure Machine Learning datastore and dataset with the data location. Learn more in [Execute Azure Machine Learning pipelines in Data Factory](../../data-factory/transform-data-machine-learning-service.md).
 
 ![Diagram shows an Azure Data Factory pipeline and an Azure Machine Learning pipeline and how they interact with raw data and prepared data. The Data Factory pipeline feeds data to the Prepared Data database, which feeds a data store, which feeds datasets in the Machine Learning workspace.](media/how-to-data-ingest-adf/aml-dataset.png)
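
For reference, a minimal Azure Machine Learning SDK v1 sketch of a pipeline that accepts the data location and job ID as pipeline parameters, which is how the calling Data Factory pipeline can pass them per run. The parameter names, compute target name, and training script are assumptions for illustration:

```python
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline, PipelineParameter
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

# Parameters supplied by the calling Data Factory pipeline; the names are illustrative.
data_path_param = PipelineParameter(name="data_path", default_value="prepared-data/")
adf_run_id_param = PipelineParameter(name="adf_run_id", default_value="")

train_step = PythonScriptStep(
    name="train-model",
    script_name="train.py",            # assumed training script
    source_directory="./src",
    compute_target="cpu-cluster",      # assumed existing compute target
    arguments=["--data-path", data_path_param, "--adf-run-id", adf_run_id_param],
)

pipeline = Pipeline(workspace=ws, steps=[train_step])
published_pipeline = pipeline.publish(name="training-pipeline")
```

Publishing the pipeline exposes a REST endpoint that the Data Factory Machine Learning Execute Pipeline activity can call with these parameter values on each run.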
@@ -136,7 +136,7 @@ adlsgen2_datastore = Datastore.register_azure_data_lake_gen2(
     client_id=client_id, # client id of service principal
 ```
 
-Next, create a dataset to reference the file(s) you want to use in your machine learning task.
+Next, create a dataset to reference the files you want to use in your machine learning task.
 
 The following code creates a TabularDataset from a csv file, `prepared-data.csv`. Learn more about [dataset types and accepted file formats](how-to-create-register-datasets.md#dataset-types).
 
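
As a companion to that step, here is a minimal SDK v1 sketch of what the dataset creation can look like against an ADLS Gen2 datastore such as the one registered earlier in the article; the datastore name, file path, and registered dataset name are assumptions:

```python
from azureml.core import Dataset, Datastore, Workspace

ws = Workspace.from_config()

# Retrieve the ADLS Gen2 datastore registered earlier; the name is assumed.
adlsgen2_datastore = Datastore.get(ws, "adlsgen2_datastore")

# Build a TabularDataset from the prepared CSV and register it so that
# training jobs can reference it by name and version.
prepared_dataset = Dataset.Tabular.from_delimited_files(
    path=[(adlsgen2_datastore, "prepared-data.csv")]
)
prepared_dataset = prepared_dataset.register(
    workspace=ws, name="prepared-data", create_new_version=True
)
```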