articles/machine-learning/concept-data-ingestion.md
6 additions & 6 deletions
@@ -15,10 +15,10 @@ ms.date: 02/25/2020
# Data ingestion in Azure Machine Learning
-In this article, you learn about the pros and cons of the following data ingestion options available with Azure Machine Learning. Depending on what data ingestion tasks you have, these options can be used individually for your data ingestion tasks, or together as part of your overall data ingestion workflow.
+In this article, you learn about the pros and cons of the following data ingestion options available with Azure Machine Learning. Depending on your data and data ingestion needs, you can use these options separately, or together as part of your overall data ingestion workflow.
1. [Azure Data Factory](#use-azure-data-factory) pipelines
@@ -38,7 +38,7 @@ Natively supports data source triggered data ingestion| Expensive to construct a
Integrated with various Azure tools like [Azure Databricks](https://docs.microsoft.com/azure/data-factory/transform-data-using-databricks-notebook) and [Azure Functions](https://docs.microsoft.com/azure/data-factory/control-flow-azure-function-activity) | Doesn't natively run scripts, instead relies on separate compute for script runs
Data preparation and model training processes are separate.|
Embedded data lineage capability for Azure Data Factory dataflows|
-Provides a low code experience user interface for non-scripting approaches |
+Provides a low code experience [user interface](https://docs.microsoft.com/azure/data-factory/quickstart-create-data-factory-portal) for non-scripting approaches |
These steps and the following diagram illustrate Azure Data Factory's data ingestion workflow.
@@ -50,9 +50,9 @@ These steps and the following diagram illustrate Azure Data Factory's data inges
## Use the Python SDK
-With the [Python SDK](https://docs.microsoft.com/python/api/overview/azureml-sdk/?view=azure-ml-py), you can incorporate data ingestion tasks, such as simple data transformations with Python, into an [Azure Machine Learning pipeline](how-to-create-your-first-pipeline.md) step.
+With the [Python SDK](https://docs.microsoft.com/python/api/overview/azureml-sdk/?view=azure-ml-py), you can incorporate data ingestion tasks into an [Azure Machine Learning pipeline](how-to-create-your-first-pipeline.md) step.
-The following table summarizes the pros and con for using the SDK for data ingestion tasks.
+The following table summarizes the pros and con for using the SDK and an ML pipelines step for data ingestion tasks.
Pros| Cons
---|---
@@ -61,7 +61,7 @@ Data preparation as part of every model training execution|Requires development
||Requires engineering practices to guarantee code quality and effectiveness
||Does not provide a user interface for creating the ingestion mechanism
-In the following diagram, the Azure Machine Learning pipeline consists of two steps: data ingestion and model training. Where the data ingestion step encompasses tasks that can be accomplished using Python libraries and the SDK, such as extracting the data from local/web sources, and data transformations, like missing value imputation. The training step then uses the prepared data as input to train your machine learning model.
+In the following diagram, the Azure Machine Learning pipeline consists of two steps: data ingestion and model training. The data ingestion step encompasses tasks that can be accomplished using Python libraries and the SDK, such as extracting the data from local/web sources, and basic data transformations, like missing value imputation. The training step then uses the prepared data as input to train your machine learning model.
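
The two-step flow described in this paragraph (ingest and impute, then train on the prepared data) can be sketched in plain Python. This is a minimal illustration that uses only the standard library, not the Azure Machine Learning SDK. The function names `ingest` and `train`, the sample data, and the column names are all hypothetical; in a real pipeline, each function would likely run as its own pipeline step.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data; the empty cells stand in for missing values.
RAW_CSV = """age,income
34,52000
,61000
45,
29,48000
"""


def ingest(raw_text):
    """Ingestion step: parse CSV text and fill missing cells with the column mean."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    prepared = {}
    for col in rows[0].keys():
        # Mean of the values that are actually present in this column.
        observed = [float(r[col]) for r in rows if r[col] != ""]
        fill = mean(observed)
        prepared[col] = [float(r[col]) if r[col] != "" else fill for r in rows]
    return prepared


def train(prepared):
    """Training step: fit a one-variable least-squares line, income ~ age."""
    x, y = prepared["age"], prepared["income"]
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


# The training step consumes the output of the ingestion step.
prepared = ingest(RAW_CSV)
slope, intercept = train(prepared)
```

Keeping imputation inside the ingestion function means the training function only ever sees complete data, which mirrors the separation between the two steps in the diagram.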
