articles/machine-learning/component-reference/component-reference.md (5 additions, 6 deletions)
@@ -1,6 +1,6 @@
---
title: "Algorithm & component reference"
- description: Learn about the Azure Machine Learning designer components you can use to create your own machine learning projects.
+ description: Learn about the Azure Machine Learning designer components that you can use to create your own machine learning projects.
titleSuffix: Azure Machine Learning
services: machine-learning
ms.service: machine-learning
@@ -31,7 +31,7 @@ For help with choosing algorithms, see
| --- | --- | --- |
| Data Input and Output | Move data from cloud sources into your pipeline. Write your results or intermediate data to Azure Storage, SQL Database, or Hive, while running a pipeline, or use cloud storage to exchange data between pipelines. |[Enter Data Manually](enter-data-manually.md) <br/> [Export Data](export-data.md) <br/> [Import Data](import-data.md)|
| Data Transformation | Operations on data that are unique to machine learning, such as normalizing or binning data, dimensionality reduction, and converting data among various file formats.| [Add Columns](add-columns.md) <br/> [Add Rows](add-rows.md) <br/> [Apply Math Operation](apply-math-operation.md) <br/> [Apply SQL Transformation](apply-sql-transformation.md) <br/> [Clean Missing Data](clean-missing-data.md) <br/> [Clip Values](clip-values.md) <br/> [Convert to CSV](convert-to-csv.md) <br/> [Convert to Dataset](convert-to-dataset.md) <br/> [Convert to Indicator Values](convert-to-indicator-values.md) <br/> [Edit Metadata](edit-metadata.md) <br/> [Group Data into Bins](group-data-into-bins.md) <br/> [Join Data](join-data.md) <br/> [Normalize Data](normalize-data.md) <br/> [Partition and Sample](partition-and-sample.md) <br/> [Remove Duplicate Rows](remove-duplicate-rows.md) <br/> [SMOTE](smote.md) <br/> [Select Columns Transform](select-columns-transform.md) <br/> [Select Columns in Dataset](select-columns-in-dataset.md) <br/> [Split Data](split-data.md) |
- | Feature Selection | Select a subset of relevant, useful features to use in building an analytical model. |[Filter Based Feature Selection](filter-based-feature-selection.md) <br/> [Permutation Feature Importance](permutation-feature-importance.md)|
+ | Feature Selection | Select a subset of relevant, useful features to use to build an analytical model. |[Filter Based Feature Selection](filter-based-feature-selection.md) <br/> [Permutation Feature Importance](permutation-feature-importance.md)|
| Statistical Functions | Provide a wide variety of statistical methods related to data science. |[Summarize Data](summarize-data.md)|

## Machine learning algorithms
@@ -40,7 +40,7 @@ For help with choosing algorithms, see
- Learn about the [web service components](web-service-input-output.md) which are necessary for real-time inference in Azure Machine Learning designer.
+ Learn about the [web service components](web-service-input-output.md), which are necessary for real-time inference in Azure Machine Learning designer.

## Error messages

- Learn about the [error messages and exception codes](designer-error-codes.md) you might encounter using components in Azure Machine Learning designer.
+ Learn about the [error messages and exception codes](designer-error-codes.md) that you might encounter when you use components in Azure Machine Learning designer.
articles/machine-learning/component-reference/import-data.md (14 additions, 13 deletions)
@@ -1,7 +1,7 @@
---
title: "Import Data: Component Reference"
titleSuffix: Azure Machine Learning
- description: Learn how to use the Import Data component in Azure Machine Learning to load data into a machine learning pipeline from existing cloud data services.
+ description: Learn how to use the Import Data component in Azure Machine Learning to load data into a machine learning pipeline from existing cloud data services.
services: machine-learning
ms.service: machine-learning
ms.subservice: core
@@ -33,7 +33,7 @@ The **Import Data** component support read data from following sources:
- Azure SQL Database
- Azure PostgreSQL

- Before using cloud storage, you need to register a datastore in your Azure Machine Learning workspace first. For more information, see [How to Access Data](../how-to-access-data.md).
+ Before you use cloud storage, you must first register a datastore in your Azure Machine Learning workspace. For more information, see [How to Access Data](../how-to-access-data.md).

After you define the data you want and connect to the source, **[Import Data](./import-data.md)** infers the data type of each column based on the values it contains, and loads the data into your designer pipeline. The output of **Import Data** is a dataset that can be used with any designer pipeline.
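The paragraph above notes that **Import Data** infers each column's data type from the values it contains. As a rough illustration of that kind of inference (a toy sketch, not the component's actual, undocumented rules), type inference over string-valued columns might look like this:

```python
# Toy sketch of per-column type inference, loosely analogous to what a
# tabular import step does. Illustrative only; this is not the Import
# Data component's real algorithm or API.

def infer_column_type(values):
    """Return 'int', 'float', or 'string' for a list of raw string values."""
    def parses_as(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if parses_as(int):
        return "int"
    if parses_as(float):
        return "float"
    return "string"

print(infer_column_type(["3", "14", "159"]))  # int
print(infer_column_type(["1.5", "2"]))        # float
print(infer_column_type(["a", "2"]))          # string
```

The try-the-narrowest-type-first ordering (int before float before string) mirrors how most tabular importers settle on the most specific type that fits every value in the column.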
@@ -51,8 +51,9 @@ If your source data changes, you can refresh the dataset and add new data by rer
1. Select **Data source**, and choose the data source type. It could be HTTP or datastore.

- If you choose datastore, you can select existing datastores that already registered to your Azure Machine Learning workspace or create a new datastore. Then define the path of data to import in the datastore. You can easily browse the path by click **Browse Path**
- 
+ If you choose datastore, you can select existing datastores that are already registered to your Azure Machine Learning workspace, or create a new datastore. Then define the path of the data to import in the datastore. You can browse to the path by selecting **Browse Path**.
+
+ :::image type="content" source="media/module/import-data-path.png" alt-text="Screenshot shows the Browse path link which opens the Path selection dialog box." lightbox="media/module/import-data-path.png":::

> [!NOTE]
> **Import Data** component is for **Tabular** data only.
@@ -64,13 +65,13 @@ If your source data changes, you can refresh the dataset and add new data by rer
1. Select the preview schema to filter the columns you want to include. You can also define advanced settings like Delimiter in Parsing options.

   :::image type="content" source="media/module/import-data.png" alt-text="Screenshot of the schema preview with Column 3, 4, 5 and 6 selected.":::

- 1. The checkbox, **Regenerate output**, decides whether to execute the component to regenerate output at running time.
+ 1. The **Regenerate output** checkbox decides whether to run the component again at run time to regenerate its output.

-    It's by default unselected, which means if the component has been executed with the same parameters previously, the system will reuse the output from last run to reduce run time.
+    By default, it's unselected. If the component previously ran with the same parameters, the system reuses the output from the last run to reduce run time.

-    If it is selected, the system will execute the component again to regenerate output. So select this option when underlying data in storage is updated, it can help to get the latest data.
+    If it's selected, the system runs the component again to regenerate the output. Select this option when the underlying data in storage is updated, so that you get the latest data.

1. Submit the pipeline.
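The reuse behavior described in this hunk (skip re-execution when a component already ran with the same parameters, unless regeneration is forced) is essentially memoization keyed on the component's parameters. A minimal sketch, with hypothetical names that are not part of any Azure Machine Learning API:

```python
# Minimal sketch of reuse-vs-regenerate semantics. run_component and
# regenerate_output are illustrative names, not real Azure ML API.
import json

_cache = {}

def run_component(params, execute, regenerate_output=False):
    key = json.dumps(params, sort_keys=True)  # stable key from parameters
    if not regenerate_output and key in _cache:
        return _cache[key]                    # reuse the last run's output
    result = execute(params)                  # actually run the component
    _cache[key] = result
    return result

calls = []
def execute(p):
    calls.append(p)
    return sum(p["values"])

run_component({"values": [1, 2]}, execute)                            # runs
run_component({"values": [1, 2]}, execute)                            # reused
run_component({"values": [1, 2]}, execute, regenerate_output=True)    # runs again
print(len(calls))  # 2
```

The sketch shows why forcing regeneration matters: a parameter-keyed cache can't see that the underlying data in storage changed, so only an explicit regenerate picks up the latest data.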
@@ -85,20 +86,20 @@ If your source data changes, you can refresh the dataset and add new data by rer
When import completes, right-click the output dataset and select **Visualize** to see if the data was imported successfully.

- If you want to save the data for reuse, rather than importing a new set of data each time the pipeline is run, select the **Register dataset** icon under the **Outputs+logs** tab in the right panel of the component. Choose a name for the dataset. The saved dataset preserves the data at the time of saving, the dataset is not updated when the pipeline is rerun, even if the dataset in the pipeline changes. This can be useful for taking snapshots of data.
+ If you want to save the data for reuse, rather than import a new set of data each time the pipeline runs, select the **Register dataset** icon under the **Outputs+logs** tab in the right panel of the component. Choose a name for the dataset. The saved dataset preserves the data at the time of saving. The dataset isn't updated when the pipeline reruns, even if the dataset in the pipeline changes. This behavior can be useful for taking snapshots of data.
- After importing the data, it might need some additional preparations for modeling and analysis:
+ After you import the data, it might need some additional preparation for modeling and analysis:

- - Use [Edit Metadata](./edit-metadata.md) to change column names, to handle a column as a different data type, or to indicate that some columns are labels or features.
+ - Use [Edit Metadata](./edit-metadata.md) to change column names, handle a column as a different data type, or indicate that some columns are labels or features.

- Use [Select Columns in Dataset](./select-columns-in-dataset.md) to select a subset of columns to transform or use in modeling. The transformed or removed columns can easily be rejoined to the original dataset by using the [Add Columns](./add-columns.md) component.

- Use [Partition and Sample](./partition-and-sample.md) to divide the dataset, perform sampling, or get the top n rows.
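The preparation steps listed in the bullets above have rough pandas analogues, shown here only to suggest the kinds of operations each component performs; these are not the designer components themselves:

```python
# Illustrative pandas analogues of the preparation components above.
# The designer components are drag-and-drop; this is only a conceptual
# mapping, not their implementation.
import pandas as pd

df = pd.DataFrame({"c1": ["a", "b", "c"], "c2": ["1", "2", "3"]})

# Edit Metadata: rename a column and treat it as a different data type.
df = df.rename(columns={"c2": "label"})
df["label"] = df["label"].astype(int)

# Select Columns in Dataset: keep a subset of columns.
subset = df[["label"]]

# Partition and Sample: get the top n rows.
top2 = df.head(2)

print(list(df.columns), len(top2))  # ['c1', 'label'] 2
```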
## Limitations

- Due to datstore access limitation, if your inference pipeline contains **Import Data** component, it will be auto-removed when deploy to real-time endpoint.
+ Due to a datastore access limitation, if your inference pipeline contains the **Import Data** component, the component is automatically removed when the pipeline is deployed to a real-time endpoint.

## Next steps

- See the [set of components available](component-reference.md) to Azure Machine Learning.
+ See the [set of components available](component-reference.md) to Azure Machine Learning.