articles/machine-learning/concept-automated-ml.md
+9 -9 Lines changed: 9 additions & 9 deletions
Original file line number
Diff line number
Diff line change
@@ -84,7 +84,7 @@ In every automated machine learning experiment, your data is automatically scale
84
84
|Scaling & normalization| Description |
85
85
| ------------- | ------------- |
86
86
|[StandardScaleWrapper](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html)| Standardize features by removing the mean and scaling to unit variance |
87
-
|[MinMaxScalar](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html)| Transforms features by scaling each feature by that column’s minimum and maximum |
87
+
|[MinMaxScalar](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html)| Transforms features by scaling each feature by that column's minimum and maximum |
88
88
|[MaxAbsScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler)|Scale each feature by its maximum absolute value |
89
89
|[RobustScalar](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html)|This Scaler features by their quantile range |
90
90
|[PCA](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)|Linear dimensionality reduction using Singular Value Decomposition of the data to project it to a lower dimensional space |
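The first three rows of the table above can be sketched in plain Python. This is an illustrative sketch only, not the code automated ML runs; the function names and the sample column are hypothetical, and constant columns are handled with a simple fallback that is an assumption of this example.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Remove the mean and scale to unit variance (population standard deviation)."""
    mean = fmean(column)
    std = pstdev(column) or 1.0  # assumption: leave constant columns unscaled
    return [(x - mean) / std for x in column]


def min_max_scale(column):
    """Scale a column to [0, 1] using that column's minimum and maximum."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # assumption: avoid dividing by zero
    return [(x - lo) / span for x in column]


def max_abs_scale(column):
    """Scale a column by its maximum absolute value, preserving sign and zeros."""
    peak = max(abs(x) for x in column) or 1.0
    return [x / peak for x in column]


column = [2.0, 4.0, 6.0, 8.0]  # hypothetical feature values
standardized = standardize(column)
min_maxed = min_max_scale(column)
max_absed = max_abs_scale([-4.0, 2.0])
```

Each function works on one column at a time, which mirrors the per-feature behavior the table describes.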
@@ -93,10 +93,10 @@ In every automated machine learning experiment, your data is automatically scale
Additional advanced preprocessing and featurization are also available, such as data guardrails, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#featurization).
96
+
Additional advanced preprocessing and featurization are also available, such as data guardrails, encoding, and transforms. [Learn more about what featurization is included](how-to-use-automated-ml-for-ml-models.md#featurization).
97
97
Enable this setting with:
98
98
99
-
+ Azure Machine Learning studio: Enable **Automatic featurization** in the **View additional configuration** section [with these steps](how-to-create-portal-experiments.md#create-and-run-experiment).
99
+
+ Azure Machine Learning studio: Enable **Automatic featurization** in the **View additional configuration** section [with these steps](how-to-use-automated-ml-for-ml-models.md#create-and-run-experiment).
100
100
101
101
+ Python SDK: Specifying `"feauturization": 'auto' / 'off' / 'FeaturizationConfig'` for the [`AutoMLConfig` class](/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig).
102
102
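As a rough sketch of the Python SDK option, the setting can be kept in a dictionary that is later unpacked into `AutoMLConfig`. The key is spelled `featurization`, as in the settings table in how-to-configure-auto-train.md; the task type, dataset variable, and label column below are placeholders assumed for illustration.

```python
# Settings that would be passed to AutoMLConfig as keyword arguments.
automl_settings = {
    "task": "classification",          # assumed task type for this sketch
    "experiment_timeout_minutes": 15,  # assumed value for this sketch
    "featurization": "auto",           # or "off", or a FeaturizationConfig object
}

# With the SDK installed, the dictionary would typically be unpacked like this
# (train_ds and "label" are placeholders that are not defined in this sketch):
# from azureml.train.automl import AutoMLConfig
# automl_config = AutoMLConfig(training_data=train_ds,
#                              label_column_name="label",
#                              **automl_settings)
```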
@@ -153,17 +153,17 @@ Model **C** represents a clear case of over-fitting; the training accuracy is ve
153
153
154
154
## Classification & regression
155
155
156
-
Classification and regression are the most common types of machine learning tasks. Both are types of supervised learning in which models learn using training data, and apply those learnings to new data. Azure Machine Learning offers featurizations specifically for these tasks, such as deep neural network text featurizers for classification. Learn more about [featurization options](how-to-create-portal-experiments.md#featurization).
156
+
Classification and regression are the most common types of machine learning tasks. Both are types of supervised learning in which models learn using training data, and apply those learnings to new data. Azure Machine Learning offers featurizations specifically for these tasks, such as deep neural network text featurizers for classification. Learn more about [featurization options](how-to-use-automated-ml-for-ml-models.md#featurization).
157
157
158
158
The main goal of classification models is to predict which categories new data will fall into based on learnings from its training data. Common classification examples include fraud detection, handwriting recognition, and object detection. Learn more and see an example of [classification with automated machine learning](tutorial-train-models-with-aml.md).
159
159
160
160
Different from classification where predicted output values are categorical, regression models predict numerical output values based on independent predictors. In regression, the objective is to help establish the relationship among those independent predictor variables by estimating how one variable impacts the others. For example, automobile price based on features like, gas mileage, safety rating, etc. Learn more and see an example of [regression with automated machine learning](tutorial-auto-train-models.md).
161
161
162
162
## Time-series forecasting
163
163
164
-
Building forecasts is an integral part of any business, whether it’s revenue, inventory, sales, or customer demand. You can use automated ML to combine techniques and approaches and get a recommended, high-quality time-series forecast.
164
+
Building forecasts is an integral part of any business, whether it's revenue, inventory, sales, or customer demand. You can use automated ML to combine techniques and approaches and get a recommended, high-quality time-series forecast.
165
165
166
-
An automated time-series experiment is treated as a multivariate regression problem. Past time-series values are “pivoted” to become additional dimensions for the regressor together with other predictors. This approach, unlike classical time series methods, has an advantage of naturally incorporating multiple contextual variables and their relationship to one another during training. Automated ML learns a single, but often internally branched model for all items in the dataset and prediction horizons. More data is thus available to estimate model parameters and generalization to unseen series becomes possible.
166
+
An automated time-series experiment is treated as a multivariate regression problem. Past time-series values are "pivoted" to become additional dimensions for the regressor together with other predictors. This approach, unlike classical time series methods, has an advantage of naturally incorporating multiple contextual variables and their relationship to one another during training. Automated ML learns a single, but often internally branched model for all items in the dataset and prediction horizons. More data is thus available to estimate model parameters and generalization to unseen series becomes possible.
167
167
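The "pivoting" of past values into regressor inputs can be illustrated with a small sketch. It is not automated ML's implementation; the function name, the lag count, and the sample data are assumptions made for this example.

```python
def pivot_lags(series, n_lags, extras=None):
    """Turn a series into (features, target) rows for a regressor.

    Each row's features are the previous n_lags values (most recent first),
    optionally followed by other predictors known at that time step.
    """
    rows = []
    for t in range(n_lags, len(series)):
        features = series[t - n_lags:t][::-1]
        if extras is not None:
            features = features + list(extras[t])
        rows.append((features, series[t]))
    return rows


demand = [10, 12, 13, 15, 18]                  # hypothetical daily demand
is_weekend = [(0,), (0,), (1,), (1,), (0,)]    # hypothetical contextual variable
rows = pivot_lags(demand, n_lags=2, extras=is_weekend)
```

Here the first row is `([12, 10, 1], 13)`: two lagged demand values plus the weekend flag as features, and the current demand as the target. Rows from several series can be stacked into one training set, which is how a single model can cover all items.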
168
168
Learn more and see an example of [automated machine learning for time series forecasting](how-to-auto-train-forecast.md). Or, see the [energy demand notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb) for detailed code examples of advanced forecasting configuration including:
169
169
@@ -191,7 +191,7 @@ Imbalanced data is commonly found in data for machine learning classification sc
191
191
192
192
As part of its goal of simplifying the machine learning workflow, automated ML has built in capabilities to help deal with imbalanced data such as,
193
193
194
-
- A **weight column**: automated ML supports a weighted column as input, causing rows in the data to be weighted up or down, which can make a class more or less “important”.
194
+
- A **weight column**: automated ML supports a weighted column as input, causing rows in the data to be weighted up or down, which can make a class more or less "important".
195
195
196
196
- The algorithms used by automated ML can properly handle imbalance of up to 20:1, meaning the most common class can have 20 times more rows in the data than the least common class.
197
197
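A quick way to check a dataset against the 20:1 guideline, and to build values for a weight column, is sketched below. The inverse-frequency weighting scheme and all names are assumptions for illustration; automated ML accepts whatever weights you supply.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Rows in the most common class divided by rows in the least common class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def inverse_frequency_weights(labels):
    """One weight per row, so that every class carries the same total weight."""
    counts = Counter(labels)
    total, n_classes = len(labels), len(counts)
    return [total / (n_classes * counts[y]) for y in labels]


labels = ["ok"] * 95 + ["fraud"] * 5   # hypothetical label column
ratio = imbalance_ratio(labels)        # 19.0, within the 20:1 guideline
weights = inverse_frequency_weights(labels)
```

With these weights each "fraud" row counts for 10.0 and each "ok" row for roughly 0.53, so both classes contribute equally in total.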
@@ -213,7 +213,7 @@ The following techniques are additional options to handle imbalanced data outsid
213
213
214
214
- Resampling to even the class imbalance, either by up-sampling the smaller classes or down-sampling the larger classes. These methods require expertise to process and analyze.
215
215
216
-
- Use a performance metric that deals better with imbalanced data. For example, the F1 score is a weighted average of precision and recall. Precision measures a classifier’s exactness-- low precision indicates a high number of false positives--, while recall measures a classifier’s completeness-- low recall indicates a high number of false negatives.
216
+
- Use a performance metric that deals better with imbalanced data. For example, the F1 score is a weighted average of precision and recall. Precision measures a classifier's exactness-- low precision indicates a high number of false positives--, while recall measures a classifier's completeness-- low recall indicates a high number of false negatives.
217
217
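To make the metric concrete, the sketch below computes precision, recall, and F1 for a small imbalanced example. F1 is computed as the harmonic mean of precision and recall; the function name and the sample labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0, 0, 0]   # hypothetical ground truth
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]   # hypothetical predictions
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this example accuracy is 0.625 while F1 is only 0.4, because two of the three positive rows are missed. That gap is why F1 can be a better guide than accuracy on imbalanced data.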
218
218
## Use with ONNX in C# apps
219
219
@@ -286,7 +286,7 @@ See examples and learn how to build models using automated machine learning:
286
286
+ Follow the [Tutorial: Automatically train a regression model with Azure Machine Learning](tutorial-auto-train-models.md)
287
287
288
288
+ Configure the settings for automatic training experiment:
289
-
+ In Azure Machine Learning studio, [use these steps](how-to-create-portal-experiments.md).
289
+
+ In Azure Machine Learning studio, [use these steps](how-to-use-automated-ml-for-ml-models.md).
290
290
+ With the Python SDK, [use these steps](how-to-configure-auto-train.md).
291
291
292
292
+ Learn how to auto train using time series data, [use these steps](how-to-auto-train-forecast.md).
articles/machine-learning/concept-train-machine-learning-model.md
+1 -1 Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ Define the iterations, hyperparameter settings, featurization, and other setting
64
64
* [Examples: Jupyter Notebook examples for automated machine learning](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning)
65
65
* [How to: Configure automated ML experiments in Python](how-to-configure-auto-train.md)
66
66
* [How to: Autotrain a time-series forecast model](how-to-auto-train-forecast.md)
67
-
* [How to: Create, explore, and deploy automated machine learning experiments with [Azure Machine Learning studio](how-to-create-portal-experiments.md)
67
+
* [How to: Create, explore, and deploy automated machine learning experiments with Azure Machine Learning studio](how-to-use-automated-ml-for-ml-models.md)
articles/machine-learning/how-to-configure-auto-train.md
+4 -4 Lines changed: 4 additions & 4 deletions
@@ -30,7 +30,7 @@ Configuration options available in automated machine learning:
30
30
* Explore model metrics
31
31
* Register and deploy model
32
32
33
-
If you prefer a no code experience, you can also [Create your automated machine learning experiments in Azure Machine Learning studio](how-to-create-portal-experiments.md).
33
+
If you prefer a no code experience, you can also [Create your automated machine learning experiments in Azure Machine Learning studio](how-to-use-automated-ml-for-ml-models.md).
34
34
35
35
## Select your experiment type
36
36
@@ -169,7 +169,7 @@ Some examples include:
169
169
170
170
The three different `task` parameter values (the third task-type is `forecasting`, and uses a similar algorithm pool as `regression` tasks) determine the list of models to apply. Use the `whitelist` or `blacklist` parameters to further modify iterations with the available models to include or exclude. The list of supported models can be found on [SupportedModels Class](https://docs.microsoft.com/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels) for ([Classification](https://docs.microsoft.com/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.classification), [Forecasting](https://docs.microsoft.com/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting), and [Regression](https://docs.microsoft.com/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.regression)).
171
171
172
-
Automated ML's validation serivce will require that `experiment_timeout_minutes` be set to a minimum timeout of 15 minutes in order to help avoid experiment timeout failures.
172
+
Automated ML's validation service will require that `experiment_timeout_minutes` be set to a minimum timeout of 15 minutes in order to help avoid experiment timeout failures.
173
173
174
174
### Primary Metric
175
175
The primary metric determines the metric to be used during model training for optimization. The available metrics you can select is determined by the task type you choose, and the following table shows valid primary metrics for each task type.
@@ -186,15 +186,15 @@ Learn about the specific definitions of these metrics in [Understand automated m
186
186
187
187
### Data featurization
188
188
189
-
In every automated machine learning experiment, your data is [automatically scaled and normalized](concept-automated-ml.md#preprocess) to help *certain* algorithms that are sensitive to features that are on different scales. However, you can also enable additional featurization, such as missing values imputation, encoding, and transforms. [Learn more about what featurization is included](how-to-create-portal-experiments.md#featurization).
189
+
In every automated machine learning experiment, your data is [automatically scaled and normalized](concept-automated-ml.md#preprocess) to help *certain* algorithms that are sensitive to features that are on different scales. However, you can also enable additional featurization, such as missing values imputation, encoding, and transforms. [Learn more about what featurization is included](how-to-use-automated-ml-for-ml-models.md#featurization).
190
190
191
191
When configuring your experiments, you can enable the advanced setting `featurization`. The following table shows the accepted settings for featurization in the [`AutoMLConfig` class](https://docs.microsoft.com/python/api/azureml-train-automl/azureml.train.automl.automlconfig?view=azure-ml-py).
192
192
193
193
|Featurization Configuration | Description |
194
194
|-------------|-------------|
195
195
|`"featurization":` `'FeaturizationConfig'`| Indicates customized featurization step should be used. [Learn how to customize featurization](how-to-configure-auto-train.md#customize-feature-engineering).|
196
196
|`"featurization": 'off'`| Indicates featurization step should not be done automatically.|
197
-
|`"featurization": 'auto'`| Indicates that as part of preprocessing, [data guardrails and featurization steps](how-to-create-portal-experiments.md#advanced-featurization-options) are performed automatically.|
197
+
|`"featurization": 'auto'`| Indicates that as part of preprocessing, [data guardrails and featurization steps](how-to-use-automated-ml-for-ml-models.md#advanced-featurization-options) are performed automatically.|
articles/machine-learning/how-to-create-register-datasets.md
+1 -1 Lines changed: 1 addition & 1 deletion
@@ -210,7 +210,7 @@ To create a dataset in the studio:
210
210
1. Select **Tabular** or **File** for Dataset type.
211
211
1. Select **Next** to open the **Datastore and file selection** form. On this form you select where to keep your dataset after creation, as well as select what data files to use for your dataset.
212
212
1. Select **Next** to populate the **Settings and preview** and **Schema** forms; they are intelligently populated based on file type and you can further configure your dataset prior to creation on these forms.
213
-
1. Select **Next** to review the **Confirm details** form. Check your selections and create an optional data profile for your dataset. Learn more about [data profiling](how-to-create-portal-experiments.md#profile).
213
+
1. Select **Next** to review the **Confirm details** form. Check your selections and create an optional data profile for your dataset. Learn more about [data profiling](how-to-use-automated-ml-for-ml-models.md#profile).
214
214
1. Select **Create** to complete your dataset creation.