---
title: Model sweeping and selection for forecasting in AutoML
titleSuffix: Azure Machine Learning
description: Explore how you can use automated machine learning (AutoML) in Azure Machine Learning to search for (sweep) and select forecasting models.
services: machine-learning
author: ssalgadodev
ms.author: ssalgado
ms.reviewer: chuantian
ms.service: azure-machine-learning
ms.subservice: automl
ms.topic: concept-article
ms.custom: automl, sdkv1
ms.date: 09/25/2024

#customer intent: As a developer, I want to use AutoML in Azure Machine Learning, so I can search for (sweep) and select forecasting models.
---

# Model sweeping and selection for forecasting in AutoML

This article describes how automated machine learning (AutoML) in Azure Machine Learning searches for and selects forecasting models. If you're interested in learning more about the forecasting methodology in AutoML, see [Overview of forecasting methods in AutoML](concept-automl-forecasting-methods.md). To explore training examples for forecasting models in AutoML, see [Set up AutoML to train a time-series forecasting model with the SDK and CLI](how-to-auto-train-forecast.md).

<a name="model-sweeping"></a>

## Model sweeping in AutoML

The central task for AutoML is to train and evaluate several models and choose the best one with respect to the given primary metric. The word "model" in this case refers to both the model class, such as ARIMA or Random Forest, and the specific hyper-parameter settings that distinguish models within a class. For instance, ARIMA refers to a class of models that share a mathematical template and a set of statistical assumptions. Training, or _fitting_, an ARIMA model requires a list of positive integers that specify the precise mathematical form of the model. These values are the hyper-parameters. The models ARIMA(1, 0, 1) and ARIMA(2, 1, 2) have the same class but different hyper-parameters, so they can be separately fit with the training data and evaluated against each other. AutoML searches, or _sweeps_, over different model classes and within classes by varying the hyper-parameters.
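
To make the class-versus-hyper-parameter distinction concrete, here's a minimal sketch (outside of AutoML) that fits two configurations from the ARIMA class on the same synthetic series with `statsmodels` and compares them with an information criterion. The library choice and data are illustrative assumptions; AutoML performs this kind of within-class comparison for you.

```python
# Minimal sketch: two models from the same ARIMA class, distinguished only by
# their (p, d, q) hyper-parameters, fit on the same series and compared by AIC.
# The synthetic series and the use of statsmodels are illustrative assumptions.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=200))  # synthetic random-walk series

fit_101 = ARIMA(series, order=(1, 0, 1)).fit()  # ARIMA(1, 0, 1)
fit_212 = ARIMA(series, order=(2, 1, 2)).fit()  # ARIMA(2, 1, 2)

# Lower AIC indicates a better trade-off between fit and model complexity.
print(f"ARIMA(1, 0, 1) AIC: {fit_101.aic:.1f}")
print(f"ARIMA(2, 1, 2) AIC: {fit_212.aic:.1f}")
```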

### Hyper-parameter sweeping methods

The following table shows the different hyper-parameter sweeping methods that AutoML uses for different model classes:

| Model class group | Model type | Hyper-parameter sweeping method |
| --- | --- | --- |
| Naive, Seasonal Naive, Average, Seasonal Average | Time series | No sweeping within class due to model simplicity |
| Exponential Smoothing, ARIMA(X) | Time series | Grid search for within-class sweeping |
| Prophet | Regression | No sweeping within class |
| Linear SGD, LARS LASSO, Elastic Net, K Nearest Neighbors, Decision Tree, Random Forest, Extremely Randomized Trees, Gradient Boosted Trees, LightGBM, XGBoost | Regression | AutoML's [model recommendation service](https://www.microsoft.com/research/publication/probabilistic-matrix-factorization-for-automated-machine-learning/) dynamically explores hyper-parameter spaces |
| ForecastTCN | Regression | Static list of models followed by random search over network size, dropout ratio, and learning rate |

For a description of the different model types, see the [Forecasting models in AutoML](concept-automl-forecasting-methods.md#forecasting-models-in-automl) section of the forecasting methods overview article.

The amount of sweeping by AutoML depends on the forecasting job configuration. You can specify the stopping criteria as a time limit or a limit on the number of trials, or the equivalent number of models. Early termination logic can be used in both cases to stop sweeping if the primary metric isn't improving.
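
As an example, with the Azure Machine Learning Python SDK (v2), these stopping criteria are expressed as job limits. The following is a hedged sketch; the compute name, data path, and column names are hypothetical placeholders.

```python
# Sketch of a forecasting job with explicit sweep limits (SDK v2).
# The compute name, data path, and column names are hypothetical placeholders.
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes

forecasting_job = automl.forecasting(
    compute="cpu-cluster",
    experiment_name="sweep-limits-example",
    training_data=Input(type=AssetTypes.MLTABLE, path="./train-mltable-folder"),
    target_column_name="demand",
    primary_metric="normalized_root_mean_squared_error",
    n_cross_validations=3,
)

forecasting_job.set_forecast_settings(
    time_column_name="timestamp",
    forecast_horizon=24,
)

# Stopping criteria: a time budget, a trial (model) count budget, and
# early termination when the primary metric stops improving.
forecasting_job.set_limits(
    timeout_minutes=120,
    trial_timeout_minutes=30,
    max_trials=25,
    enable_early_termination=True,
)
```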

<a name="model-selection"></a>

## Model selection in AutoML

AutoML follows a three-phase process to search for and select forecasting models:

- **Phase 1**: Sweep over time-series models and select the best model from _each class_ by using [penalized likelihood methods](https://otexts.com/fpp3/arima-estimation.html#information-criteria), such as information criteria.

- **Phase 2**: Sweep over regression models and rank them, along with the best time-series models from phase 1, according to their primary metric values from validation sets.

- **Phase 3**: Build an ensemble model from the top ranked models, calculate its validation metric, and rank it with the other models.

The model with the top ranked metric value at the end of phase 3 is designated the best model.

> [!IMPORTANT]
> In phase 3, AutoML always calculates metrics on **out-of-sample** data that isn't used to fit the models. This approach helps to protect against over-fitting.
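
Both the model classes that are swept in phases 1 and 2 and the ensembling performed in phase 3 can be influenced from the job configuration. The following sketch shows the relevant training settings in the Python SDK (v2), assuming a `forecasting_job` object like the one sketched earlier; the algorithm names are examples only.

```python
# Sketch: training settings that affect sweeping and phase-3 ensembling (SDK v2).
# `forecasting_job` is assumed to be an automl.forecasting(...) job object
# like the one in the earlier sketch.
forecasting_job.set_training(
    # Restrict the sweep to specific model classes (names are examples).
    allowed_training_algorithms=["Prophet", "ExponentialSmoothing", "LightGBM"],
    # Control whether ensemble models are built and ranked in phase 3.
    enable_vote_ensemble=True,
    enable_stack_ensemble=False,
)
```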

### Validation configurations

AutoML has two validation configurations: cross-validation and explicit validation data.

In the cross-validation case, AutoML uses the input configuration to create data splits into training and validation folds. Time order must be preserved in these splits, so AutoML uses **Rolling Origin Cross Validation**, which divides the series into training and validation data by using an origin time point. Sliding the origin in time generates the cross-validation folds. Each validation fold contains the next horizon of observations immediately following the position of the origin for the given fold. This strategy preserves the time-series data integrity and mitigates the risk of information leakage.

:::image type="content" source="media/how-to-auto-train-forecast/rolling-origin-cross-validation.png" border="false" alt-text="Diagram showing cross validation folds separating the training and validation sets based on the cross validation step size.":::

AutoML follows the usual cross-validation procedure, training a separate model on each fold and averaging validation metrics from all folds.
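
The following sketch is a conceptual illustration, not AutoML's internal implementation: it slides an origin through a series and takes the `horizon` observations after each origin as that fold's validation data, mirroring the diagram above.

```python
# Conceptual sketch of rolling origin cross-validation fold generation.
# This is an illustration, not AutoML's internal implementation.
from typing import List, Tuple

def rolling_origin_folds(
    n_points: int, horizon: int, n_folds: int, step_size: int
) -> List[Tuple[range, range]]:
    """Return (train_indices, validation_indices) for each fold."""
    folds = []
    # Place the last origin so that the final fold's horizon fits in the series.
    last_origin = n_points - horizon
    for k in reversed(range(n_folds)):
        origin = last_origin - k * step_size
        train = range(0, origin)                      # everything before the origin
        validation = range(origin, origin + horizon)  # next `horizon` points
        folds.append((train, validation))
    return folds

# Example: 100 observations, 3 folds, forecast horizon 10, step size of 5 periods.
for train, val in rolling_origin_folds(n_points=100, horizon=10, n_folds=3, step_size=5):
    print(f"train: 0..{train.stop - 1}, validation: {val.start}..{val.stop - 1}")
```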

Cross-validation for forecasting jobs is configured by setting the number of cross-validation folds and, optionally, the number of time periods between two consecutive cross-validation folds. For more information and an example of configuring cross-validation for forecasting, see [Custom cross-validation settings](how-to-auto-train-forecast.md#custom-cross-validation-settings).

You can also bring your own validation data. For more information, see [Configure training, validation, cross-validation, and test data in AutoML (SDK v1)](./v1/how-to-configure-cross-validation-data-splits.md#provide-validation-data).
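
For orientation, here's a hedged Python SDK (v2) sketch of both options, with hypothetical paths, compute, and column names. The fold count and step size drive rolling origin cross-validation, while an explicit validation MLtable replaces it.

```python
# Sketch: cross-validation settings versus explicit validation data (SDK v2).
# Paths, compute, and column names are hypothetical placeholders.
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes

# Option 1: rolling origin cross-validation with 5 folds, 3 periods between origins.
cv_job = automl.forecasting(
    compute="cpu-cluster",
    experiment_name="cv-config-example",
    training_data=Input(type=AssetTypes.MLTABLE, path="./train-mltable-folder"),
    target_column_name="demand",
    primary_metric="normalized_root_mean_squared_error",
    n_cross_validations=5,
)
cv_job.set_forecast_settings(
    time_column_name="timestamp",
    forecast_horizon=24,
    cv_step_size=3,
)

# Option 2: bring your own validation data instead of cross-validation.
holdout_job = automl.forecasting(
    compute="cpu-cluster",
    experiment_name="holdout-config-example",
    training_data=Input(type=AssetTypes.MLTABLE, path="./train-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./valid-mltable-folder"),
    target_column_name="demand",
    primary_metric="normalized_root_mean_squared_error",
)
holdout_job.set_forecast_settings(time_column_name="timestamp", forecast_horizon=24)
```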

## Related content

- [Set up AutoML to train a time-series forecasting model](how-to-auto-train-forecast.md)
- [Frequently asked questions about AutoML forecasting](how-to-automl-forecasting-faq.md)
- [Overview of forecasting methods in AutoML](concept-automl-forecasting-methods.md)