In this article, you'll learn how to set up AutoML training for time-series forecasting models with Azure Machine Learning automated ML in the [Azure Machine Learning Python SDK](/python/api/overview/azure/ai-ml-readme).
To do so, you:
@@ -141,6 +145,28 @@ Other settings are optional and reviewed in the [optional settings](#optional-se
Optional configurations are available for forecasting tasks, such as enabling deep learning and specifying a target rolling window aggregation. A complete list of parameters is available in the [forecast_settings API doc](/python/api/azure-ai-ml/azure.ai.ml.automl.forecastingjob#azure-ai-ml-automl-forecastingjob-set-forecast-settings).

+#### Model search settings
+
+There are two optional settings that control the model space where AutoML searches for the best model: `allowed_training_algorithms` and `blocked_training_algorithms`. To restrict the search space to a given set of model classes, use `allowed_training_algorithms` as in the following sample:
+
+```python
+# Only search ExponentialSmoothing and ElasticNet models
+forecasting_job.set_training(
+    allowed_training_algorithms=["ExponentialSmoothing", "ElasticNet"]
+)
+```
+
+In this case, the forecasting job _only_ searches over Exponential Smoothing and Elastic Net model classes. To remove a given set of model classes from the search space, use `blocked_training_algorithms` as in the following sample:
+
+```python
+# Search over all model classes except Prophet
+forecasting_job.set_training(
+    blocked_training_algorithms=["Prophet"]
+)
+```
+
+Now, the job searches over all model classes _except_ Prophet. For a list of forecasting model names that are accepted in `allowed_training_algorithms` and `blocked_training_algorithms`, see [supported forecasting models](/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting) and [supported regression models](/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.regression).
+
#### Enable deep learning
AutoML ships with a custom deep neural network (DNN) model called `ForecastTCN`. This model is a [temporal convolutional network](https://arxiv.org/abs/1803.01271), or TCN, that applies common imaging task methods to time series modeling. Namely, one-dimensional "causal" convolutions form the backbone of the network and enable the model to learn complex patterns over long durations in the training history.
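
To make the "causal" convolution idea concrete, here is a minimal pure-Python sketch. It is not the `ForecastTCN` implementation; the function name `causal_conv1d`, the zero left-padding, and the `dilation` parameter are assumptions chosen for illustration. The point is only that each output step depends on the current and earlier inputs, never on later ones.

```python
def causal_conv1d(series, kernel, dilation=1):
    """Causal 1-D convolution: out[t] uses series[t], series[t - d], series[t - 2d], ...

    Positions before the start of the series are treated as zeros (left padding).
    Illustrative helper only; not part of any AutoML API.
    """
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation  # look back only, never ahead
            if idx >= 0:
                total += weight * series[idx]
        out.append(total)
    return out


history = [1.0, 2.0, 3.0, 4.0, 5.0]
# kernel[0] weights the current step, kernel[1] the step `dilation` positions back
smoothed = causal_conv1d(history, kernel=[0.5, 0.5], dilation=2)
```

Stacking such layers with growing dilation is what lets a TCN cover long stretches of training history with few layers, while the look-back-only indexing keeps future values out of each prediction.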
One or both of these settings can be set to `"auto"` if you want AutoML to make the determination.
-
### Custom featurization
By default, AutoML augments training data with engineered features to increase the accuracy of the models. See [automated feature engineering](./concept-automl-forecasting-methods.md#automated-feature-engineering) for more information. Some of the preprocessing steps can be customized using the `set_featurization()` method of the forecasting job.
articles/machine-learning/how-to-automl-forecasting-faq.md (+18 -13: 18 additions & 13 deletions)
@@ -21,8 +21,8 @@ This article answers common questions about forecasting in AutoML. See the [meth
## How do I start building forecasting models in AutoML?
You can start by reading our guide on [setting up AutoML to train a time-series forecasting model with Python](./how-to-auto-train-forecast.md). We've also provided hands-on examples in several Jupyter notebooks:
-2. [Forecasting using deep learning](https://github.com/Azure/azureml-examples/blob/main/v1/python-sdk/tutorials/automl-with-azureml/forecasting-github-dau/auto-ml-forecasting-github-dau.ipynb)
+2. [Forecasting using deep learning](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-github-dau/auto-ml-forecasting-github-dau.ipynb)
@@ -36,7 +36,7 @@ One common source of slow runtime is training AutoML with default settings on da
## How can I make AutoML faster?
See the ["why is AutoML slow on my data"](#why-is-automl-slow-on-my-data) answer to understand why it may be slow in your case.
Consider the following configuration changes that may speed up your job:
-- Block time series models like ARIMA and Prophet
+- [Block time series models](./how-to-auto-train-forecast.md#model-search-settings) like ARIMA and Prophet
- Turn off look-back features like lags and rolling windows
- Reduce
  - number of trials/iterations
@@ -49,12 +49,12 @@ Consider the following configuration changes that may speed up your job:
There are four basic configurations supported by AutoML forecasting:

-Configuration|Scenario|Pros|Cons
---|--|--|--
-**Default AutoML**|Recommended if the dataset has a small number of time series that have roughly similar historic behavior.|<li> Simple to configure from code/SDK or AzureML Studio <br> <li> AutoML has the chance to cross-learn across different time series since the regression models pool all series together in training. See the [model grouping](./concept-automl-forecasting-methods.md#model-grouping) section for more information.|<li> Regression models may be less accurate if the time series in the training data have divergent behavior <br> <li> Time series models may take a long time to train if there are a large number of series in the training data. See the ["why is AutoML slow on my data"](#why-is-automl-slow-on-my-data) answer for more information.
-**AutoML with deep learning**|Recommended for datasets with more than 1000 observations and, potentially, numerous time series exhibiting complex patterns. When enabled, AutoML will sweep over temporal convolutional neural network (TCN) models during training. See the [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning) section for more information.|<li> Simple to configure from code/SDK or AzureML Studio <br> <li> Cross-learning opportunities since the TCN pools data over all series <br> <li> Potentially higher accuracy due to the large capacity of DNN models. See the [forecasting models in AutoML](./concept-automl-forecasting-methods.md#forecasting-models-in-automl) section for more information.|<li> Training can take much longer due to the complexity of DNN models <br> <li> Series with small amounts of history are unlikely to benefit from these models.
-**Many Models**|Recommended if you need to train and manage a large number of forecasting models in a scalable way. See the [forecasting at scale](./how-to-auto-train-forecast.md#forecasting-at-scale) section for more information.|<li> Scalable <br> <li> Potentially higher accuracy when time series have divergent behavior from one another.|<li> No cross-learning across time series <br> <li> You can't configure or launch Many Models jobs from AzureML Studio, only the code/SDK experience is currently available.
-**Hierarchical Time Series**|HTS is recommended if the series in your data have nested, hierarchical structure and you need to train or make forecasts at aggregated levels of the hierarchy. See the [hierarchical time series forecasting](how-to-auto-train-forecast.md#hierarchical-time-series-forecasting) section for more information.|<li> Training at aggregated levels can reduce noise in the leaf node time series and potentially lead to higher accuracy models. <br> <li> Forecasts can be retrieved for any level of the hierarchy by aggregating or dis-aggregating forecasts from the training level.|You need to provide the aggregation level for training. AutoML doesn't currently have an algorithm to find an optimal level.
+|Configuration|Scenario|Pros|Cons|
+|--|--|--|--|
+|**Default AutoML**|Recommended if the dataset has a small number of time series that have roughly similar historic behavior.|<li> Simple to configure from code/SDK or AzureML Studio <br> <li> AutoML has the chance to cross-learn across different time series since the regression models pool all series together in training. See the [model grouping](./concept-automl-forecasting-methods.md#model-grouping) section for more information.|<li> Regression models may be less accurate if the time series in the training data have divergent behavior <br> <li> Time series models may take a long time to train if there are a large number of series in the training data. See the ["why is AutoML slow on my data"](#why-is-automl-slow-on-my-data) answer for more information.|
+|**AutoML with deep learning**|Recommended for datasets with more than 1000 observations and, potentially, numerous time series exhibiting complex patterns. When enabled, AutoML will sweep over temporal convolutional neural network (TCN) models during training. See the [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning) section for more information.|<li> Simple to configure from code/SDK or AzureML Studio <br> <li> Cross-learning opportunities since the TCN pools data over all series <br> <li> Potentially higher accuracy due to the large capacity of DNN models. See the [forecasting models in AutoML](./concept-automl-forecasting-methods.md#forecasting-models-in-automl) section for more information.|<li> Training can take much longer due to the complexity of DNN models <br> <li> Series with small amounts of history are unlikely to benefit from these models.|
+|**Many Models**|Recommended if you need to train and manage a large number of forecasting models in a scalable way. See the [forecasting at scale](./how-to-auto-train-forecast.md#forecasting-at-scale) section for more information.|<li> Scalable <br> <li> Potentially higher accuracy when time series have divergent behavior from one another.|<li> No cross-learning across time series <br> <li> You can't configure or launch Many Models jobs from AzureML Studio, only the code/SDK experience is currently available.|
+|**Hierarchical Time Series**|HTS is recommended if the series in your data have nested, hierarchical structure and you need to train or make forecasts at aggregated levels of the hierarchy. See the [hierarchical time series forecasting](how-to-auto-train-forecast.md#hierarchical-time-series-forecasting) section for more information.|<li> Training at aggregated levels can reduce noise in the leaf node time series and potentially lead to higher accuracy models. <br> <li> Forecasts can be retrieved for any level of the hierarchy by aggregating or dis-aggregating forecasts from the training level.|You need to provide the aggregation level for training. AutoML doesn't currently have an algorithm to find an optimal level.|

> [!NOTE]
> We recommend using compute nodes with GPUs when deep learning is enabled to best take advantage of high DNN capacity. Training time can be much faster in comparison to nodes with only CPUs. See the GPU optimized compute article for more information.
@@ -68,7 +68,8 @@ AutoML uses machine learning best practices, such as cross-validated model selec
- The input data contains **feature columns that are derived from the target with a simple formula**. For example, a feature that is an exact multiple of the target can result in a nearly perfect training score. The model, however, will likely not generalize to out-of-sample data. We advise you to explore the data prior to model training and to drop columns that "leak" the target information.
- The training data uses **features that are not known into the future**, up to the forecast horizon. AutoML's regression models currently assume all features are known to the forecast horizon. We advise you to explore your data prior to training and remove any feature columns that are only known historically.
-- There are **significant structural differences - regime changes - between the training, validation, or test portions of the data**. For example, consider the effect of the COVID-19 pandemic on demand for almost any good during 2020 and 2021; this is a classic example of a regime change. Over-fitting due to regime change is the most challenging issue to address because it's highly scenario dependent and can require deep knowledge to identify. As a first line of defense, try to reserve 10 - 20% of the total history for validation, or cross-validation, data. It isn't always possible to reserve this amount of validation data if the training history is short, but is a best practice. See our guide on [configuring validation](./how-to-auto-train-forecast.md#training-and-validation-data) for more information.
+- There are **significant structural differences - regime changes - between the training, validation, or test portions of the data**. For example, consider the effect of the COVID-19 pandemic on demand for almost any good during 2020 and 2021; this is a classic example of a regime change. Over-fitting due to regime change is the most challenging issue to address because it's highly scenario dependent and can require deep knowledge to identify. As a first line of defense, try to reserve 10 - 20% of the total history for validation, or cross-validation, data. It isn't always possible to reserve this amount of validation data if the training history is short, but is a best practice. See our guide on [configuring validation](./how-to-auto-train-forecast.md#training-and-validation-data) for more information.
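
The first cause in the list above, a feature that is an exact multiple of the target, can be screened for before training. The following is a rough sketch in plain Python, not an AutoML feature: the helper `is_constant_multiple`, the tolerance, and the column names are all hypothetical, and a real check would also need to cover other simple formulas such as offsets or shifts.

```python
import math


def is_constant_multiple(feature, target, rel_tol=1e-9):
    """Return True if feature[i] == c * target[i] for one constant c (hypothetical helper)."""
    # A nonzero feature value where the target is zero rules out a pure multiple.
    if any(t == 0 and f != 0 for f, t in zip(feature, target)):
        return False
    ratios = [f / t for f, t in zip(feature, target) if t != 0]
    if len(ratios) < 2:
        return False  # too little evidence to call it a leak
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


target = [10.0, 12.0, 9.0, 15.0]
columns = {
    "units_x2": [20.0, 24.0, 18.0, 30.0],   # exactly 2 * target: leaks the target
    "temperature": [3.1, 4.0, 2.2, 5.5],    # unrelated to the target by a fixed ratio
}
leaky = [name for name, values in columns.items() if is_constant_multiple(values, target)]
```

Columns flagged this way are candidates to drop before you submit the training job.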
+

## What if my time series data doesn't have regularly spaced observations?
@@ -98,7 +99,11 @@ The primary metric is very important since its value on validation data determin
- Add new features that may help predict the target. Subject matter expertise can help greatly when selecting training data.
- Compare validation and test metric values and determine if the selected model is under-fitting or over-fitting the data. This knowledge can guide you to a better training configuration. For example, you might determine that you need to use more cross-validation folds in response to over-fitting.
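
As a generic illustration of what "more cross-validation folds" means for time series, the sketch below builds rolling-origin folds, where every validation window comes strictly after its training window. It is not AutoML's internal code; the function name `rolling_origin_folds` and its parameters are assumptions made for this example.

```python
def rolling_origin_folds(n_obs, n_folds, horizon, step=None):
    """Return (train_indices, valid_indices) pairs, earliest origin first.

    Each validation window has `horizon` points and starts right where its
    training window ends; successive origins are `step` points apart.
    """
    step = horizon if step is None else step
    folds = []
    for i in range(n_folds):
        valid_end = n_obs - i * step
        valid_start = valid_end - horizon
        if valid_start <= 0:
            raise ValueError("not enough history for the requested folds")
        folds.append((range(0, valid_start), range(valid_start, valid_end)))
    return folds[::-1]


# 10 observations, 3 folds, each validating on the next 2 points
folds = rolling_origin_folds(n_obs=10, n_folds=3, horizon=2)
summary = [(len(train), list(valid)) for train, valid in folds]
```

Adding folds averages the validation metric over more origins, which makes model selection less sensitive to any single stretch of history, at the cost of shorter training windows in the earliest folds.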

-### How do I fix an Out-Of-Memory error?
+## Will AutoML always select the same best model given the same training data and configuration?
+
+[AutoML's model search process](./concept-automl-forecasting-sweeping.md#model-sweeping) is not deterministic, so it does not always select the same model given the same data and configuration.
+
+## How do I fix an Out-Of-Memory error?

There are two types of memory issues:
- RAM Out-of-Memory
@@ -110,7 +115,7 @@ For default AutoML settings, RAM Out-of-Memory may be fixed by using compute nod
Disk Out-of-Memory errors may be resolved by deleting the compute cluster and creating a new one.

-###What advanced forecasting scenarios are supported by AutoML?
+## What advanced forecasting scenarios are supported by AutoML?

We support the following advanced prediction scenarios:
- Quantile forecasts
@@ -133,7 +138,7 @@ If your AutoML forecasting job fails, you'll see an error message in the studio
> [!NOTE]
> For Many Models or HTS jobs, training usually runs on multi-node compute clusters. Logs for these jobs are present for each node IP address, so you need to search for error logs in each node. The error logs, along with the driver logs, are in the `user_logs` folder for each node IP.

-###What is a workspace / environment / experiment/ compute instance / compute target?
+## What is a workspace / environment / experiment/ compute instance / compute target?

If you aren't familiar with Azure Machine Learning concepts, start with the ["What is AzureML"](overview-what-is-azure-machine-learning.md) article and the [workspaces](./concept-workspace.md) article.
In this article, you learn how to set up AutoML training for time-series forecasting models with Azure Machine Learning automated ML in the [Azure Machine Learning Python SDK](/python/api/overview/azure/ml/).