articles/machine-learning/how-to-auto-train-forecast.md
Lines changed: 3 additions & 3 deletions
@@ -50,7 +50,7 @@ For this article you need,
 Input data for AutoML forecasting must contain valid time series in tabular format. Each variable must have its own corresponding column in the data table. AutoML requires at least two columns: a **time column** representing the time axis and the **target column** which is the quantity to forecast. Other columns can serve as predictors. For more details, see [how AutoML uses your data](./concept-automl-forecasting-methods.md#how-automl-uses-your-data).

 > [!IMPORTANT]
-> When training a model for forecasting future values, ensure all the features used in training can be used when running predictions for your intended horizon. For example, a feature for current stock price could massively increase training accuracy. However, if you intend to forecast with a long horizon, you may not be able to accurately predict future stock values corresponding to future time-series points, and model accuracy could suffer.
+> When training a model for forecasting future values, ensure all the features used in training can be used when running predictions for your intended horizon. <br> <br> For example, a feature for current stock price could massively increase training accuracy. However, if you intend to forecast with a long horizon, you may not be able to accurately predict future stock values corresponding to future time-series points, and model accuracy could suffer.

 AutoML forecasting jobs require that your training data is represented as an **MLTable** object. An MLTable specifies a data source and steps for loading the data. For more information and use cases, see the [MLTable how-to guide](./how-to-mltable.md). As a simple example, suppose your training data is contained in a CSV file in a local directory, `./train_data/timeseries_train.csv`. You can define a new MLTable by copying the following YAML code to a new file, `./train_data/MLTable`:
@@ -217,8 +217,8 @@ AutoML has several actions it can take for short series. These actions are confi
 |Setting|Description
 |---|---
-|`auto`| The default value of the setting. <li> *If all series are short*, pad the data. <br> <li> *If not all series are short*, drop the short series.
-|`pad`| If `short_series_handling_config = pad`, then automated ML adds random values to each short series found. The following lists the column types and what they're padded with: <li>Object columns with NaNs <li> Numeric columns with 0 <li> Boolean/logic columns with False <li> The target column is padded with random values with mean of zero and standard deviation of 1.
+|`auto`| The default value for short series handling. <br> - _If all series are short_, pad the data. <br> - _If not all series are short_, drop the short series.
+|`pad`| If `short_series_handling_config = pad`, then automated ML adds random values to each short series found. The following lists the column types and what they're padded with: <br> - Object columns with NaNs <br> - Numeric columns with 0 <br> - Boolean/logic columns with False <br> - The target column is padded with random values with mean of zero and standard deviation of 1.
 |`drop`| If `short_series_handling_config = drop`, then automated ML drops the short series, and it will not be used for training or prediction. Predictions for these series will return NaN's.
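
Reviewer's illustration (not part of the commit): the `auto`, `pad`, and `drop` rules described in the table above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not AutoML's implementation. The names `pad_row`, `pad_series`, and `handle_short_series`, the `min_length` threshold, the `schema` mapping of column names to kinds, and the choice to prepend padding rows are all hypothetical. The table does not say how the time column is treated, so it is left out here.

```python
import random


def pad_row(schema, rng):
    """Build one padding row using the fill rules from the `pad` table row."""
    row = {}
    for column, kind in schema.items():
        if kind == "object":
            row[column] = float("nan")        # object columns: NaN
        elif kind == "numeric":
            row[column] = 0                   # numeric columns: 0
        elif kind == "boolean":
            row[column] = False               # boolean/logic columns: False
        elif kind == "target":
            row[column] = rng.gauss(0.0, 1.0)  # target: mean 0, std 1
        else:
            raise ValueError(f"unknown column kind: {kind}")
    return row


def pad_series(rows, schema, min_length, rng):
    """Pad a series up to min_length rows (prepending is an assumption)."""
    missing = max(0, min_length - len(rows))
    return [pad_row(schema, rng) for _ in range(missing)] + list(rows)


def handle_short_series(series_by_id, schema, min_length, setting="auto", seed=0):
    """Apply the `auto`, `pad`, or `drop` rule to a dict of series."""
    rng = random.Random(seed)
    short_ids = {sid for sid, rows in series_by_id.items() if len(rows) < min_length}
    if setting == "auto":
        # auto: pad if every series is short, otherwise drop the short ones
        setting = "pad" if len(short_ids) == len(series_by_id) else "drop"
    if setting == "pad":
        return {sid: pad_series(rows, schema, min_length, rng)
                for sid, rows in series_by_id.items()}
    if setting == "drop":
        return {sid: rows for sid, rows in series_by_id.items()
                if sid not in short_ids}
    raise ValueError(f"unknown setting: {setting}")
```

For example, with `min_length=3`, a dataset holding one four-row series and one two-row series keeps only the four-row series under `auto`, while a dataset holding only the two-row series gets one padding row added to it.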
articles/machine-learning/how-to-automl-forecasting-faq.md
Lines changed: 4 additions & 4 deletions
@@ -51,10 +51,10 @@ There are four basic configurations supported by AutoML forecasting:
 |Configuration|Scenario|Pros|Cons|
 |--|--|--|--|
-|**Default AutoML**|Recommended if the dataset has a small number of time series that have roughly similar historic behavior.|<li> Simple to configure from code/SDK or AzureML Studio <br> <li> AutoML has the chance to cross-learn across different time series since the regression models pool all series together in training. See the [model grouping](./concept-automl-forecasting-methods.md#model-grouping) section for more information.|<li> Regression models may be less accurate if the time series in the training data have divergent behavior <br> <li> Time series models may take a long time to train if there are a large number of series in the training data. See the ["why is AutoML slow on my data"](#why-is-automl-slow-on-my-data) answer for more information.|
-|**AutoML with deep learning**|Recommended for datasets with more than 1000 observations and, potentially, numerous time series exhibiting complex patterns. When enabled, AutoML will sweep over temporal convolutional neural network (TCN) models during training. See the [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning) section for more information.|<li> Simple to configure from code/SDK or AzureML Studio <br> <li> Cross-learning opportunities since the TCN pools data over all series <br> <li> Potentially higher accuracy due to the large capacity of DNN models. See the [forecasting models in AutoML](./concept-automl-forecasting-methods.md#forecasting-models-in-automl) section for more information.|<li> Training can take much longer due to the complexity of DNN models <br> <li> Series with small amounts of history are unlikely to benefit from these models.|
-|**Many Models**|Recommended if you need to train and manage a large number of forecasting models in a scalable way. See the [forecasting at scale](./how-to-auto-train-forecast.md#forecasting-at-scale) section for more information.|<li> Scalable <br> <li> Potentially higher accuracy when time series have divergent behavior from one another.|<li> No cross-learning across time series <br> <li> You can't configure or launch Many Models jobs from AzureML Studio, only the code/SDK experience is currently available.|
-|**Hierarchical Time Series**|HTS is recommended if the series in your data have nested, hierarchical structure and you need to train or make forecasts at aggregated levels of the hierarchy. See the [hierarchical time series forecasting](how-to-auto-train-forecast.md#hierarchical-time-series-forecasting) section for more information.|<li> Training at aggregated levels can reduce noise in the leaf node time series and potentially lead to higher accuracy models. <br> <li> Forecasts can be retrieved for any level of the hierarchy by aggregating or dis-aggregating forecasts from the training level.|You need to provide the aggregation level for training. AutoML doesn't currently have an algorithm to find an optimal level.|
+|**Default AutoML**|Recommended if the dataset has a small number of time series that have roughly similar historic behavior.|- Simple to configure from code/SDK or AzureML Studio <br><br> - AutoML has the chance to cross-learn across different time series since the regression models pool all series together in training. See the [model grouping](./concept-automl-forecasting-methods.md#model-grouping) section for more information.|- Regression models may be less accurate if the time series in the training data have divergent behavior <br> <br> - Time series models may take a long time to train if there are a large number of series in the training data. See the ["why is AutoML slow on my data"](#why-is-automl-slow-on-my-data) answer for more information.|
+|**AutoML with deep learning**|Recommended for datasets with more than 1000 observations and, potentially, numerous time series exhibiting complex patterns. When enabled, AutoML will sweep over temporal convolutional neural network (TCN) models during training. See the [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning) section for more information.|- Simple to configure from code/SDK or AzureML Studio <br> <br> - Cross-learning opportunities since the TCN pools data over all series <br> <br> - Potentially higher accuracy due to the large capacity of DNN models. See the [forecasting models in AutoML](./concept-automl-forecasting-methods.md#forecasting-models-in-automl) section for more information.|- Training can take much longer due to the complexity of DNN models <br> <br> - Series with small amounts of history are unlikely to benefit from these models.|
+|**Many Models**|Recommended if you need to train and manage a large number of forecasting models in a scalable way. See the [forecasting at scale](./how-to-auto-train-forecast.md#forecasting-at-scale) section for more information.|- Scalable <br> <br> - Potentially higher accuracy when time series have divergent behavior from one another.|- No cross-learning across time series <br> <br> - You can't configure or launch Many Models jobs from AzureML Studio, only the code/SDK experience is currently available.|
+|**Hierarchical Time Series**|HTS is recommended if the series in your data have nested, hierarchical structure and you need to train or make forecasts at aggregated levels of the hierarchy. See the [hierarchical time series forecasting](how-to-auto-train-forecast.md#hierarchical-time-series-forecasting) section for more information.|- Training at aggregated levels can reduce noise in the leaf node time series and potentially lead to higher accuracy models. <br> <br> - Forecasts can be retrieved for any level of the hierarchy by aggregating or dis-aggregating forecasts from the training level.|- You need to provide the aggregation level for training. AutoML doesn't currently have an algorithm to find an optimal level.|

 > [!NOTE]
 > We recommend using compute nodes with GPUs when deep learning is enabled to best take advantage of high DNN capacity. Training time can be much faster in comparison to nodes with only CPUs. See the GPU optimized compute article for more information.
articles/machine-learning/v1/how-to-auto-train-forecast-v1.md
Lines changed: 2 additions & 2 deletions
@@ -277,8 +277,8 @@ The following table summarizes the available settings for `short_series_handling
 |Setting|Description
 |---|---
-|`auto`| The following is the default behavior for short series handling <li> *If all series are short*, pad the data. <br> <li> *If not all series are short*, drop the short series.
-|`pad`| If `short_series_handling_config = pad`, then automated ML adds random values to each short series found. The following lists the column types and what they are padded with: <li>Object columns with NaNs <li> Numeric columns with 0 <li> Boolean/logic columns with False <li> The target column is padded with random values with mean of zero and standard deviation of 1.
+|`auto`| The default value for short series handling. <br> - _If all series are short_, pad the data. <br> - _If not all series are short_, drop the short series.
+|`pad`| If `short_series_handling_config = pad`, then automated ML adds random values to each short series found. The following lists the column types and what they're padded with: <br> - Object columns with NaNs <br> - Numeric columns with 0 <br> - Boolean/logic columns with False <br> - The target column is padded with random values with mean of zero and standard deviation of 1.
 |`drop`| If `short_series_handling_config = drop`, then automated ML drops the short series, and it will not be used for training or prediction. Predictions for these series will return NaN's.