This article focuses on the deep learning methods for time series forecasting in AutoML. Instructions and examples for training forecasting models in AutoML can be found in our [set up AutoML for time series forecasting](./how-to-auto-train-forecast.md) article.

-Deep learning has made a major impact in fields ranging from [language modeling](../ai-services/openai/concepts/models.md) to [protein folding](https://www.deepmind.com/research/highlighted-research/alphafold), among many others. Time series forecasting has likewise benefitted from recent advances in deep learning technology. For example, deep neural network (DNN) models feature prominently in the top performing models from the [fourth](https://www.uber.com/blog/m4-forecasting-competition/) and [fifth](https://www.sciencedirect.com/science/article/pii/S0169207021001874) iterations of the high-profile Makridakis forecasting competition.
+Deep learning has numerous use cases in fields ranging from [language modeling](../ai-services/openai/concepts/models.md) to [protein folding](https://www.deepmind.com/research/highlighted-research/alphafold), among many others. Time series forecasting also benefits from recent advances in deep learning technology. For example, deep neural network (DNN) models feature prominently in the top performing models from the [fourth](https://www.uber.com/blog/m4-forecasting-competition/) and [fifth](https://www.sciencedirect.com/science/article/pii/S0169207021001874) iterations of the high-profile Makridakis forecasting competition.

-In this article, we'll describe the structure and operation of the TCNForecaster model in AutoML to help you best apply the model to your scenario.
+In this article, we describe the structure and operation of the TCNForecaster model in AutoML to help you best apply the model to your scenario.

## Introduction to TCNForecaster

-TCNForecaster is a [temporal convolutional network](https://arxiv.org/abs/1803.01271), or TCN, which has a DNN architecture specifically designed for time series data. The model uses historical data for a target quantity, along with related features, to make probabilistic forecasts of the target up to a specified forecast horizon. The following image shows the major components of the TCNForecaster architecture:
+TCNForecaster is a [temporal convolutional network](https://arxiv.org/abs/1803.01271), or TCN, which has a DNN architecture designed for time series data. The model uses historical data for a target quantity, along with related features, to make probabilistic forecasts of the target up to a specified forecast horizon. The following image shows the major components of the TCNForecaster architecture:

:::image type="content" source="media/how-to-auto-train-forecast/tcn-basic.png" alt-text="Diagram showing major components of AutoML's TCNForecaster.":::

TCNForecaster has the following main components:

-* A **pre-mix** layer that mixes the input time series and feature data into an array of signal **channels** that the convolutional stack will process.
+* A **pre-mix** layer that mixes the input time series and feature data into an array of signal **channels** that the convolutional stack processes.
* A stack of **dilated convolution** layers that processes the channel array sequentially; each layer in the stack processes the output of the previous layer to produce a new channel array. Each channel in this output contains a mixture of convolution-filtered signals from the input channels.
* A collection of **forecast head** units that coalesce the output signals from the convolution layers and generate forecasts of the target quantity from this latent representation. Each head unit produces forecasts up to the horizon for a quantile of the prediction distribution.

@@ -42,7 +42,7 @@ Stacking dilated convolutions gives the TCN the ability to model correlations ov
-The dashed lines show paths through the network that end on the output at a time $t$. These paths cover the last eight points in the input, illustrating that each output point is a function of the eight most relatively recent points in the input. The length of history, or "look back," that a convolutional network uses to make predictions is called the **receptive field** and it is determined completely by the TCN architecture.
+The dashed lines show paths through the network that end on the output at a time $t$. These paths cover the last eight points in the input, illustrating that each output point is a function of the eight most relatively recent points in the input. The length of history, or "look back," that a convolutional network uses to make predictions is called the **receptive field** and it's determined completely by the TCN architecture.
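
The receptive field described in this hunk can be checked with a short calculation. The following is a minimal sketch, not part of the changed file. It assumes a plain stack of causal dilated convolutions with one convolution per layer, a kernel size of 2, and dilations of 1, 2, and 4. Those values are assumptions chosen because they reproduce the eight-point look back mentioned in the text; the function names are hypothetical.

```python
def receptive_field(kernel_size: int, dilations: list[int]) -> int:
    """Receptive field of a stack of causal dilated convolutions.

    Assumes one convolution per layer; each layer extends the look back
    by (kernel_size - 1) * dilation input points.
    """
    return 1 + (kernel_size - 1) * sum(dilations)


def influencing_inputs(t: int, kernel_size: int, dilations: list[int]) -> set[int]:
    """Trace the paths from the output at time t back to the input indices."""
    indices = {t}
    # Walk from the top layer down to the input layer.
    for dilation in reversed(dilations):
        indices = {i - j * dilation for i in indices for j in range(kernel_size)}
    return indices


# Assumed configuration: kernel size 2, dilation doubling at each layer.
dilations = [1, 2, 4]
print(receptive_field(kernel_size=2, dilations=dilations))
print(sorted(influencing_inputs(t=10, kernel_size=2, dilations=dilations)))
```

Both functions depend only on the kernel size and the dilation schedule, which matches the statement that the receptive field is determined completely by the architecture.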

### TCNForecaster architecture

@@ -66,7 +66,7 @@ We can give a more precise definition of the TCNForecaster architecture in terms
-where $W_{e}$ is an [embedding](https://huggingface.co/blog/getting-started-with-embeddings) matrix for the categorical features, $n_{l} = n_{b}n_{c}$ is the total number of residual cells, the $H_{k}$ denote hidden layer outputs, and the $f_{q}$ are forecast outputs for given quantiles of the prediction distribution. To aid understanding, the dimensions of these variables are in the following table:
+Where $W_{e}$ is an [embedding](https://huggingface.co/blog/getting-started-with-embeddings) matrix for the categorical features, $n_{l} = n_{b}n_{c}$ is the total number of residual cells, the $H_{k}$ denote hidden layer outputs, and the $f_{q}$ are forecast outputs for given quantiles of the prediction distribution. To aid understanding, the dimensions of these variables are in the following table:

|Variable|Description|Dimensions|
|--|--|--|
@@ -80,7 +80,7 @@ In the table, $n_{\text{input}} = n_{\text{features}} + 1$, the number of predic
TCNForecaster is an optional model in AutoML. To learn how to use it, see [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning).

-In this section, we'll describe how AutoML builds TCNForecaster models with your data, including explanations of data preprocessing, training, and model search.
+In this section, we describe how AutoML builds TCNForecaster models with your data, including explanations of data preprocessing, training, and model search.

### Data preprocessing steps

@@ -94,7 +94,7 @@ Fill missing data|[Impute missing values and observation gaps](./concept-automl-
|Target transform|Optionally apply the natural logarithm function to the target depending on the results of certain statistical tests.|
|Normalization|[Z-score normalize](https://en.wikipedia.org/wiki/Standard_score) all numeric data; normalization is performed per feature and per time series group, as defined by the [time series ID columns](./how-to-auto-train-forecast.md#forecasting-job-settings).

-These steps are included in AutoML's transform pipelines, so they are automatically applied when needed at inference time. In some cases, the inverse operation to a step is included in the inference pipeline. For example, if AutoML applied a $\log$ transform to the target during training, the raw forecasts are exponentiated in the inference pipeline.
+These steps are included in AutoML's transform pipelines, so they're automatically applied when needed at inference time. In some cases, the inverse operation to a step is included in the inference pipeline. For example, if AutoML applied a $\log$ transform to the target during training, the raw forecasts are exponentiated in the inference pipeline.
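
To illustrate how a forward transform and its inverse pair up, here is a minimal sketch that is not part of the changed file. It applies an optional log transform followed by per-series z-score normalization, then inverts both steps on the way out. The function names, the `use_log` flag, and the sample data are all hypothetical; AutoML chooses the log transform from statistical tests, which this sketch doesn't attempt to reproduce.

```python
import math
from statistics import mean, pstdev


def fit_transform_params(values, use_log):
    """Fit the statistics for one time series group (hypothetical helper)."""
    transformed = [math.log(v) for v in values] if use_log else list(values)
    sigma = pstdev(transformed) or 1.0  # avoid dividing by zero on a constant series
    return {"use_log": use_log, "mean": mean(transformed), "std": sigma}


def forward(values, params):
    """Training-time direction: optional log, then z-score."""
    out = [math.log(v) for v in values] if params["use_log"] else list(values)
    return [(v - params["mean"]) / params["std"] for v in out]


def inverse(values, params):
    """Inference-time direction: undo the z-score, then exponentiate if needed."""
    out = [v * params["std"] + params["mean"] for v in values]
    return [math.exp(v) for v in out] if params["use_log"] else out


# Hypothetical data: one entry per time series ID, normalized independently.
series = {"store_a": [10.0, 12.0, 15.0, 11.0], "store_b": [200.0, 180.0, 260.0, 240.0]}
params = {sid: fit_transform_params(vals, use_log=True) for sid, vals in series.items()}
normalized = {sid: forward(vals, params[sid]) for sid, vals in series.items()}
restored = {sid: inverse(vals, params[sid]) for sid, vals in normalized.items()}
```

Keeping the fitted parameters per series ID is what lets the inference side map forecasts back to the original scale of each series.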

### Training

@@ -135,7 +135,7 @@ The model search has two phases:
1. AutoML performs a search over 12 "landmark" models. The landmark models are static and chosen to reasonably span the hyper-parameter space.
2. AutoML continues searching through the hyper-parameter space using a random search.

-The search terminates when stopping criteria are met. The stopping criteria depend on the [forecast training job configuration](./how-to-auto-train-forecast.md#configure-experiment), but some examples include time limits, limits on number of search trials to perform, and early stopping logic when the validation metric is not improving.
+The search terminates when stopping criteria are met. The stopping criteria depend on the [forecast training job configuration](./how-to-auto-train-forecast.md#configure-experiment), but some examples include time limits, limits on number of search trials to perform, and early stopping logic when the validation metric isn't improving.
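
The two-phase search and its stopping criteria can be sketched as a simple loop. This is an illustrative sketch only, not AutoML's implementation and not part of the changed file. The hyper-parameter names (`num_cells`, `learning_rate`), the toy `evaluate` function, the default limits, and the choice to apply the same stopping checks in both phases are all assumptions.

```python
import math
import random
import time


def search(landmarks, sample_config, evaluate,
           max_trials=30, time_limit_s=60.0, patience=10):
    """Try fixed landmark configs first, then random samples, until a stop rule fires."""
    start = time.monotonic()
    best_config, best_score = None, float("inf")
    since_improvement = 0
    history = []

    def candidates():
        yield from landmarks       # phase 1: static landmark models
        while True:                # phase 2: random search
            yield sample_config()

    for config in candidates():
        if len(history) >= max_trials:                 # trial limit
            break
        if time.monotonic() - start > time_limit_s:    # time limit
            break
        if since_improvement >= patience:              # early stopping
            break
        score = evaluate(config)   # lower is better, like a validation loss
        history.append((config, score))
        if score < best_score:
            best_config, best_score = config, score
            since_improvement = 0
        else:
            since_improvement += 1
    return best_config, best_score, history


# Hypothetical search space and a toy objective standing in for model training.
rng = random.Random(0)
landmarks = [{"num_cells": c, "learning_rate": lr}
             for c in (1, 2, 4, 8) for lr in (1e-4, 1e-3, 1e-2)]  # 12 landmarks


def sample_config():
    return {"num_cells": rng.choice([1, 2, 4, 8, 16]),
            "learning_rate": 10 ** rng.uniform(-5, -1)}


def evaluate(config):
    return (math.log10(config["learning_rate"]) + 3) ** 2 + abs(config["num_cells"] - 4)


best_config, best_score, history = search(landmarks, sample_config, evaluate)
```

In a real job, `evaluate` would train a model and return its validation metric, and the limits would come from the forecast training job configuration rather than function defaults.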