Commit ac790c4

Merge pull request #68 from TimShererWithAquent/us302403d
AI Freshness - Machine Learning how-to
2 parents d4c68af + 55894e3 commit ac790c4

Lines changed: 42 additions & 43 deletions
@@ -1,7 +1,7 @@
---
title: Inference and evaluation of forecasting models
titleSuffix: Azure Machine Learning
-description: Learn about different ways to inference and evaluate forecasting models
+description: Learn about different ways to inference and evaluate forecasting models using Azure Machine Learning.
services: machine-learning
author: ssalgadodev
ms.author: ssalgado
@@ -10,103 +10,102 @@ ms.service: azure-machine-learning
ms.subservice: automl
ms.topic: conceptual
ms.custom: automl, sdkv2
-ms.date: 08/21/2024
+ms.date: 09/05/2024
show_latex: true
+#customer intent: As a data scientist, I want to understand model inference and evaluation in forecasting tasks.
---

# Inference and evaluation of forecasting models (preview)

[!INCLUDE [machine-learning-preview-generic-disclaimer](./includes/machine-learning-preview-generic-disclaimer.md)]

-This article introduces concepts related to model inference and evaluation in forecasting tasks. Instructions and examples for training forecasting models in AutoML can be found in our [set up AutoML for time series forecasting](./how-to-auto-train-forecast.md) article.
+This article introduces concepts related to model inference and evaluation in forecasting tasks. For instructions and examples for training forecasting models in AutoML, see [Set up AutoML to train a time-series forecasting model with SDK and CLI](./how-to-auto-train-forecast.md).

-Once you've used AutoML to train and select a best model, the next step is to generate forecasts and then, if possible, to evaluate their accuracy on a test set held out from the training data. To see how to setup and run forecasting model evaluation in automated machine learning, see our guide on [inference and evaluation components](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).
+After you use AutoML to train and select a best model, the next step is to generate forecasts. Then, if possible, evaluate their accuracy on a test set held out from the training data. To see how to set up and run forecasting model evaluation in automated machine learning, see [Orchestrating training, inference, and evaluation](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).

## Inference scenarios

-In machine learning, inference is the process of generating model predictions for new data not used in training. There are multiple ways to generate predictions in forecasting due to the time dependence of the data. The simplest scenario is when the inference period immediately follows the training period and we generate predictions out to the forecast horizon. This scenario is illustrated in the following diagram:
+In machine learning, *inference* is the process of generating model predictions for new data not used in training. There are multiple ways to generate predictions in forecasting due to the time dependence of the data. The simplest scenario is when the inference period immediately follows the training period and you generate predictions out to the forecast horizon. The following diagram illustrates this scenario:

:::image type="content" source="media/concept-automl-forecasting-evaluation/forecast-diagram.png" alt-text="Diagram demonstrating a forecast immediately following the training period.":::

The diagram shows two important inference parameters:

-* The **context length**, or the amount of history that the model requires to make a forecast,
-* The **forecast horizon**, which is how far ahead in time the forecaster is trained to predict.
+- The *context length* is the amount of history that the model requires to make a forecast.
+- The *forecast horizon* is how far ahead in time the forecaster is trained to predict.

-Forecasting models usually use some historical information, the context, to make predictions ahead in time up to the forecast horizon. **When the context is part of the training data, AutoML saves what it needs to make forecasts**, so there is no need to explicitly provide it.
+Forecasting models usually use some historical information, the *context*, to make predictions ahead in time up to the forecast horizon. When the context is part of the training data, AutoML saves what it needs to make forecasts. There's no need to explicitly provide it.

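The forecast horizon, and the lags or rolling window that determine the context, are configured when you set up the training job. The following minimal sketch uses the Azure Machine Learning Python SDK v2 (`azure-ai-ml`) described in the linked how-to article; it's illustrative only, and the compute name, data asset path, and column names are placeholders that depend on your own data.

```python
from azure.ai.ml import Input, automl

# Illustrative sketch only: compute, data asset path, and column names are placeholders.
training_data = Input(type="mltable", path="azureml:energy-demand-train:1")

forecasting_job = automl.forecasting(
    compute="cpu-cluster",
    experiment_name="energy-demand-forecasting",
    training_data=training_data,
    target_column_name="demand",
)

# forecast_horizon sets how far ahead the model is trained to predict;
# target_lags and target_rolling_window_size determine the lookback features
# that define the context the model needs at inference time.
forecasting_job.set_forecast_settings(
    time_column_name="timestamp",
    forecast_horizon=14,
    target_lags=[1, 7],
    target_rolling_window_size=7,
)
```
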
-There are two other inference scenarios that are more complicated:
+There are two other inference scenarios that are more complicated:

-* Generating predictions farther into the future than the forecast horizon,
-* Getting predictions when there is a gap between the training and inference periods.
+- Generating predictions farther into the future than the forecast horizon
+- Getting predictions when there's a gap between the training and inference periods

-We review these cases in the following sub-sections.
+The following subsections review these cases.

-### Prediction past the forecast horizon: recursive forecasting
+### Predict past the forecast horizon: recursive forecasting

-When you need forecasts past the horizon, AutoML applies the model recursively over the inference period. This means that predictions from the model are _fed back as input_ in order to generate predictions for subsequent forecasting windows. The following diagram shows a simple example:
+When you need forecasts past the horizon, AutoML applies the model recursively over the inference period. Predictions from the model are *fed back as input* to generate predictions for subsequent forecasting windows. The following diagram shows a simple example:

:::image type="content" source="media/concept-automl-forecasting-evaluation/recursive-forecast-diagram.png" alt-text="Diagram demonstrating a recursive forecast on a test set.":::

-Here, we generate forecasts on a period three times the length of the horizon by using predictions from one window as the context for the next window.
+Here, AutoML generates forecasts over a period three times the length of the horizon. It uses predictions from one window as the context for the next window.

> [!WARNING]
-> Recursive forecasting compounds modeling errors, so predictions become less accurate the farther they are from the original forecast horizon. You may find a more accurate model by re-training with a longer horizon in this case.
+> Recursive forecasting compounds modeling errors. Predictions become less accurate the farther they are from the original forecast horizon. You might find a more accurate model by re-training with a longer horizon.

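Conceptually, the recursive scheme works like the following sketch. The `forecaster` object and its `forecast` method are placeholders for any trained model with a fixed horizon; AutoML performs this recursion for you, so the code only illustrates the idea.

```python
import numpy as np

def recursive_forecast(forecaster, context: np.ndarray, horizon: int, n_windows: int) -> np.ndarray:
    """Illustrative sketch: produce forecasts past the horizon by feeding each
    window's predictions back in as context for the next window."""
    predictions = []
    history = np.asarray(context, dtype=float)
    for _ in range(n_windows):
        window = forecaster.forecast(history, horizon)  # placeholder API
        predictions.append(window)
        # Feed the predicted window back in as context for the next window.
        history = np.concatenate([history, window])[-len(context):]
    return np.concatenate(predictions)
```
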
-### Prediction with a gap between training and inference periods
+### Predict with a gap between training and inference periods

-Suppose that you've trained a model in the past and you want to use it to make predictions from new observations that weren't yet available during training. In this case, there's a time gap between the training and inference periods:
+Suppose that after you train a model, you want to use it to make predictions from new observations that weren't yet available during training. In this case, there's a time gap between the training and inference periods:

:::image type="content" source="media/concept-automl-forecasting-evaluation/forecasting-with-gap-diagram.png" alt-text="Diagram demonstrating a forecast with a gap between the training and inference periods.":::

-AutoML supports this inference scenario, but **you need to provide the context data in the gap period**, as shown in the diagram. The prediction data passed to the [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines) needs values for features and observed target values in the gap and missing values or "NaN" values for the target in the inference period. The following table shows an example of this pattern:
-
+AutoML supports this inference scenario, but you need to provide the context data in the gap period, as shown in the diagram. The prediction data passed to the [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines) needs feature values and observed target values in the gap period, and missing or `NaN` values for the target in the inference period. The following table shows an example of this pattern:
+
:::image type="content" source="media/concept-automl-forecasting-evaluation/forecasting-with-gap-table.png" alt-text="Table showing an example of prediction data when there's a gap between the training and inference periods.":::

-Here, known values of the target and features are provided for 2023-05-01 through 2023-05-03. Missing target values starting at 2023-05-04 indicate that the inference period starts at that date.
+Known values of the target and features are provided for `2023-05-01` through `2023-05-03`. Missing target values starting at `2023-05-04` indicate that the inference period starts at that date.

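Assembled as a pandas DataFrame, prediction data for this example might look like the following sketch. The feature and target column names are placeholders; use the time, feature, and target columns from your own training data.

```python
import numpy as np
import pandas as pd

# Illustrative prediction data for the gap scenario: observed target values cover the
# gap period (2023-05-01 through 2023-05-03), and NaN targets mark the inference period.
prediction_data = pd.DataFrame(
    {
        "date": pd.date_range("2023-05-01", periods=6, freq="D"),
        "feature_1": [11.2, 10.8, 12.1, 11.5, 11.9, 12.3],
        "demand": [35.0, 37.0, 34.0, np.nan, np.nan, np.nan],
    }
)
```
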
-AutoML uses the new context data to update lag and other lookback features, and also to update models like ARIMA that keep an internal state. This operation _doesn't_ update or re-fit model parameters.
+AutoML uses the new context data to update lag and other lookback features, and also to update models like ARIMA that keep an internal state. This operation *doesn't* update or refit model parameters.

-## Model evaluation
-
-Evaluation is the process of generating predictions on a test set held-out from the training data and computing metrics from these predictions that guide model deployment decisions. Accordingly, there's an inference mode suited for model evaluation - a rolling forecast. We review it in the following subsection.
+## <a name="rolling-forecast"></a>Model evaluation

-### Rolling forecast
+*Evaluation* is the process of generating predictions on a test set held out from the training data and computing metrics from these predictions that guide model deployment decisions. Accordingly, there's an inference mode suited for model evaluation: a rolling forecast.

-A best practice procedure for evaluating a forecasting model is to roll the trained forecaster forward in time over the test set, averaging error metrics over several prediction windows. This procedure is sometimes called a **backtest**, depending on the context. Ideally, the test set for the evaluation is long relative to the model's forecast horizon. Estimates of forecasting error may otherwise be statistically noisy and, therefore, less reliable.
+A best practice procedure for evaluating a forecasting model is to roll the trained forecaster forward in time over the test set, averaging error metrics over several prediction windows. This procedure is sometimes called a *backtest*. Ideally, the test set for the evaluation is long relative to the model's forecast horizon. Estimates of forecasting error might otherwise be statistically noisy and, therefore, less reliable.

The following diagram shows a simple example with three forecasting windows:

:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-evaluation-diagram.png" alt-text="Diagram demonstrating a rolling forecast on a test set.":::

The diagram illustrates three rolling evaluation parameters:

-* The **context length**, or the amount of history that the model requires to make a forecast,
-* The **forecast horizon**, which is how far ahead in time the forecaster is trained to predict,
-* The **step size**, which is how far ahead in time the rolling window advances on each iteration on the test set.
+- The *context length* is the amount of history that the model requires to make a forecast.
+- The *forecast horizon* is how far ahead in time the forecaster is trained to predict.
+- The *step size* is how far ahead in time the rolling window advances on each iteration on the test set.

-Importantly, the context advances along with the forecasting window. This means that actual values from the test set are used to make forecasts when they fall within the current context window. The latest date of actual values used for a given forecast window is called the **origin time** of the window. The following table shows an example output from the three-window rolling forecast with a horizon of three days and a step size of one day:
+The context advances along with the forecasting window. Actual values from the test set are used to make forecasts when they fall within the current context window. The latest date of actual values used for a given forecast window is called the *origin time* of the window. The following table shows an example output from the three-window rolling forecast with a horizon of three days and a step size of one day:

-:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-evaluation-table.png" alt-text="Example output table from a rolling forecast.":::
+:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-evaluation-table.png" alt-text="Diagram shows example output table from a rolling forecast.":::

-With a table like this, we can visualize the forecasts vs. the actuals and compute desired evaluation metrics. AutoML pipelines can generate rolling forecasts on a test set with an [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).
+With a table like this, you can visualize the forecasts versus the actuals and compute desired evaluation metrics. AutoML pipelines can generate rolling forecasts on a test set with an [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).

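The following sketch illustrates how such a rolling (backtest) table could be assembled with a generic trained forecaster. The `forecaster.forecast` call and the column names are placeholders, not a specific AutoML API; in practice, the AutoML inference component produces this table for you.

```python
import pandas as pd

def rolling_forecast(forecaster, context_df: pd.DataFrame, test_df: pd.DataFrame,
                     horizon: int, step: int) -> pd.DataFrame:
    """Illustrative backtest: advance the forecast origin by `step` rows over the
    test set, forecasting `horizon` rows ahead from the actual values observed so far."""
    windows = []
    for origin in range(0, len(test_df) - horizon + 1, step):
        # Context = trailing training rows plus test-set actuals up to the origin.
        context = pd.concat([context_df, test_df.iloc[:origin]])
        window = test_df.iloc[origin:origin + horizon].copy()
        window["forecast"] = forecaster.forecast(context, horizon)  # placeholder API
        window["origin_time"] = context["timestamp"].iloc[-1]  # latest actual used as context
        windows.append(window)
    return pd.concat(windows, ignore_index=True)
```
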
> [!NOTE]
> When the test period is the same length as the forecast horizon, a rolling forecast gives a single window of forecasts up to the horizon.

## Evaluation metrics

-The choice of evaluation summary or metric is usually driven by the specific business scenario. Some common choices include the following:
+The specific business scenario usually drives the choice of evaluation summary or metric. Some common choices include the following examples:

-* Plots of observed target values vs. forecasted values to check that certain dynamics of the data are captured by the model,
-* MAPE (mean absolute percentage error) between actual and forecasted values,
-* RMSE (root mean squared error), possibly with a normalization, between actual and forecasted values,
-* MAE (mean absolute error), possibly with a normalization, between actual and forecasted values.
+- Plots of observed target values versus forecasted values to check that the model captures certain dynamics of the data
+- Mean absolute percentage error (MAPE) between actual and forecasted values
+- Root mean squared error (RMSE), possibly with a normalization, between actual and forecasted values
+- Mean absolute error (MAE), possibly with a normalization, between actual and forecasted values

-There are many other possibilities, depending on the business scenario. You may need to create your own post-processing utilities for computing evaluation metrics from inference results or rolling forecasts. For more information on metrics, see our [regression and forecasting metrics](how-to-understand-automated-ml.md#regressionforecasting-metrics) article section.
+There are many other possibilities, depending on the business scenario. You might need to create your own post-processing utilities for computing evaluation metrics from inference results or rolling forecasts. For more information on metrics, see [Regression/forecasting metrics](how-to-understand-automated-ml.md#regressionforecasting-metrics).

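For example, a small post-processing utility that computes MAPE, RMSE, and MAE from a table of actual and forecasted values might look like the following sketch; the column names are placeholders.

```python
import numpy as np
import pandas as pd

def forecast_metrics(results: pd.DataFrame, actual_col: str = "actual",
                     forecast_col: str = "forecast") -> dict:
    """Compute common forecasting error metrics from a table of actuals and forecasts."""
    actual = results[actual_col].to_numpy(dtype=float)
    forecast = results[forecast_col].to_numpy(dtype=float)
    error = actual - forecast
    return {
        "MAPE": float(np.mean(np.abs(error / actual)) * 100),
        "RMSE": float(np.sqrt(np.mean(error ** 2))),
        "MAE": float(np.mean(np.abs(error))),
    }
```
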
-## Next steps
+## Related content

-* Learn more about [how to set up AutoML to train a time-series forecasting model](./how-to-auto-train-forecast.md).
-* Learn about [how AutoML uses machine learning to build forecasting models](./concept-automl-forecasting-methods.md).
-* Read answers to [frequently asked questions](./how-to-automl-forecasting-faq.md) about forecasting in AutoML.
+- Learn more about [how to set up AutoML to train a time-series forecasting model](./how-to-auto-train-forecast.md).
+- Learn about [how AutoML uses machine learning to build forecasting models](./concept-automl-forecasting-methods.md).
+- Read answers to [frequently asked questions](./how-to-automl-forecasting-faq.md) about forecasting in AutoML.
