# Build and deploy forecasting models with Azure Machine Learning
@@ -31,7 +31,7 @@ Consult the [package reference documentation](https://aka.ms/aml-packages/foreca
 - An Azure Machine Learning Model Management account
 - Azure Machine Learning Workbench installed

-If these three are not yet created or installed, follow the [Azure Machine Learning Quickstart and Workbench installation](../service/quickstart-installation.md) article.
+If these three are not yet created or installed, follow the [Azure Machine Learning Quickstart and Workbench installation](../service/quickstart-installation.md) article.

 1. The Azure Machine Learning Package for Forecasting must be installed. Learn how to [install this package here](https://aka.ms/aml-packages/forecasting).

@@ -72,19 +72,20 @@ import pkg_resources
 from datetime import timedelta
 import matplotlib
 matplotlib.use('agg')
+%matplotlib inline
 from matplotlib import pyplot as plt

 from sklearn.linear_model import Lasso, ElasticNet
 from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
 from sklearn.neighbors import KNeighborsRegressor

 from ftk import TimeSeriesDataFrame, ForecastDataFrame, AzureMLForecastPipeline
-from ftk.tsutils import last_n_periods_split
+from ftk.ts_utils import last_n_periods_split

 from ftk.transforms import TimeSeriesImputer, TimeIndexFeaturizer, DropColumns
 from ftk.transforms.grain_index_featurizer import GrainIndexFeaturizer
-from ftk.models import Arima, SeasonalNaive, Naive, RegressionForecaster, ETS
-from ftk.models.forecasterunion import ForecasterUnion
+from ftk.models import Arima, SeasonalNaive, Naive, RegressionForecaster, ETS, BestOfForecaster
+from ftk.models.forecaster_union import ForecasterUnion
 from ftk.model_selection import TSGridSearchCV, RollingOriginValidator

 from azuremltkbase.deployment import AMLSettings
@@ -497,12 +498,11 @@ The [TimeSeriesDataFrame.ts_report](https://docs.microsoft.com/en-us/python/api/


 ```python
-%matplotlib inline
 whole_tsdf.ts_report()
 ```

 -------------------------------- Data Overview ---------------------------------
-Start by splitting the data into training set and a testing set with the [ftk.tsutils.last_n_periods_split](https://docs.microsoft.com/en-us/python/api/ftk.ts_utils?view=azure-ml-py-latest) utility function. The resulting testing set contains the last 40 observations of each time series.
+Start by splitting the data into training set and a testing set with the [last_n_periods_split](https://docs.microsoft.com/en-us/python/api/ftk.ts_utils?view=azure-ml-py-latest) utility function. The resulting testing set contains the last 40 observations of each time series.
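As a rough, library-independent illustration of what such a split does, the last `n` observations of each series can be held out as shown below. This is not the `last_n_periods_split` implementation; the helper name `split_last_n`, the tuple layout, and the toy data are assumptions made only for this sketch.

```python
from collections import defaultdict


def split_last_n(rows, n):
    """Hold out the last n observations of each series.

    rows: iterable of (grain, time, value) tuples, where `grain` identifies a series.
    Returns (train, test) lists, each ordered by grain and then by time.
    """
    by_grain = defaultdict(list)
    for grain, time, value in rows:
        by_grain[grain].append((time, value))

    train, test = [], []
    for grain in sorted(by_grain):
        series = sorted(by_grain[grain])   # order each series by time
        cut = max(len(series) - n, 0)      # a series shorter than n goes entirely to test
        train += [(grain, t, v) for t, v in series[:cut]]
        test += [(grain, t, v) for t, v in series[cut:]]
    return train, test


# Hypothetical toy data: two series ("A" and "B") with five weekly observations each.
rows = [(g, week, 10 * week) for g in ("A", "B") for week in range(1, 6)]
train, test = split_last_n(rows, n=2)
print(len(train), len(test))
```

Splitting per series rather than on a single global cutoff keeps the test horizon the same length for every series, even when series end at different dates.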
-You can see that most of the series (213 out of 249) are irregular. An [imputation transform](https://docs.microsoft.com/en-us/python/api/ftk.transforms.ts_imputer?view=azure-ml-py-latest) is required to fill in missing sales quantity values. While there are many imputation options, the following sample code uses a linear interpolation.
+You can see that most of the series (213 out of 249) are irregular. An [imputation transform](https://docs.microsoft.com/en-us/python/api/ftk.transforms.ts_imputer.timeseriesimputer?view=azure-ml-py-latest) is required to fill in missing sales quantity values. While there are many imputation options, the following sample code uses a linear interpolation.
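To make the idea of linear interpolation concrete, here is a minimal pure-Python sketch. It is not how `TimeSeriesImputer` is implemented; the function name `interpolate_linear` is hypothetical, and the sketch assumes the observations are evenly spaced in time with missing values marked as `None`.

```python
def interpolate_linear(values):
    """Fill None gaps by linear interpolation between the nearest known neighbours.

    Assumes evenly spaced time points. Leading and trailing gaps have a
    neighbour on only one side, so they are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical weekly sales quantities with three missing weeks.
weekly_quantity = [100.0, None, None, 130.0, None, 150.0]
print(interpolate_linear(weekly_quantity))
# → [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
```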
-The [ForecasterUnion](https://docs.microsoft.com/en-us/python/api/ftk.models.forecaster_union.forecasterunion?view=azure-ml-py-latest) estimator allows you to combine multiple estimators and fit/predict on them using one line of code.
+The [ForecasterUnion](https://docs.microsoft.com/en-us/python/api/ftk.models.forecaster_union?view=azure-ml-py-latest) estimator allows you to combine multiple estimators and fit/predict on them using one line of code.
 Some machine learning models were able to take advantage of the added features and the similarities between series to get better forecast accuracy.

-**Cross-Validation and Parameter Sweeping**
+### Cross Validation, Parameter, and Model Sweeping

-The package adapts some traditional machine learning functions for a forecasting application. [RollingOriginValidator](https://docs.microsoft.com/python/api/ftk.model_selection.cross_validation.rollingoriginvalidator) does cross-validation temporally, respecting what would and would not be known in a forecasting framework.
+The package adapts some traditional machine learning functions for a forecasting application. [RollingOriginValidator](https://docs.microsoft.com/python/api/ftk.model_selection.cross_validation.rollingoriginvalidator?view=azure-ml-py-latest) does cross-validation temporally, respecting what would and would not be known in a forecasting framework.

 In the figure below, each square represents data from one time point. The blue squares represent training and orange squares represent testing in each fold. Testing data must come from the time points after the largest training time point. Otherwise, future data is leaked into training data causing the model evaluation to become invalid.
 The [TSGridSearchCV](https://docs.microsoft.com/en-us/python/api/ftk.model_selection.search.tsgridsearchcv?view=azure-ml-py-latest) class exhaustively searches over specified parameter values and uses `RollingOriginValidator` to evaluate parameter performance in order to find the best parameters.
+
+
 ```python
 # Set up the `RollingOriginValidator` to do 2 folds of rolling origin cross-validation
+The `BestOfForecaster` class selects the model with the best performance from a list of given models. Similar to `TSGridSearchCV`, it also uses RollingOriginValidator for cross validation and performance evaluation.
+Here we pass a list of two models to demonstrate the usage of `BestOfForecaster`
 Now that you have identified the best model, you can build and fit your final pipeline with all transformers and the best model.

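The selection procedure described in this part of the change (rolling-origin folds, scoring on held-out future points, keeping the best-scoring model) can be sketched without the package. The code below is only an illustration under stated assumptions: it does not use or reproduce `RollingOriginValidator`, `TSGridSearchCV`, or `BestOfForecaster`, and every name in it (`rolling_origin_folds`, `median_ape`, `naive`, `seasonal_naive`, `best_of`) is hypothetical, as is the toy series.

```python
from statistics import median


def rolling_origin_folds(n_obs, n_folds, horizon):
    """Yield (train_indices, test_indices); each test window starts right after its training data."""
    for k in range(n_folds, 0, -1):
        origin = n_obs - k * horizon   # first index that is NOT used for training
        yield list(range(origin)), list(range(origin, origin + horizon))


def median_ape(actual, predicted):
    """Median absolute percentage error, in percent (assumes no actual value is zero)."""
    return median(abs(a - p) / abs(a) * 100 for a, p in zip(actual, predicted))


def naive(train, horizon):
    """Forecast by repeating the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon, season=4):
    """Forecast by repeating the values observed one season earlier."""
    return [train[len(train) - season + (h % season)] for h in range(horizon)]


def best_of(models, series, n_folds=2, horizon=4):
    """Return (name of the model with the lowest mean MedianAPE across folds, all scores)."""
    scores = {}
    for name, forecast in models.items():
        fold_errors = []
        for train_idx, test_idx in rolling_origin_folds(len(series), n_folds, horizon):
            train = [series[i] for i in train_idx]
            actual = [series[i] for i in test_idx]
            fold_errors.append(median_ape(actual, forecast(train, horizon)))
        scores[name] = sum(fold_errors) / len(fold_errors)
    return min(scores, key=scores.get), scores


# Toy series with a period-4 seasonal pattern.
series = [10, 20, 30, 40] * 4
best, scores = best_of({"naive": naive, "seasonal_naive": seasonal_naive}, series)
print(best, scores)
```

Because every test window lies strictly after its training window, no future observation leaks into training, which is the property the rolling-origin scheme is meant to guarantee.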
@@ -1411,9 +1504,62 @@ print('Median of APE of final pipeline: {0}'.format(final_median_ape))
 Median of APE of final pipeline: 42.54336821266968


-## Operationalization: deploy and consume
+## Visualization
+The `ForecastDataFrame` class provides plotting functions for visualizing and analyzing forecasting results. Use the commonly used charts with your data. Please see the sample notebook below on plotting functions for all the functions available.
+
+The `show_error` function plots performance metrics aggregated by an arbitrary column. By default, the `show_error` function aggregates by the `grain_colnames` of the `ForecastDataFrame`. It's often useful to identify the grains/groups with the best or worst performance, especially when you have a large number of time series. The `performance_percent` argument of `show_error` allows you to specify a performance interval and plot the error of a subset of grains/groups.
+
+Plot the grains with the bottom 5% performance, i.e. top 5% MedianAPE
+Once you have an idea of the overall performance, you may want to explore individual grains, especially those that performed poorly. The `plot_forecast_by_grain` method plots forecast vs. actual of specified grains. Here, we plot the grain with the best performance and the grain with the worst performance discovered in the `show_error` plot.
+For a deeper dive on the major features of AMLPF, please refer to the following notebooks with more details and examples of each feature:
+[Notebook on TimeSeriesDataFrame](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/Introduction_to_TimeSeriesDataFrames.ipynb)
+[Notebook on Data Wrangling](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/Data_Wrangling_Sample.ipynb)
+[Notebook on Transformers](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/Forecast_Package_Transforms.ipynb)
+[Notebook on Models](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/AMLPF_models_sample_notebook.ipynb)
+[Notebook on Cross Validation](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/Time_Series_Cross_Validation.ipynb)
+[Notebook on Lag Transformer and OriginTime](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/Constructing_Lags_and_Explaining_Origin_Times.ipynb)
+[Notebook on Plotting Functions](https://azuremlftkrelease.blob.core.windows.net/samples/feature_notebooks/Plotting_Functions_in_AMLPF.ipynb)
+
+## Operationalization

-In this section, you deploy a pipeline as an Azure Machine Learning web service and consume it for training and scoring. Scoring the deployed web service retrains the model and generates forecasts on new data.
+In this section, you deploy a pipeline as an Azure Machine Learning web service and consume it for training and scoring.
+Currently, only pipelines that are not fitted are supported for deployment. Scoring the deployed web service retrains the model and generates forecasts on new data.