
Commit fb4ea5a

Merge pull request #204 from zenml-io/project/forecastreader
Forecast reader
2 parents c7e3927 + 827ff2f commit fb4ea5a

25 files changed, +3778 -1 lines changed

.typos.toml

Lines changed: 6 additions & 0 deletions
@@ -50,6 +50,12 @@ preprocessor = "preprocessor"
 logits = "logits"
 analyse = "analyse"
 Labour = "Labour"
+# Forecasting and statistical terms
+MAPE = "MAPE"
+mape = "mape"
+yhat = "yhat"
+yhat_lower = "yhat_lower"
+yhat_upper = "yhat_upper"

 [default]
 locale = "en-us"

README.md

Lines changed: 1 addition & 0 deletions
@@ -71,6 +71,7 @@ etc.
 | [Huggingface to Sagemaker](huggingface-sagemaker) | 🚀 MLOps | 🔄 CI/CD, 📦 Deployment | mlflow, sagemaker, kubeflow |
 | [Databricks Production QA](databricks-production-qa-demo) | 🚀 MLOps | 📊 Monitoring, 🔍 Quality Assurance | databricks, evidently, shap |
 | [Eurorate Predictor](eurorate-predictor) | 📊 Data | ⏱️ Time Series, 🔄 ETL | airflow, bigquery, xgboost |
+| [RetailForecast](retail-forecast) | 📊 Data | ⏱️ Time Series, 📈 Forecasting, 🔮 Multi-Model | prophet, zenml, pandas |

 # 💻 System Requirements

Lines changed: 42 additions & 0 deletions
# Sandbox base image
FROM zenmldocker/zenml-sandbox:latest

# Install uv from official distroless image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Set uv environment variables for optimization
ENV UV_SYSTEM_PYTHON=1
ENV UV_COMPILE_BYTECODE=1

# Project metadata
LABEL project_name="retail-forecast"
LABEL project_version="0.1.0"

# Install dependencies with uv and cache optimization
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --system \
    "zenml>=0.82.0" \
    "numpy>=1.20.0" \
    "pandas>=1.3.0" \
    "matplotlib>=3.5.0" \
    "prophet>=1.1.0" \
    "typing_extensions>=4.0.0" \
    "pyarrow" \
    "fastparquet" \
    "plotly" \
    "notebook"

# Set workspace directory
WORKDIR /workspace

# Clone only the project directory and reorganize
RUN git clone --depth 1 https://github.com/zenml-io/zenml-projects.git /tmp/zenml-projects && \
    cp -r /tmp/zenml-projects/retail-forecast/* /workspace/ && \
    rm -rf /tmp/zenml-projects

# VSCode settings
RUN mkdir -p /workspace/.vscode && \
    printf '{\n "workbench.colorTheme": "Default Dark Modern"\n}' > /workspace/.vscode/settings.json

# Create assets directory for visualizations
RUN mkdir -p /workspace/assets

retail-forecast/README.md

Lines changed: 207 additions & 0 deletions
# RetailForecast: Production-Ready Sales Forecasting with ZenML and Prophet

A robust MLOps pipeline for retail sales forecasting designed for retail data scientists and ML engineers.

## 📊 Business Context

In retail, accurate demand forecasting is critical for optimizing inventory, staff scheduling, and financial planning. This project provides a production-ready sales forecasting solution that can be immediately deployed in retail environments to:

- Predict future sales volumes across multiple stores and products
- Capture seasonal patterns and trends in customer purchasing behavior
- Support data-driven inventory management and purchasing decisions
- Provide actionable insights through visual forecasting dashboards

<div align="center">
<br/>
<img alt="Forecast Dashboard" src="assets/forecast_dashboard.png" width="70%">
<br/>
<p><em>HTML dashboard visualization showing forecasts with uncertainty intervals</em></p>
</div>

## 🔍 Data Overview

The pipeline works with time-series retail sales data structured as follows:

| Field | Description |
|-------|-------------|
| date | Date of sales record (YYYY-MM-DD) |
| store | Store identifier (e.g., Store_1, Store_2) |
| item | Product identifier (e.g., Item_A, Item_B) |
| sales | Number of units sold |
| price | Unit price |

The system automatically handles:

- Multiple store/item combinations as separate time series
- Train/test splitting for model validation
- Proper data transformations required by Prophet
- Missing value imputation and outlier detection
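
The per-series reshaping and train/test split described above follow Prophet's standard convention of a `ds` (datestamp) column and a `y` (target) column. A minimal sketch of that transformation, assuming a pandas DataFrame with the columns from the table (names and defaults here are illustrative, not the project's actual code):

```python
import pandas as pd


def to_prophet_series(sales: pd.DataFrame, test_size: float = 0.2):
    """Reshape raw sales data into per-series Prophet frames with a train/test split.

    Illustrative sketch only: one (train, test) pair per store-item combination,
    with naive forward-fill imputation standing in for the missing-value handling.
    """
    series = {}
    for (store, item), group in sales.groupby(["store", "item"]):
        df = (
            group.rename(columns={"date": "ds", "sales": "y"})
            .loc[:, ["ds", "y", "price"]]
            .sort_values("ds")
        )
        df["y"] = df["y"].ffill()  # simple placeholder for imputation/outlier handling
        cut = int(len(df) * (1 - test_size))
        series[f"{store}-{item}"] = (df.iloc[:cut], df.iloc[cut:])
    return series
```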
<div align="center">
<br/>
<img alt="Data Visualization" src="assets/data_visualization.gif" width="70%">
<br/>
<p><em>Interactive visualization of historical sales patterns</em></p>
</div>

## 🚀 Pipeline Architecture

The project includes two primary pipelines:

### 1. Training Pipeline

The training pipeline performs the following steps:

1. **Data Loading**: Imports historical sales data from CSV files
2. **Data Preprocessing**:
   - Transforms data into Prophet-compatible format
   - Creates separate time series for each store-item combination
   - Performs train/test splitting based on a configurable ratio
3. **Model Training**:
   - Trains multiple Facebook Prophet models simultaneously, one for each store-item combination
   - Configures seasonality parameters based on domain knowledge
   - Handles price changes as regressors when available
4. **Model Evaluation**:
   - Calculates MAPE, RMSE, and MAE metrics on test data
   - Generates visual diagnostics for model performance
5. **Forecasting**:
   - Produces forecasts with uncertainty intervals
   - Creates interactive HTML visualizations

<div align="center">
<br/>
<img alt="Training Pipeline DAG" src="assets/training_pipeline.png" width="70%">
<br/>
<p><em>ZenML visualization of the training pipeline DAG</em></p>
</div>
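
A rough sketch of how these steps might be wired together with ZenML's `@step` and `@pipeline` decorators follows. Step names, default values, and the data path are illustrative assumptions, not the project's actual `run.py`:

```python
from typing import Dict, Tuple

import pandas as pd
from prophet import Prophet
from zenml import pipeline, step


@step
def load_data() -> pd.DataFrame:
    # Illustrative path to the historical sales CSV.
    return pd.read_csv("data/sales.csv", parse_dates=["date"])


@step
def preprocess(
    sales: pd.DataFrame, test_size: float = 0.2
) -> Tuple[Dict[str, pd.DataFrame], Dict[str, pd.DataFrame]]:
    # One Prophet-format (ds/y) series per store-item combination, split into train/test.
    train, test = {}, {}
    for (store, item), g in sales.groupby(["store", "item"]):
        df = g.rename(columns={"date": "ds", "sales": "y"}).sort_values("ds")
        cut = int(len(df) * (1 - test_size))
        train[f"{store}-{item}"], test[f"{store}-{item}"] = df.iloc[:cut], df.iloc[cut:]
    return train, test


@step
def train_models(train: Dict[str, pd.DataFrame]) -> Dict[str, Prophet]:
    # One Prophet model per series; Prophet.fit returns the fitted model itself.
    return {
        key: Prophet(weekly_seasonality=True, yearly_seasonality=True).fit(df[["ds", "y"]])
        for key, df in train.items()
    }


@step
def evaluate(models: Dict[str, Prophet], test: Dict[str, pd.DataFrame]) -> Dict[str, float]:
    # MAPE per series on the held-out window; RMSE and MAE would follow the same pattern.
    scores = {}
    for key, model in models.items():
        pred = model.predict(test[key][["ds"]])
        actual = test[key]["y"].to_numpy()
        scores[key] = float((abs(actual - pred["yhat"].to_numpy()) / actual).mean())
    return scores


@pipeline
def training_pipeline():
    sales = load_data()
    train, test = preprocess(sales)
    models = train_models(train)
    evaluate(models, test)


if __name__ == "__main__":
    training_pipeline()
```

Storing the fitted models as step outputs is what makes the custom Prophet materializer (described under Technical Implementation below) necessary.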
### 2. Inference Pipeline

The inference pipeline enables fast forecasting with pre-trained models:

1. **Data Loading**: Imports the most recent sales data
2. **Data Preprocessing**: Transforms data into Prophet format
3. **Forecasting**: Generates predictions using production models
4. **Visualization**: Creates interactive dashboards with forecasts

<div align="center">
<br/>
<img alt="Inference Pipeline DAG" src="assets/inference_pipeline.png" width="70%">
<br/>
<p><em>ZenML visualization of the inference pipeline DAG</em></p>
</div>
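
At its core, the forecasting step uses each production model the way any Prophet model is used. A hedged sketch for a single series, assuming the model was serialized with Prophet's JSON helpers (the file path is illustrative):

```python
from prophet import Prophet
from prophet.serialize import model_from_json

# Load one previously trained store-item model (illustrative path).
with open("models/Store_1-Item_A.json") as f:
    model: Prophet = model_from_json(f.read())

# Forecast the next 30 days; yhat_lower / yhat_upper carry the uncertainty interval.
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```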
## 📈 Model Details

The forecasting solution uses [Facebook Prophet](https://github.com/facebook/prophet), chosen specifically for its combination of accuracy and simplicity in retail forecasting scenarios:

- **Multiple Models Approach**: Rather than a one-size-fits-all model, we generate individual Prophet models for each store-item combination, allowing forecasts that capture the unique patterns of each product in each location
- **Components**: Prophet automatically decomposes time series into trend, seasonality, and holidays
- **Seasonality**: Captures weekly, monthly, and yearly patterns in sales data
- **Special Events**: Handles holidays and promotions as custom seasonality effects
- **Uncertainty Estimation**: Provides prediction intervals for better inventory planning
- **Extensibility**: Supports additional regressors like price and marketing spend

Prophet was selected for this solution because it excels at:

- Handling missing data and outliers common in retail sales data
- Automatically detecting seasonal patterns without extensive feature engineering
- Providing intuitive parameters that business users can understand
- Scaling to thousands of individual time series efficiently
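
To make those features concrete, a single series' model might be configured along these lines (the values are examples, not the project's tuned settings):

```python
import pandas as pd
from prophet import Prophet

# Promotions modeled as custom "holiday" events (example dates only).
promos = pd.DataFrame({
    "holiday": "promo",
    "ds": pd.to_datetime(["2024-01-10", "2024-02-10", "2024-03-10"]),
    "lower_window": 0,
    "upper_window": 10,
})

model = Prophet(
    weekly_seasonality=True,
    yearly_seasonality=True,
    holidays=promos,
    interval_width=0.90,  # width of the yhat_lower / yhat_upper band
)
model.add_seasonality(name="monthly", period=30.5, fourier_order=5)
model.add_regressor("price")  # extra regressor; training data then needs a 'price' column
# model.fit(train_df)          # train_df: columns 'ds', 'y', 'price'
```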
## 💻 Technical Implementation

The project leverages ZenML's MLOps framework to provide:

- **Model Versioning**: Track all model versions and their performance metrics
- **Reproducibility**: All experiments are fully reproducible with tracked parameters
- **Pipeline Caching**: Speed up experimentation with intelligent caching of pipeline steps
- **Artifact Tracking**: All data and models are properly versioned and stored
- **Deployment Ready**: Models can be directly deployed to production environments

A key innovation in this project is the custom ProphetMaterializer that enables serialization/deserialization of Prophet models for ZenML artifact storage.
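
The materializer itself is not shown in this README; conceptually it pairs ZenML's `BaseMaterializer` interface with Prophet's JSON serialization helpers. A minimal sketch of what such a class might look like (the project's actual implementation may differ):

```python
import os
from typing import Type

from prophet import Prophet
from prophet.serialize import model_from_json, model_to_json
from zenml.enums import ArtifactType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer


class ProphetMaterializer(BaseMaterializer):
    """Stores a fitted Prophet model as JSON in the ZenML artifact store (sketch)."""

    ASSOCIATED_TYPES = (Prophet,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.MODEL

    def save(self, model: Prophet) -> None:
        with fileio.open(os.path.join(self.uri, "model.json"), "w") as f:
            f.write(model_to_json(model))

    def load(self, data_type: Type[Prophet]) -> Prophet:
        with fileio.open(os.path.join(self.uri, "model.json"), "r") as f:
            return model_from_json(f.read())
```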
<div align="center">
<br/>
<img alt="ZenML Dashboard" src="assets/zenml_dashboard.png" width="70%">
<br/>
<p><em>ZenML model registry tracking model versions and performance</em></p>
</div>

## 🛠️ Getting Started

### Prerequisites

- Python 3.9+
- ZenML installed and configured

### Installation

```bash
# Clone the repository
git clone https://github.com/zenml-io/zenml-projects.git
cd zenml-projects/retail-forecast

# Install dependencies
pip install -r requirements.txt

# Initialize ZenML (if needed)
zenml init
```

### Running the Pipelines

To train models and generate forecasts:

```bash
# Run the training pipeline (default)
python run.py

# Run with custom parameters
python run.py --forecast-periods 60 --test-size 0.3 --weekly-seasonality True
```

To make predictions using existing models:

```bash
# Run the inference pipeline
python run.py --inference
```

### Viewing Results

Start the ZenML dashboard:

```bash
zenml login
```

Navigate to the dashboard to explore:

- Pipeline runs and their status
- Model performance metrics
- Interactive forecast visualizations
- Version history of all models

## 🔄 Integration with Retail Systems

This solution can be integrated with existing retail systems:

- **Inventory Management**: Connect forecasts to automatic reordering systems
- **ERP Systems**: Feed forecasts into financial planning modules
- **BI Dashboards**: Export forecasts to Tableau, Power BI, or similar tools
- **Supply Chain**: Share forecasts with suppliers via API endpoints

## 📊 Example Use Case: Store-Level Demand Planning

A retail chain with 50 stores and 500 products uses this pipeline to:

1. Train models on 2 years of historical sales data
2. Generate daily forecasts for the next 30 days for each store-item combination
3. Aggregate forecasts to support central purchasing decisions
4. Update models weekly with new sales data

The result: 15% reduction in stockouts and 20% decrease in excess inventory.
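
Aggregating the per-series forecasts for central purchasing (step 3 above) is then a simple roll-up. A hedged sketch, reusing the store-item keys and the `yhat` column from the examples above:

```python
from typing import Dict

import pandas as pd


def aggregate_forecasts(forecasts: Dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Roll per-series Prophet forecasts up to chain-level expected demand.

    `forecasts` maps "Store_X-Item_Y" keys to Prophet forecast frames with
    'ds' and 'yhat' columns. Illustrative only.
    """
    frames = []
    for key, fc in forecasts.items():
        store, item = key.split("-", 1)
        frames.append(fc[["ds", "yhat"]].assign(store=store, item=item))
    return (
        pd.concat(frames)
        .groupby(["item", "ds"], as_index=False)["yhat"]
        .sum()
        .rename(columns={"yhat": "expected_units"})
    )
```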
## 📄 License

This project is licensed under the Apache License 2.0.
5 binary files added (6.53 MB, 185 KB, 99 KB, 97.3 KB, 152 KB); previews are not rendered in this view.

retail-forecast/data/calendar.csv

Lines changed: 91 additions & 0 deletions
date,weekday,month,is_weekend,is_holiday,is_promo
2024-01-01,0,1,0,1,0
2024-01-02,1,1,0,0,0
2024-01-03,2,1,0,0,0
2024-01-04,3,1,0,0,0
2024-01-05,4,1,0,0,0
2024-01-06,5,1,1,0,0
2024-01-07,6,1,1,0,0
2024-01-08,0,1,0,0,0
2024-01-09,1,1,0,0,0
2024-01-10,2,1,0,0,1
2024-01-11,3,1,0,0,1
2024-01-12,4,1,0,0,1
2024-01-13,5,1,1,0,1
2024-01-14,6,1,1,0,1
2024-01-15,0,1,0,1,1
2024-01-16,1,1,0,0,1
2024-01-17,2,1,0,0,1
2024-01-18,3,1,0,0,1
2024-01-19,4,1,0,0,1
2024-01-20,5,1,1,0,1
2024-01-21,6,1,1,0,0
2024-01-22,0,1,0,0,0
2024-01-23,1,1,0,0,0
2024-01-24,2,1,0,0,0
2024-01-25,3,1,0,0,0
2024-01-26,4,1,0,0,0
2024-01-27,5,1,1,0,0
2024-01-28,6,1,1,0,0
2024-01-29,0,1,0,0,0
2024-01-30,1,1,0,0,0
2024-01-31,2,1,0,0,0
2024-02-01,3,2,0,1,0
2024-02-02,4,2,0,0,0
2024-02-03,5,2,1,0,0
2024-02-04,6,2,1,0,0
2024-02-05,0,2,0,0,0
2024-02-06,1,2,0,0,0
2024-02-07,2,2,0,0,0
2024-02-08,3,2,0,0,0
2024-02-09,4,2,0,0,0
2024-02-10,5,2,1,0,1
2024-02-11,6,2,1,0,1
2024-02-12,0,2,0,0,1
2024-02-13,1,2,0,0,1
2024-02-14,2,2,0,0,1
2024-02-15,3,2,0,1,1
2024-02-16,4,2,0,0,1
2024-02-17,5,2,1,0,1
2024-02-18,6,2,1,0,1
2024-02-19,0,2,0,0,1
2024-02-20,1,2,0,0,1
2024-02-21,2,2,0,0,0
2024-02-22,3,2,0,0,0
2024-02-23,4,2,0,0,0
2024-02-24,5,2,1,0,0
2024-02-25,6,2,1,0,0
2024-02-26,0,2,0,0,0
2024-02-27,1,2,0,0,0
2024-02-28,2,2,0,0,0
2024-02-29,3,2,0,0,0
2024-03-01,4,3,0,1,0
2024-03-02,5,3,1,0,0
2024-03-03,6,3,1,0,0
2024-03-04,0,3,0,0,0
2024-03-05,1,3,0,0,0
2024-03-06,2,3,0,0,0
2024-03-07,3,3,0,0,0
2024-03-08,4,3,0,0,0
2024-03-09,5,3,1,0,0
2024-03-10,6,3,1,0,1
2024-03-11,0,3,0,0,1
2024-03-12,1,3,0,0,1
2024-03-13,2,3,0,0,1
2024-03-14,3,3,0,0,1
2024-03-15,4,3,0,1,1
2024-03-16,5,3,1,0,1
2024-03-17,6,3,1,0,1
2024-03-18,0,3,0,0,1
2024-03-19,1,3,0,0,1
2024-03-20,2,3,0,0,1
2024-03-21,3,3,0,0,0
2024-03-22,4,3,0,0,0
2024-03-23,5,3,1,0,0
2024-03-24,6,3,1,0,0
2024-03-25,0,3,0,0,0
2024-03-26,1,3,0,0,0
2024-03-27,2,3,0,0,0
2024-03-28,3,3,0,0,0
2024-03-29,4,3,0,0,0
2024-03-30,5,3,1,0,0
