Merged
Commits (25)
ed3c3e1
First commit
htahir1 May 13, 2025
5754185
First commit
htahir1 May 13, 2025
3e4c1de
First commit
htahir1 May 13, 2025
1befbeb
Update model evaluation and forecasting outputs
htahir1 May 13, 2025
a753ad6
Changeds
htahir1 May 14, 2025
c11786c
Prophet
htahir1 May 15, 2025
04dd143
Update retail forecast pipelines and steps
htahir1 May 15, 2025
f9e12fe
Run retail forecasting inference pipeline with trained models
htahir1 May 15, 2025
4a4efbb
Remove training-specific parameters in run.py
htahir1 May 15, 2025
470a961
Add RetailForecast production-ready sales forecasting pipeline
htahir1 May 15, 2025
e71137f
added data
htahir1 May 15, 2025
09fb8dc
Update ZenML version to allow minor upgrades
htahir1 May 15, 2025
70894ef
Screenshots
htahir1 May 15, 2025
b4c3eec
Create data visualization step for retail sales analysis
htahir1 May 15, 2025
686093e
Refactor data visualization functions and pipelines
htahir1 May 15, 2025
08fbc1d
Reorganize imports and update docstrings
htahir1 May 15, 2025
cf36f3c
Add forecasting and statistical terms to .typos.toml
htahir1 May 15, 2025
70238b4
Update README with Inference pipeline steps
htahir1 May 15, 2025
a6c79d0
Add Dockerfile for retail-forecast project
htahir1 May 15, 2025
fa12c8e
Update retail-forecast/README.md
htahir1 May 15, 2025
53e64ea
Update retail-forecast/README.md
htahir1 May 15, 2025
042d9b8
Add ProphetMaterializer for model storage
htahir1 May 15, 2025
d919c4e
Remove unused options from main function
htahir1 May 15, 2025
515504c
Add custom ProphetMaterializer for model storage
htahir1 May 15, 2025
827ff2f
Update retail-forecast/README.md
htahir1 May 15, 2025
6 changes: 6 additions & 0 deletions .typos.toml
@@ -50,6 +50,12 @@ preprocessor = "preprocessor"
logits = "logits"
analyse = "analyse"
Labour = "Labour"
# Forecasting and statistical terms
MAPE = "MAPE"
mape = "mape"
yhat = "yhat"
yhat_lower = "yhat_lower"
yhat_upper = "yhat_upper"

[default]
locale = "en-us"
1 change: 1 addition & 0 deletions README.md
@@ -71,6 +71,7 @@ etc.
| [Huggingface to Sagemaker](huggingface-sagemaker) | 🚀 MLOps | 🔄 CI/CD, 📦 Deployment | mlflow, sagemaker, kubeflow |
| [Databricks Production QA](databricks-production-qa-demo) | 🚀 MLOps | 📊 Monitoring, 🔍 Quality Assurance | databricks, evidently, shap |
| [Eurorate Predictor](eurorate-predictor) | 📊 Data | ⏱️ Time Series, 🔄 ETL | airflow, bigquery, xgboost |
| [RetailForecast](retail-forecast) | 📊 Data | ⏱️ Time Series, 📈 Forecasting, 🔮 Multi-Model | prophet, zenml, pandas |

# 💻 System Requirements

42 changes: 42 additions & 0 deletions retail-forecast/Dockerfile.codespace
@@ -0,0 +1,42 @@
# Sandbox base image
FROM zenmldocker/zenml-sandbox:latest

# Install uv from official distroless image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Set uv environment variables for optimization
ENV UV_SYSTEM_PYTHON=1
ENV UV_COMPILE_BYTECODE=1

# Project metadata
LABEL project_name="retail-forecast"
LABEL project_version="0.1.0"

# Install dependencies with uv and cache optimization
RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install --system \
"zenml>=0.82.0" \
"numpy>=1.20.0" \
"pandas>=1.3.0" \
"matplotlib>=3.5.0" \
"prophet>=1.1.0" \
"typing_extensions>=4.0.0" \
"pyarrow" \
"fastparquet" \
"plotly" \
"notebook"

# Set workspace directory
WORKDIR /workspace

# Clone only the project directory and reorganize
RUN git clone --depth 1 https://github.com/zenml-io/zenml-projects.git /tmp/zenml-projects && \
cp -r /tmp/zenml-projects/retail-forecast/* /workspace/ && \
rm -rf /tmp/zenml-projects

# VSCode settings
RUN mkdir -p /workspace/.vscode && \
printf '{\n "workbench.colorTheme": "Default Dark Modern"\n}' > /workspace/.vscode/settings.json

# Create assets directory for visualizations
RUN mkdir -p /workspace/assets
210 changes: 210 additions & 0 deletions retail-forecast/README.md
@@ -0,0 +1,210 @@
# RetailForecast: Production-Ready Sales Forecasting with ZenML and Prophet

A robust MLOps pipeline for retail sales forecasting, built for data scientists and ML engineers working with retail data.

## 📊 Business Context

In retail, accurate demand forecasting is critical for optimizing inventory, staff scheduling, and financial planning. This project provides a production-ready sales forecasting solution that can be immediately deployed in retail environments to:

- Predict future sales volumes across multiple stores and products
- Capture seasonal patterns and trends in customer purchasing behavior
- Support data-driven inventory management and purchasing decisions
- Provide actionable insights through visual forecasting dashboards

<div align="center">
<br/>
<img alt="Forecast Dashboard" src="assets/forecast_dashboard.png" width="70%">
<br/>
<p><em>HTML dashboard visualization showing forecasts with uncertainty intervals</em></p>
</div>

## 🔍 Data Overview

The pipeline works with time-series retail sales data structured as follows:

| Field | Description |
|-------|-------------|
| date | Date of sales record (YYYY-MM-DD) |
| store | Store identifier (e.g., Store_1, Store_2) |
| item | Product identifier (e.g., Item_A, Item_B) |
| sales | Number of units sold |
| price | Unit price |

The system automatically handles:
- Multiple store/item combinations as separate time series
- Train/test splitting for model validation
- Proper data transformations required by Prophet (see the sketch after this list)
- Missing value imputation and outlier detection
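
For illustration, the reshaping into Prophet's expected `ds`/`y` format and the chronological train/test split might look like the following sketch. The helper name and the `test_size` default are assumptions, not the project's actual code; the column names match the table above.

```python
# Illustrative sketch only: reshape raw sales rows into one Prophet-ready
# frame per (store, item) and split each series chronologically.
import pandas as pd

def to_prophet_frames(sales: pd.DataFrame, test_size: float = 0.2) -> dict:
    """Return {(store, item): (train_df, test_df)} with Prophet's ds/y columns."""
    series = {}
    for (store, item), group in sales.groupby(["store", "item"]):
        df = (
            group.sort_values("date")
                 .rename(columns={"date": "ds", "sales": "y"})
                 .loc[:, ["ds", "y", "price"]]
        )
        df["ds"] = pd.to_datetime(df["ds"])
        split = int(len(df) * (1 - test_size))  # chronological split, no shuffling
        series[(store, item)] = (df.iloc[:split], df.iloc[split:])
    return series
```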

<div align="center">
<br/>
<img alt="Data Visualization" src="assets/data_visualization.gif" width="70%">
<br/>
<p><em>Interactive visualization of historical sales patterns</em></p>
</div>

## 🚀 Pipeline Architecture

The project includes two primary pipelines:

### 1. Training Pipeline

The training pipeline performs the following steps:

1. **Data Loading**: Imports historical sales data from CSV files
2. **Data Preprocessing**:
- Transforms data into Prophet-compatible format
- Creates separate time series for each store-item combination
- Performs train/test splitting based on configurable ratio
3. **Model Training**:
   - Trains multiple Facebook Prophet models, one for each store-item combination (sketched after this list)
- Configures seasonality parameters based on domain knowledge
- Handles price changes as regressors when available
4. **Model Evaluation**:
- Calculates MAPE, RMSE, and MAE metrics on test data
- Generates visual diagnostics for model performance
5. **Forecasting**:
- Produces forecasts with uncertainty intervals
- Creates interactive HTML visualizations
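
As a rough sketch of steps 3 and 4 above (illustrative only; the project's actual step code may differ), per-series training and evaluation could look like this, reusing the `series` dictionary from the preprocessing sketch:

```python
# Illustrative sketch: fit one Prophet model per (store, item) series and
# score it on the held-out split with MAPE, RMSE, and MAE.
import numpy as np
from prophet import Prophet

def train_and_evaluate(series: dict):
    models, metrics = {}, {}
    for key, (train_df, test_df) in series.items():
        model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
        if "price" in train_df.columns:
            model.add_regressor("price")  # treat price as an extra regressor
        model.fit(train_df)

        pred = model.predict(test_df.drop(columns=["y"]))  # needs ds (+ regressor columns)
        actual = test_df["y"].to_numpy()
        err = actual - pred["yhat"].to_numpy()
        metrics[key] = {
            "mape": float(np.mean(np.abs(err / actual)) * 100),  # assumes non-zero actuals
            "rmse": float(np.sqrt(np.mean(err ** 2))),
            "mae": float(np.mean(np.abs(err))),
        }
        models[key] = model
    return models, metrics
```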

<div align="center">
<br/>
<img alt="Training Pipeline DAG" src="assets/training_pipeline.png" width="70%">
<br/>
<p><em>ZenML visualization of the training pipeline DAG</em></p>
</div>

### 2. Inference Pipeline

The inference pipeline enables fast forecasting with pre-trained models:

1. **Data Loading**: Imports the most recent sales data
2. **Data Preprocessing**: Transforms data into Prophet format
3. **Forecasting**: Generates predictions using production models (see the sketch after this list)
4. **Visualization**: Creates interactive dashboards with forecasts
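
A minimal sketch of the batch-forecasting part is shown below (hypothetical helper, not the project's actual step code; it assumes models trained without extra regressors — with a price regressor, the future frame would also need that column):

```python
# Illustrative sketch: build one combined forecast frame from a dictionary of
# pre-trained Prophet models, keeping the uncertainty interval columns.
import pandas as pd

def forecast_all(models: dict, periods: int = 30) -> pd.DataFrame:
    frames = []
    for (store, item), model in models.items():
        future = model.make_future_dataframe(periods=periods)
        fc = model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]
        fc["store"], fc["item"] = store, item
        frames.append(fc)
    return pd.concat(frames, ignore_index=True)
```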

<div align="center">
<br/>
<img alt="Inference Pipeline DAG" src="assets/inference_pipeline.png" width="70%">
<br/>
<p><em>ZenML visualization of the inference pipeline DAG</em></p>
</div>

## 📈 Model Details

The forecasting solution uses Facebook Prophet, chosen specifically for its combination of accuracy and simplicity in retail forecasting scenarios:

- **Multiple Models Approach**: Rather than a one-size-fits-all model, we generate individual Prophet models for each store-item combination, allowing forecasts that capture the unique patterns of each product in each location
- **Components**: Prophet automatically decomposes time series into trend, seasonality, and holidays
- **Seasonality**: Captures weekly, monthly, and yearly patterns in sales data
- **Special Events**: Handles holidays and promotions as custom seasonality effects (see the configuration sketch below)
- **Uncertainty Estimation**: Provides prediction intervals for better inventory planning
- **Extensibility**: Supports additional regressors like price and marketing spend

Prophet was selected for this solution because it excels at:
- Handling missing data and outliers common in retail sales data
- Automatically detecting seasonal patterns without extensive feature engineering
- Providing intuitive parameters that business users can understand
- Scaling to thousands of individual time series efficiently
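
For illustration, holiday and promotion information from `data/calendar.csv` could be wired into Prophet roughly as follows. The parameter values are assumptions, and the project may configure these effects differently:

```python
# Illustrative configuration sketch: explicit holidays plus a promotion flag
# as an extra regressor, with 90% uncertainty intervals.
import pandas as pd
from prophet import Prophet

calendar = pd.read_csv("data/calendar.csv", parse_dates=["date"])
holidays = (
    calendar.loc[calendar["is_holiday"] == 1, ["date"]]
    .rename(columns={"date": "ds"})
    .assign(holiday="retail_holiday")  # Prophet expects 'ds' and 'holiday' columns
)

model = Prophet(
    holidays=holidays,
    weekly_seasonality=True,
    yearly_seasonality=True,
    interval_width=0.90,  # width of the prediction intervals
)
model.add_regressor("is_promo")  # training frames must then carry an is_promo column
```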


## 💻 Technical Implementation

The project leverages ZenML's MLOps framework to provide:

- **Model Versioning**: Track all model versions and their performance metrics
- **Reproducibility**: All experiments are fully reproducible with tracked parameters
- **Pipeline Caching**: Speed up experimentation with intelligent caching of pipeline steps
- **Artifact Tracking**: All data and models are properly versioned and stored
- **Deployment Ready**: Models can be directly deployed to production environments

A key innovation in this project is the custom ProphetMaterializer (a simplified sketch follows this list), which enables:
- Serialization/deserialization of Prophet models for ZenML artifact storage
- Handling dictionaries of multiple Prophet models in a single artifact
- Efficient model loading for batch inference at scale
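
The sketch below shows what such a materializer can look like, using Prophet's built-in JSON serialization helpers. It handles a single model for brevity; the project's actual `ProphetMaterializer` also covers dictionaries of models and may differ in detail.

```python
# Simplified sketch of a ZenML materializer for a single Prophet model.
import os
from typing import Any, Type

from prophet import Prophet
from prophet.serialize import model_from_json, model_to_json
from zenml.enums import ArtifactType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer


class ProphetMaterializer(BaseMaterializer):
    ASSOCIATED_TYPES = (Prophet,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.MODEL

    def load(self, data_type: Type[Any]) -> Prophet:
        """Read the model back from the artifact store."""
        with fileio.open(os.path.join(self.uri, "model.json"), "r") as f:
            return model_from_json(f.read())

    def save(self, model: Prophet) -> None:
        """Write the fitted model to the artifact store as JSON."""
        with fileio.open(os.path.join(self.uri, "model.json"), "w") as f:
            f.write(model_to_json(model))
```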

<div align="center">
<br/>
<img alt="ZenML Dashboard" src="assets/zenml_dashboard.png" width="70%">
<br/>
<p><em>ZenML model registry tracking model versions and performance</em></p>
</div>

## 🛠️ Getting Started

### Prerequisites

- Python 3.8+
- ZenML installed and configured

### Installation

```bash
# Clone the repository
git clone https://github.com/zenml-io/zenml-projects.git
cd zenml-projects/retail-forecast

# Install dependencies
pip install -r requirements.txt

# Initialize ZenML (if needed)
zenml init
```

### Running the Pipelines

To train models and generate forecasts:

```bash
# Run the training pipeline (default)
python run.py

# Run with custom parameters
python run.py --forecast-periods 60 --test-size 0.3 --weekly-seasonality True
```

To make predictions using existing models:

```bash
# Run the inference pipeline
python run.py --inference
```

### Viewing Results

Start the ZenML dashboard:

```bash
zenml login
```

Navigate to the dashboard to explore:
- Pipeline runs and their status
- Model performance metrics
- Interactive forecast visualizations
- Version history of all models

## 🔄 Integration with Retail Systems

This solution can be integrated with existing retail systems:

- **Inventory Management**: Connect forecasts to automatic reordering systems
- **ERP Systems**: Feed forecasts into financial planning modules
- **BI Dashboards**: Export forecasts to Tableau, Power BI, or similar tools
- **Supply Chain**: Share forecasts with suppliers via API endpoints

## 📊 Example Use Case: Store-Level Demand Planning

A retail chain with 50 stores and 500 products uses this pipeline to:

1. Train models on 2 years of historical sales data
2. Generate daily forecasts for the next 30 days for each store-item combination
3. Aggregate forecasts to support central purchasing decisions
4. Update models weekly with new sales data

The result: 15% reduction in stockouts and 20% decrease in excess inventory.


## 📄 License

This project is licensed under the Apache License 2.0.
Binary file added retail-forecast/assets/data_visualization.gif
Binary file added retail-forecast/assets/forecast_dashboard.png
Contributor comment:
I wonder if this image could have a bit more of the zenml dashboard visible just so we reinforce the fact that it's visualizations from within zenml? or a GIF perhaps showing that this is interactive?

Contributor comment:
Or actually I see you added a GIF below too. Maybe leave it then.

Binary file added retail-forecast/assets/inference_pipeline.png
Binary file added retail-forecast/assets/training_pipeline.png
Contributor comment:
Nothing actionable, but just noting that our DAG visualizer starts to look a bit scruffy when the DAG is complex.

Binary file added retail-forecast/assets/zenml_dashboard.png
91 changes: 91 additions & 0 deletions retail-forecast/data/calendar.csv
@@ -0,0 +1,91 @@
date,weekday,month,is_weekend,is_holiday,is_promo
2024-01-01,0,1,0,1,0
2024-01-02,1,1,0,0,0
2024-01-03,2,1,0,0,0
2024-01-04,3,1,0,0,0
2024-01-05,4,1,0,0,0
2024-01-06,5,1,1,0,0
2024-01-07,6,1,1,0,0
2024-01-08,0,1,0,0,0
2024-01-09,1,1,0,0,0
2024-01-10,2,1,0,0,1
2024-01-11,3,1,0,0,1
2024-01-12,4,1,0,0,1
2024-01-13,5,1,1,0,1
2024-01-14,6,1,1,0,1
2024-01-15,0,1,0,1,1
2024-01-16,1,1,0,0,1
2024-01-17,2,1,0,0,1
2024-01-18,3,1,0,0,1
2024-01-19,4,1,0,0,1
2024-01-20,5,1,1,0,1
2024-01-21,6,1,1,0,0
2024-01-22,0,1,0,0,0
2024-01-23,1,1,0,0,0
2024-01-24,2,1,0,0,0
2024-01-25,3,1,0,0,0
2024-01-26,4,1,0,0,0
2024-01-27,5,1,1,0,0
2024-01-28,6,1,1,0,0
2024-01-29,0,1,0,0,0
2024-01-30,1,1,0,0,0
2024-01-31,2,1,0,0,0
2024-02-01,3,2,0,1,0
2024-02-02,4,2,0,0,0
2024-02-03,5,2,1,0,0
2024-02-04,6,2,1,0,0
2024-02-05,0,2,0,0,0
2024-02-06,1,2,0,0,0
2024-02-07,2,2,0,0,0
2024-02-08,3,2,0,0,0
2024-02-09,4,2,0,0,0
2024-02-10,5,2,1,0,1
2024-02-11,6,2,1,0,1
2024-02-12,0,2,0,0,1
2024-02-13,1,2,0,0,1
2024-02-14,2,2,0,0,1
2024-02-15,3,2,0,1,1
2024-02-16,4,2,0,0,1
2024-02-17,5,2,1,0,1
2024-02-18,6,2,1,0,1
2024-02-19,0,2,0,0,1
2024-02-20,1,2,0,0,1
2024-02-21,2,2,0,0,0
2024-02-22,3,2,0,0,0
2024-02-23,4,2,0,0,0
2024-02-24,5,2,1,0,0
2024-02-25,6,2,1,0,0
2024-02-26,0,2,0,0,0
2024-02-27,1,2,0,0,0
2024-02-28,2,2,0,0,0
2024-02-29,3,2,0,0,0
2024-03-01,4,3,0,1,0
2024-03-02,5,3,1,0,0
2024-03-03,6,3,1,0,0
2024-03-04,0,3,0,0,0
2024-03-05,1,3,0,0,0
2024-03-06,2,3,0,0,0
2024-03-07,3,3,0,0,0
2024-03-08,4,3,0,0,0
2024-03-09,5,3,1,0,0
2024-03-10,6,3,1,0,1
2024-03-11,0,3,0,0,1
2024-03-12,1,3,0,0,1
2024-03-13,2,3,0,0,1
2024-03-14,3,3,0,0,1
2024-03-15,4,3,0,1,1
2024-03-16,5,3,1,0,1
2024-03-17,6,3,1,0,1
2024-03-18,0,3,0,0,1
2024-03-19,1,3,0,0,1
2024-03-20,2,3,0,0,1
2024-03-21,3,3,0,0,0
2024-03-22,4,3,0,0,0
2024-03-23,5,3,1,0,0
2024-03-24,6,3,1,0,0
2024-03-25,0,3,0,0,0
2024-03-26,1,3,0,0,0
2024-03-27,2,3,0,0,0
2024-03-28,3,3,0,0,0
2024-03-29,4,3,0,0,0
2024-03-30,5,3,1,0,0