Forecast reader #204
Changes from 19 commits
ed3c3e1
5754185
3e4c1de
1befbeb
a753ad6
c11786c
04dd143
f9e12fe
4a4efbb
470a961
e71137f
09fb8dc
70894ef
b4c3eec
686093e
08fbc1d
cf36f3c
70238b4
a6c79d0
fa12c8e
53e64ea
042d9b8
d919c4e
515504c
827ff2f
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| # Sandbox base image | ||
| FROM zenmldocker/zenml-sandbox:latest | ||
|
|
||
| # Install uv from official distroless image | ||
| COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/ | ||
|
|
||
| # Set uv environment variables for optimization | ||
| ENV UV_SYSTEM_PYTHON=1 | ||
| ENV UV_COMPILE_BYTECODE=1 | ||
|
|
||
| # Project metadata | ||
| LABEL project_name="retail-forecast" | ||
| LABEL project_version="0.1.0" | ||
|
|
||
| # Install dependencies with uv and cache optimization | ||
| RUN --mount=type=cache,target=/root/.cache/uv \ | ||
| uv pip install --system \ | ||
| "zenml>=0.82.0" \ | ||
| "numpy>=1.20.0" \ | ||
| "pandas>=1.3.0" \ | ||
| "matplotlib>=3.5.0" \ | ||
| "prophet>=1.1.0" \ | ||
| "typing_extensions>=4.0.0" \ | ||
| "pyarrow" \ | ||
| "fastparquet" \ | ||
| "plotly" \ | ||
| "notebook" | ||
|
|
||
| # Set workspace directory | ||
| WORKDIR /workspace | ||
|
|
||
| # Shallow-clone the repository and copy only the project directory | ||
| RUN git clone --depth 1 https://github.com/zenml-io/zenml-projects.git /tmp/zenml-projects && \ | ||
| cp -r /tmp/zenml-projects/retail-forecast/* /workspace/ && \ | ||
| rm -rf /tmp/zenml-projects | ||
|
|
||
| # VSCode settings | ||
| RUN mkdir -p /workspace/.vscode && \ | ||
| printf '{\n "workbench.colorTheme": "Default Dark Modern"\n}' > /workspace/.vscode/settings.json | ||
|
|
||
| # Create assets directory for visualizations | ||
| RUN mkdir -p /workspace/assets |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,210 @@ | ||
| # RetailForecast: Production-Ready Sales Forecasting with ZenML and Prophet | ||
|
|
||
| A robust MLOps pipeline for retail sales forecasting designed for retail data scientists and ML engineers. | ||
|
|
||
| ## 📊 Business Context | ||
|
|
||
| In retail, accurate demand forecasting is critical for optimizing inventory, staff scheduling, and financial planning. This project provides a production-ready sales forecasting solution that can be immediately deployed in retail environments to: | ||
|
|
||
| - Predict future sales volumes across multiple stores and products | ||
| - Capture seasonal patterns and trends in customer purchasing behavior | ||
| - Support data-driven inventory management and purchasing decisions | ||
| - Provide actionable insights through visual forecasting dashboards | ||
|
|
||
| <div align="center"> | ||
| <br/> | ||
| <img alt="Forecast Dashboard" src="assets/forecast_dashboard.png" width="70%"> | ||
| <br/> | ||
| <p><em>HTML dashboard visualization showing forecasts with uncertainty intervals</em></p> | ||
| </div> | ||
|
|
||
| ## 🔍 Data Overview | ||
|
|
||
| The pipeline works with time-series retail sales data structured as follows: | ||
|
|
||
| | Field | Description | | ||
| |-------|-------------| | ||
| | date | Date of sales record (YYYY-MM-DD) | | ||
| | store | Store identifier (e.g., Store_1, Store_2) | | ||
| | item | Product identifier (e.g., Item_A, Item_B) | | ||
| | sales | Number of units sold | | ||
| | price | Unit price | | ||
|
|
||
| The system automatically handles: | ||
| - Multiple store/item combinations as separate time series | ||
| - Train/test splitting for model validation | ||
| - Reshaping data into the `ds` (date) and `y` (target) columns that Prophet requires | ||
| - Missing value imputation and outlier detection | ||
|
|
||
| <div align="center"> | ||
| <br/> | ||
| <img alt="Data Visualization" src="assets/data_visualization.gif" width="70%"> | ||
| <br/> | ||
| <p><em>Interactive visualization of historical sales patterns</em></p> | ||
| </div> | ||
|
|
||
| ## 🚀 Pipeline Architecture | ||
|
|
||
| The project includes two primary pipelines: | ||
|
|
||
| ### 1. Training Pipeline | ||
|
|
||
| The training pipeline performs the following steps: | ||
|
|
||
| 1. **Data Loading**: Imports historical sales data from CSV files | ||
| 2. **Data Preprocessing**: | ||
| - Transforms data into Prophet-compatible format | ||
| - Creates separate time series for each store-item combination | ||
| - Performs train/test splitting based on configurable ratio | ||
| 3. **Model Training**: | ||
| - Trains multiple Facebook Prophet models simultaneously, one for each store-item combination | ||
| - Configures seasonality parameters based on domain knowledge | ||
| - Handles price changes as regressors when available | ||
| 4. **Model Evaluation**: | ||
| - Calculates MAPE, RMSE, and MAE metrics on test data | ||
| - Generates visual diagnostics for model performance | ||
| 5. **Forecasting**: | ||
| - Produces forecasts with uncertainty intervals | ||
| - Creates interactive HTML visualizations | ||
|
|
||
| <div align="center"> | ||
| <br/> | ||
| <img alt="Training Pipeline DAG" src="assets/training_pipeline.png" width="70%"> | ||
| <br/> | ||
| <p><em>ZenML visualization of the training pipeline DAG</em></p> | ||
| </div> | ||
|
|
||
| ### 2. Inference Pipeline | ||
|
|
||
| The inference pipeline enables fast forecasting with pre-trained models: | ||
|
|
||
| 1. **Data Loading**: Imports the most recent sales data | ||
| 2. **Data Preprocessing**: Transforms data into Prophet format | ||
| 3. **Forecasting**: Generates predictions using production models | ||
| 4. **Visualization**: Creates interactive dashboards with forecasts | ||
|
|
||
| <div align="center"> | ||
| <br/> | ||
| <img alt="Inference Pipeline DAG" src="assets/inference_pipeline.png" width="70%"> | ||
| <br/> | ||
| <p><em>ZenML visualization of the inference pipeline DAG</em></p> | ||
| </div> | ||
|
|
||
| ## 📈 Model Details | ||
|
|
||
| The forecasting solution uses Facebook Prophet, chosen specifically for its combination of accuracy and simplicity in retail forecasting scenarios: | ||
|
|
||
| - **Multiple Models Approach**: Rather than a one-size-fits-all model, we generate individual Prophet models for each store-item combination, allowing forecasts that capture the unique patterns of each product in each location | ||
| - **Components**: Prophet automatically decomposes time series into trend, seasonality, and holidays | ||
| - **Seasonality**: Captures weekly and yearly patterns through Prophet's built-in seasonalities; monthly patterns need a custom seasonality | ||
| - **Special Events**: Handles holidays and promotions as custom seasonality effects | ||
| - **Uncertainty Estimation**: Provides prediction intervals for better inventory planning | ||
| - **Extensibility**: Supports additional regressors like price and marketing spend | ||
|
|
||
| Prophet was selected for this solution because it excels at: | ||
| - Handling missing data and outliers common in retail sales data | ||
| - Automatically detecting seasonal patterns without extensive feature engineering | ||
| - Providing intuitive parameters that business users can understand | ||
| - Scaling to thousands of individual time series efficiently | ||
|
|
||
|
|
||
| ## 💻 Technical Implementation | ||
|
|
||
| The project leverages ZenML's MLOps framework to provide: | ||
|
|
||
| - **Model Versioning**: Track all model versions and their performance metrics | ||
| - **Reproducibility**: All experiments are fully reproducible with tracked parameters | ||
| - **Pipeline Caching**: Speed up experimentation with intelligent caching of pipeline steps | ||
| - **Artifact Tracking**: All data and models are properly versioned and stored | ||
| - **Deployment Ready**: Models can be directly deployed to production environments | ||
|
|
||
| The project includes a custom `ProphetMaterializer` that enables: | ||
htahir1 marked this conversation as resolved.
|
||
| - Serialization/deserialization of Prophet models for ZenML artifact storage | ||
| - Handling dictionaries of multiple Prophet models in a single artifact | ||
| - Efficient model loading for batch inference at scale | ||
htahir1 marked this conversation as resolved.
|
||
|
|
||
| <div align="center"> | ||
| <br/> | ||
| <img alt="ZenML Dashboard" src="assets/zenml_dashboard.png" width="70%"> | ||
| <br/> | ||
| <p><em>ZenML model registry tracking model versions and performance</em></p> | ||
| </div> | ||
|
|
||
| ## 🛠️ Getting Started | ||
|
|
||
| ### Prerequisites | ||
|
|
||
| - Python 3.8+ | ||
htahir1 marked this conversation as resolved.
|
||
| - ZenML installed and configured | ||
|
|
||
| ### Installation | ||
|
|
||
| ```bash | ||
| # Clone the repository | ||
| git clone https://github.com/zenml-io/zenml-projects.git | ||
| cd zenml-projects/retail-forecast | ||
|
|
||
| # Install dependencies | ||
| pip install -r requirements.txt | ||
|
|
||
| # Initialize ZenML (if needed) | ||
| zenml init | ||
| ``` | ||
|
|
||
| ### Running the Pipelines | ||
|
|
||
| To train models and generate forecasts: | ||
|
|
||
| ```bash | ||
| # Run the training pipeline (default) | ||
| python run.py | ||
|
|
||
| # Run with custom parameters | ||
| python run.py --forecast-periods 60 --test-size 0.3 --weekly-seasonality True | ||
| ``` | ||
|
|
||
| To make predictions using existing models: | ||
|
|
||
| ```bash | ||
| # Run the inference pipeline | ||
| python run.py --inference | ||
| ``` | ||
|
|
||
| ### Viewing Results | ||
|
|
||
| Log in to your ZenML server to open the dashboard: | ||
|
|
||
| ```bash | ||
| zenml login | ||
| ``` | ||
|
|
||
| Navigate to the dashboard to explore: | ||
| - Pipeline runs and their status | ||
| - Model performance metrics | ||
| - Interactive forecast visualizations | ||
| - Version history of all models | ||
|
|
||
| ## 🔄 Integration with Retail Systems | ||
|
|
||
| This solution can be integrated with existing retail systems: | ||
|
|
||
| - **Inventory Management**: Connect forecasts to automatic reordering systems | ||
| - **ERP Systems**: Feed forecasts into financial planning modules | ||
| - **BI Dashboards**: Export forecasts to Tableau, Power BI, or similar tools | ||
| - **Supply Chain**: Share forecasts with suppliers via API endpoints | ||
|
|
||
| ## 📊 Example Use Case: Store-Level Demand Planning | ||
|
|
||
| A retail chain with 50 stores and 500 products uses this pipeline to: | ||
|
|
||
| 1. Train models on 2 years of historical sales data | ||
| 2. Generate daily forecasts for the next 30 days for each store-item combination | ||
| 3. Aggregate forecasts to support central purchasing decisions | ||
| 4. Update models weekly with new sales data | ||
|
|
||
| The result: 15% reduction in stockouts and 20% decrease in excess inventory. | ||
|
|
||
|
|
||
| ## 📄 License | ||
|
|
||
| This project is licensed under the Apache License 2.0. | ||
|
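The preprocessing steps described in the README above (one time series per store-item pair, Prophet-style columns, a ratio-based train/test split) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from this PR: the function names `group_series` and `chronological_split` are hypothetical, the sample rows are made up, and the split is assumed to be chronological rather than random. The only detail taken from Prophet itself is its requirement for a `ds` date column and a `y` target column.

```python
from collections import defaultdict
from datetime import date


def group_series(rows):
    """Group raw sales rows into one Prophet-style series per (store, item)."""
    series = defaultdict(list)
    for row in rows:
        key = (row["store"], row["item"])
        # Prophet expects the date under "ds" and the target under "y".
        series[key].append(
            {"ds": date.fromisoformat(row["date"]), "y": float(row["sales"])}
        )
    # Sort each series by date so a later split is time-ordered.
    return {
        key: sorted(points, key=lambda point: point["ds"])
        for key, points in series.items()
    }


def chronological_split(points, test_size=0.2):
    """Hold out the last `test_size` fraction of a sorted series for testing."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    cutoff = len(points) - int(round(len(points) * test_size))
    return points[:cutoff], points[cutoff:]


# Made-up sample rows using the field names from the README's data table.
rows = [
    {"date": "2024-01-03", "store": "Store_1", "item": "Item_A", "sales": 12, "price": 9.99},
    {"date": "2024-01-01", "store": "Store_1", "item": "Item_A", "sales": 10, "price": 9.99},
    {"date": "2024-01-02", "store": "Store_1", "item": "Item_A", "sales": 11, "price": 9.99},
    {"date": "2024-01-05", "store": "Store_1", "item": "Item_A", "sales": 15, "price": 9.99},
    {"date": "2024-01-04", "store": "Store_1", "item": "Item_A", "sales": 13, "price": 9.99},
    {"date": "2024-01-01", "store": "Store_2", "item": "Item_B", "sales": 4, "price": 4.50},
    {"date": "2024-01-02", "store": "Store_2", "item": "Item_B", "sales": 6, "price": 4.50},
]

all_series = group_series(rows)
train, test = chronological_split(all_series[("Store_1", "Item_A")], test_size=0.2)
```

Splitting by time rather than at random keeps future observations out of the training set, which is the usual choice for forecasting problems.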
Contributor
I wonder if this image could have a bit more of the zenml dashboard visible just so we reinforce the fact that it's visualizations from within zenml? or a GIF perhaps showing that this is interactive?
Contributor
Or actually I see you added a GIF below too. Maybe leave it then. |
|
Contributor
Nothing actionable, but just noting that our DAG visualizer starts to look a bit scruffy when the DAG is complex. |
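The README's evaluation step reports MAPE, RMSE, and MAE on held-out test data. As a reference for what those three numbers mean, here is a minimal standard-library sketch. The function names are hypothetical and this is not the evaluation code from the PR. One assumption to flag: MAPE is undefined when an actual value is zero, which is common for slow-moving retail items, so this version skips those points; other conventions exist.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the same units as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE does."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: points whose actual value is zero are skipped, because the
    percentage error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Made-up numbers for illustration only.
actual_sales = [100.0, 200.0, 400.0]
forecast_sales = [110.0, 180.0, 400.0]

scores = {
    "mae": mae(actual_sales, forecast_sales),
    "rmse": rmse(actual_sales, forecast_sales),
    "mape": mape(actual_sales, forecast_sales),
}
```

MAE and RMSE are expressed in units sold, while MAPE is scale-free, which makes it easier to compare across store-item combinations with very different sales volumes.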
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| date,weekday,month,is_weekend,is_holiday,is_promo | ||
| 2024-01-01,0,1,0,1,0 | ||
| 2024-01-02,1,1,0,0,0 | ||
| 2024-01-03,2,1,0,0,0 | ||
| 2024-01-04,3,1,0,0,0 | ||
| 2024-01-05,4,1,0,0,0 | ||
| 2024-01-06,5,1,1,0,0 | ||
| 2024-01-07,6,1,1,0,0 | ||
| 2024-01-08,0,1,0,0,0 | ||
| 2024-01-09,1,1,0,0,0 | ||
| 2024-01-10,2,1,0,0,1 | ||
| 2024-01-11,3,1,0,0,1 | ||
| 2024-01-12,4,1,0,0,1 | ||
| 2024-01-13,5,1,1,0,1 | ||
| 2024-01-14,6,1,1,0,1 | ||
| 2024-01-15,0,1,0,1,1 | ||
| 2024-01-16,1,1,0,0,1 | ||
| 2024-01-17,2,1,0,0,1 | ||
| 2024-01-18,3,1,0,0,1 | ||
| 2024-01-19,4,1,0,0,1 | ||
| 2024-01-20,5,1,1,0,1 | ||
| 2024-01-21,6,1,1,0,0 | ||
| 2024-01-22,0,1,0,0,0 | ||
| 2024-01-23,1,1,0,0,0 | ||
| 2024-01-24,2,1,0,0,0 | ||
| 2024-01-25,3,1,0,0,0 | ||
| 2024-01-26,4,1,0,0,0 | ||
| 2024-01-27,5,1,1,0,0 | ||
| 2024-01-28,6,1,1,0,0 | ||
| 2024-01-29,0,1,0,0,0 | ||
| 2024-01-30,1,1,0,0,0 | ||
| 2024-01-31,2,1,0,0,0 | ||
| 2024-02-01,3,2,0,1,0 | ||
| 2024-02-02,4,2,0,0,0 | ||
| 2024-02-03,5,2,1,0,0 | ||
| 2024-02-04,6,2,1,0,0 | ||
| 2024-02-05,0,2,0,0,0 | ||
| 2024-02-06,1,2,0,0,0 | ||
| 2024-02-07,2,2,0,0,0 | ||
| 2024-02-08,3,2,0,0,0 | ||
| 2024-02-09,4,2,0,0,0 | ||
| 2024-02-10,5,2,1,0,1 | ||
| 2024-02-11,6,2,1,0,1 | ||
| 2024-02-12,0,2,0,0,1 | ||
| 2024-02-13,1,2,0,0,1 | ||
| 2024-02-14,2,2,0,0,1 | ||
| 2024-02-15,3,2,0,1,1 | ||
| 2024-02-16,4,2,0,0,1 | ||
| 2024-02-17,5,2,1,0,1 | ||
| 2024-02-18,6,2,1,0,1 | ||
| 2024-02-19,0,2,0,0,1 | ||
| 2024-02-20,1,2,0,0,1 | ||
| 2024-02-21,2,2,0,0,0 | ||
| 2024-02-22,3,2,0,0,0 | ||
| 2024-02-23,4,2,0,0,0 | ||
| 2024-02-24,5,2,1,0,0 | ||
| 2024-02-25,6,2,1,0,0 | ||
| 2024-02-26,0,2,0,0,0 | ||
| 2024-02-27,1,2,0,0,0 | ||
| 2024-02-28,2,2,0,0,0 | ||
| 2024-02-29,3,2,0,0,0 | ||
| 2024-03-01,4,3,0,1,0 | ||
| 2024-03-02,5,3,1,0,0 | ||
| 2024-03-03,6,3,1,0,0 | ||
| 2024-03-04,0,3,0,0,0 | ||
| 2024-03-05,1,3,0,0,0 | ||
| 2024-03-06,2,3,0,0,0 | ||
| 2024-03-07,3,3,0,0,0 | ||
| 2024-03-08,4,3,0,0,0 | ||
| 2024-03-09,5,3,1,0,0 | ||
| 2024-03-10,6,3,1,0,1 | ||
| 2024-03-11,0,3,0,0,1 | ||
| 2024-03-12,1,3,0,0,1 | ||
| 2024-03-13,2,3,0,0,1 | ||
| 2024-03-14,3,3,0,0,1 | ||
| 2024-03-15,4,3,0,1,1 | ||
| 2024-03-16,5,3,1,0,1 | ||
| 2024-03-17,6,3,1,0,1 | ||
| 2024-03-18,0,3,0,0,1 | ||
| 2024-03-19,1,3,0,0,1 | ||
| 2024-03-20,2,3,0,0,1 | ||
| 2024-03-21,3,3,0,0,0 | ||
| 2024-03-22,4,3,0,0,0 | ||
| 2024-03-23,5,3,1,0,0 | ||
| 2024-03-24,6,3,1,0,0 | ||
| 2024-03-25,0,3,0,0,0 | ||
| 2024-03-26,1,3,0,0,0 | ||
| 2024-03-27,2,3,0,0,0 | ||
| 2024-03-28,3,3,0,0,0 | ||
| 2024-03-29,4,3,0,0,0 | ||
| 2024-03-30,5,3,1,0,0 |
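The `weekday`, `month`, and `is_weekend` columns in the CSV above appear to follow Python's convention of Monday = 0 through Sunday = 6, with Saturday and Sunday flagged as weekend. A minimal sketch of deriving them from the date is shown below. The helper name `calendar_features` is hypothetical and not part of the PR. The `is_holiday` and `is_promo` columns cannot be derived from the date alone, so they are left out.

```python
from datetime import date


def calendar_features(day):
    """Derive the date-based columns of the calendar CSV for one day."""
    weekday = day.weekday()  # Monday == 0 ... Sunday == 6
    return {
        "date": day.isoformat(),
        "weekday": weekday,
        "month": day.month,
        "is_weekend": int(weekday >= 5),  # Saturday (5) or Sunday (6)
    }


# Spot-check against three rows copied from the CSV diff.
checked = [
    calendar_features(date(2024, 1, 1)),
    calendar_features(date(2024, 1, 6)),
    calendar_features(date(2024, 2, 29)),
]
```

A check like this can be run over the whole file to catch rows where the derived columns drift out of sync with the date.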