samirsaci/ml-forecast-features-eng

Machine Learning for Retail Sales Forecasting — Feature Engineering 📈

Understand how additional features related to stock-outs, store closing dates, or cannibalisation affect a Machine Learning model for sales forecasting.

According to the findings of the most recent Makridakis Forecasting Competitions, Machine Learning models can reduce forecasting error by 20% to 60% compared with benchmark statistical models.

Their main advantage is the ability to integrate external features that strongly affect sales variability.

For example, e-commerce cosmetics sales are driven by special events (promotions) and by how prominently a product is displayed on the website (first page, second page, …).

This process, called feature engineering, relies on analytical concepts and business insights to understand what could drive your sales.

Article

In this article, we examine the impact of several features on model accuracy using the M5 Forecasting competition dataset.

Experiment

Based on business insights and common sense, we will add features derived from existing ones to help the model capture the key factors affecting customer demand.
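As a rough illustration of what such derived features can look like, here is a minimal pandas sketch. It is not the notebook's implementation: the toy data, the column names (`units`, `price`, `store_closed`, `stock_out`, `on_promo`, `other_promos`) and the heuristics behind each flag are assumptions chosen only to show the idea.

```python
import pandas as pd

# Toy long-format sales data (hypothetical values, not taken from the M5 dataset).
sales = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04"] * 2),
    "store_id": ["S1"] * 8,
    "item_id": ["A"] * 4 + ["B"] * 4,
    "units": [5, 0, 0, 4, 3, 2, 0, 6],
    "price": [2.0, 2.0, 2.0, 2.0, 3.0, 2.4, 3.0, 3.0],
})


def add_features(df):
    """Add simple, assumed proxies for store closure, stock-out and cannibalisation."""
    df = df.sort_values(["store_id", "item_id", "date"]).copy()

    # Store closed (assumption): no item sold a single unit in the store that day.
    store_units = df.groupby(["store_id", "date"])["units"].transform("sum")
    df["store_closed"] = (store_units == 0).astype(int)

    # Possible stock-out (assumption): the item sold nothing while the store was open.
    df["stock_out"] = ((df["units"] == 0) & (df["store_closed"] == 0)).astype(int)

    # Promotion proxy (assumption): price below the item's highest observed price.
    ref_price = df.groupby(["store_id", "item_id"])["price"].transform("max")
    df["on_promo"] = (df["price"] < ref_price).astype(int)

    # Cannibalisation proxy: number of OTHER items on promotion in the same store and day.
    promo_total = df.groupby(["store_id", "date"])["on_promo"].transform("sum")
    df["other_promos"] = promo_total - df["on_promo"]
    return df


features = add_features(sales)
print(features[["date", "item_id", "units", "store_closed", "stock_out", "on_promo", "other_promos"]])
```

On real data these rules would need tuning. For instance, a slow-moving item can sell zero units without being out of stock, so a zero-sales day is only a weak signal on its own.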

Data set

This analysis will be based on the M5 Forecasting dataset of Walmart store sales records (Link).

Code

  1. Create a folder named Data in the directory where the notebook is located
  2. Download all the files of the Kaggle forecasting competition (Link) into that folder
  3. Launch the notebook
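
Before launching the notebook, the setup above can be checked with a small helper like the one sketched below. The function name `missing_data_files` is hypothetical, and the file names in `EXPECTED_FILES` are assumed from the Kaggle M5 download, so adjust them to match what the notebook actually reads.

```python
from pathlib import Path

# Assumed file names from the Kaggle M5 download; adjust if your files differ.
EXPECTED_FILES = ["calendar.csv", "sell_prices.csv", "sales_train_validation.csv"]


def missing_data_files(notebook_dir=".", folder="Data", expected=EXPECTED_FILES):
    """Create the data folder if needed and return the expected files not found in it."""
    data_dir = Path(notebook_dir) / folder
    data_dir.mkdir(parents=True, exist_ok=True)
    return [name for name in expected if not (data_dir / name).is_file()]
```

Calling `missing_data_files()` from the notebook's directory returns an empty list once every expected file is in place.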

About me 🤓

Senior Supply Chain and Data Science consultant with international experience working on Logistics and Transportation operations.
For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.
Please have a look at my personal blog: Personal Website