A comprehensive comparison of time series forecasting techniques applied to hourly energy consumption data, from classical statistical models (ARIMA, SARIMAX) to modern approaches (Prophet, N-HiTS). This project demonstrates the evolution and performance differences between traditional and state-of-the-art forecasting methods.
The dataset used is the Kaggle Hourly Energy Consumption dataset, specifically the PJME (PJM East) region.
Dataset Characteristics:
- 145,392 hourly observations from January 2002 to August 2018
- Original columns: Datetime, PJME_MW (consumption in megawatts)
- Strong seasonality patterns: hourly, daily, weekly, and annual cycles
- Clear seasonal patterns with higher consumption during summer (air conditioning) and winter (heating) months
The dataset underwent thorough cleaning and feature engineering to create a robust foundation for modeling:
Cleaning Steps:
- Duplicate removal: Eliminated duplicate timestamps
- Missing value handling: Interpolated gaps up to 5 consecutive hours using time-based interpolation
- Reindexing: Created complete hourly grid from 2002-01-01 00:00 to 2018-08-03 05:00
- DST correction: Handled days with 23 or 25 hours due to Daylight Saving Time transitions
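A minimal pandas sketch of the cleaning steps above (column names Datetime and PJME_MW as in the raw dataset; the real pipeline additionally handles DST transitions explicitly):

```python
import pandas as pd

def clean_hourly(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["Datetime"] = pd.to_datetime(df["Datetime"])
    # 1) Duplicate removal: keep the first reading per timestamp
    #    (DST fall-back produces duplicated hours).
    df = df.drop_duplicates(subset="Datetime").set_index("Datetime").sort_index()
    # 2) Reindexing: complete hourly grid (DST spring-forward leaves gaps).
    full_index = pd.date_range(df.index.min(), df.index.max(), freq="h")
    df = df.reindex(full_index)
    # 3) Missing value handling: time-based interpolation, gaps up to 5 hours.
    df["PJME_MW"] = df["PJME_MW"].interpolate(method="time", limit=5)
    return df
```

Longer gaps stay as NaN under the `limit=5` cap, so unrecoverable outages are not silently invented.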
Feature Engineering:
- Hour (0-23): Hour of the day
- Month (1-12): Month of the year
- DayOfWeek (0-6): Day of the week (Monday=0, Sunday=6)
- is_weekend (Boolean): Weekend indicator
- is_holiday (Boolean): US federal holiday indicator
Final Dataset: datetime, consumption, Hour, Month, DayOfWeek, DayOfWeekName, is_weekend, is_holiday
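The calendar features can be derived with pandas alone; this sketch assumes a DatetimeIndex and uses pandas' built-in US federal holiday calendar (the project may use a different holiday source):

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

def add_calendar_features(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["Hour"] = df.index.hour            # 0-23
    df["Month"] = df.index.month          # 1-12
    df["DayOfWeek"] = df.index.dayofweek  # Monday=0 ... Sunday=6
    df["DayOfWeekName"] = df.index.day_name()
    df["is_weekend"] = df["DayOfWeek"] >= 5
    # Mark US federal holidays within the data range
    holidays = USFederalHolidayCalendar().holidays(df.index.min(), df.index.max())
    df["is_holiday"] = df.index.normalize().isin(holidays)
    return df
```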
ARIMA is the most widely known classical statistical model for time series analysis, defined by three parameters (p, d, q).
Model Components:
- AR (AutoRegressive) - p: Regression of the series on its own past values. Parameter p specifies how many lagged observations to include in the model.
- I (Integrated) - d: Number of differencing operations needed to make the series stationary (i.e., statistical properties like mean and variance don't change over time). Parameter d specifies the order of differencing.
- MA (Moving Average) - q: Regression on past forecast errors (residuals). Parameter q specifies how many lagged forecast errors to include in the model.
Implementation Details:
- Hyperparameter tuning: Used Optuna for automated hyperparameter optimization
- Parameter selection: ACF (AutoCorrelation Function) and PACF (Partial AutoCorrelation Function) plots analyzed to guide parameter search space
- Final configuration: Order (3, 1, 0) - 3 autoregressive terms, 1st order differencing, no moving average terms
- MLflow tracking: All experiments, parameters, and metrics logged for reproducibility
- Final test MAPE: 18.15% (excluded from main comparison due to different test set partition)
SARIMAX extends ARIMA by combining two separate ARIMA processes, one for non-seasonal patterns (p, d, q) and one for seasonal patterns (P, D, Q, s), while also incorporating exogenous features.
Model Structure:
- Non-seasonal component (p, d, q): Captures short-term patterns and trends
- Seasonal component (P, D, Q, s): Captures recurring patterns at the seasonal lag s
- Exogenous variables (X): External factors that influence the target variable
Implementation Details:
- Dataset scope: Due to computational constraints, trained only on 2013-2016 data (35,064 hours)
- Test set: 2017-2018 (13,896 hours)
- Hyperparameter tuning: 70 trials using Optuna optimization
- Final configuration: (2, 0, 1) × (2, 1, 2, 24)
- Non-seasonal: 2 AR terms, no differencing, 1 MA term
- Seasonal: 2 seasonal AR terms, 1 seasonal differencing, 2 seasonal MA terms, period = 24 hours (daily seasonality)
Exogenous Features:
- Hour, Month, DayOfWeek: Temporal indicators
- is_holiday: Binary holiday indicator
- Hour_sin, Hour_cos, Month_sin, Month_cos: Cyclical encodings (capture that 23:00 and 00:00 are adjacent, as are December and January)
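The cyclical encodings map each periodic feature onto the unit circle, so boundary values end up numerically close; a sketch of how they can be computed:

```python
import numpy as np
import pandas as pd

def add_cyclical(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Hour 23 and hour 0 become neighbors on the circle
    df["Hour_sin"] = np.sin(2 * np.pi * df["Hour"] / 24)
    df["Hour_cos"] = np.cos(2 * np.pi * df["Hour"] / 24)
    # December (12) and January (1) likewise end up adjacent
    df["Month_sin"] = np.sin(2 * np.pi * (df["Month"] - 1) / 12)
    df["Month_cos"] = np.cos(2 * np.pi * (df["Month"] - 1) / 12)
    return df
```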
Prophet is an open-source forecasting library developed by Meta that uses an additive decomposition model particularly well-suited for business time series with strong seasonal patterns and missing data.
Model Equation:
y(t) = g(t) + s(t) + h(t) + εₜ
Where:
- g(t): Piecewise linear or logistic growth trend
- s(t): Multiple seasonal components (daily, weekly, yearly) modeled using Fourier series
- h(t): Holiday and special event effects with custom windows
- εₜ: Error term (normally distributed noise)
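A toy numpy reconstruction makes the role of each term concrete (synthetic coefficients, not Prophet's fitted values; Prophet builds s(t) from Fourier series in the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(24 * 14, dtype=float)        # two weeks of hourly steps
g = 30000 + 2.0 * t                        # linear trend (one piece of a piecewise fit)

# Order-2 Fourier series with period 24 hours for the seasonal term s(t)
P = 24.0
s = sum(a * np.sin(2 * np.pi * n * t / P) + b * np.cos(2 * np.pi * n * t / P)
        for n, (a, b) in enumerate([(3000, 500), (800, -200)], start=1))

h = np.where((t // 24) == 7, 1500.0, 0.0)  # a one-day "holiday" bump h(t)
eps = rng.normal(0, 100, t.size)           # noise term
y = g + s + h + eps                        # y(t) = g(t) + s(t) + h(t) + eps
```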
Implementation Details:
- Dataset scope: Full training data (2002-2016: 131,496 hours)
- Test set: 2017-2018 (13,896 hours)
- Hyperparameter tuning: 70 trials with Optuna
Exogenous Features (Regressors):
- is_holiday: US federal holidays
- is_weekend: Weekend indicator
- is_night (00:00-06:00): Nighttime low-consumption periods
- is_business_hours (08:00-18:00 weekdays): Peak business activity
- is_peak_hours (07:00-09:00, 17:00-20:00): Morning and evening peaks
- is_summer, is_fall, is_winter, is_spring: Seasonal indicators
- is_monday, is_tuesday, ...: Day-of-week indicators
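The time-window regressors can be derived directly from the timestamps; this sketch assumes Prophet's "ds" column convention and treats the hour boundaries above as inclusive (an assumption where the README gives only ranges):

```python
import pandas as pd

def add_regressors(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    hour = df["ds"].dt.hour
    dow = df["ds"].dt.dayofweek  # Monday=0 ... Sunday=6
    df["is_night"] = ((hour >= 0) & (hour <= 6)).astype(int)
    df["is_business_hours"] = ((hour >= 8) & (hour <= 18) & (dow < 5)).astype(int)
    df["is_peak_hours"] = (((hour >= 7) & (hour <= 9)) |
                           ((hour >= 17) & (hour <= 20))).astype(int)
    df["is_weekend"] = (dow >= 5).astype(int)
    return df
```

Each column is then registered with `add_regressor` before fitting the Prophet model.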
N-HiTS is a state-of-the-art deep learning architecture specifically designed for long-horizon time series forecasting. It uses a hierarchical neural network structure to capture patterns at multiple time scales simultaneously.
Architecture Overview:
- Multi-rate signal processing: Uses multiple "stacks" of blocks, each operating at different temporal resolutions
- Hierarchical interpolation: Learns to decompose the forecast into different frequency components (high-frequency hourly patterns → low-frequency weekly/yearly trends)
- Backcast/Forecast structure: Each block produces both a backcast (explanation of past) and forecast (prediction of future)
- Unlike traditional methods that require manual specification of seasonal patterns, N-HiTS automatically learns the hierarchical structure of seasonality from raw data through its multi-scale architecture.
Implementation Details:
- Dataset split:
- Training: 2002-2015 (~121,000 hours)
- Validation: 2016 (8,760 hours) - used for early stopping and hyperparameter selection
- Test: 2017-2018 (13,896 hours)
- Hyperparameter tuning: 70 trials with Optuna
- Hardware acceleration: Trained on RTX 5090 GPU
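The chronological split can be sketched with plain pandas; the key point is that there is no shuffling, so future observations never leak into training:

```python
import pandas as pd

def split_by_year(df: pd.DataFrame):
    train = df[df.index < "2016-01-01"]   # 2002-2015
    val = df[(df.index >= "2016-01-01") & (df.index < "2017-01-01")]  # 2016
    test = df[df.index >= "2017-01-01"]   # 2017-2018
    return train, val, test
```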
All models were evaluated on the same test set (2017-2018, 13,896 hours) using three standard forecasting metrics:
- RMSE (Root Mean Squared Error): Square root of the mean squared error, in MW; penalizes large deviations more heavily. Lower is better.
- MAE (Mean Absolute Error): Average absolute error magnitude in MW, treating all errors equally. Lower is better.
- MAPE (Mean Absolute Percentage Error): Average percentage deviation from actual consumption; scale-independent. Lower is better.
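Written out explicitly, the three metrics are:

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root of the mean squared error: large deviations dominate
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mae(y_true, y_pred):
    # Mean absolute error: every error weighted equally
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def mape(y_true, y_pred):
    # Mean absolute percentage error: scale-independent
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)
```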
| Model | RMSE (MW) | MAE (MW) | MAPE (%) | Training Time |
|---|---|---|---|---|
| SARIMAX | 5,087 | 3,985 | 12.80% | ~18 hours |
| Prophet | 3,542 | 2,635 | 8.03% | ~12 minutes |
| N-HiTS | 2,320 | 1,568 | 4.90% | ~14 minutes |
Note: ARIMA results are excluded because that model was evaluated on a different test-set partition. Its test MAPE of 18.15% was the worst of all models, and it was also very slow to run.
SARIMAX captures the daily pattern (24-hour cycle) but misses weekly and yearly patterns, resulting in systematic under/over-prediction during certain periods.
Advantages: Handles one seasonal pattern with exogenous variables, more flexible than ARIMA
Limitations: Limited to single seasonality (daily only), extremely slow training (~18 hours), struggles with overlapping seasonal patterns
Prophet successfully models multiple seasonalities (daily, weekly, yearly), showing significant improvement over SARIMAX. However, forecasts still exhibit moderate deviations, particularly during extreme weather periods.
Advantages: Handles multiple seasonalities simultaneously, interpretable decomposition (trend/seasonal/holiday), fast training (~12 min)
Limitations: Additive model structure limits non-linear interactions, moderate errors during extreme events
N-HiTS achieves state-of-the-art performance, with forecasts that closely track actual values by capturing all hierarchical patterns and complex non-linear relationships.
Advantages: Automatically learns multi-scale patterns (hourly to yearly), captures non-linear interactions, designed for long-horizon forecasting, fast training (~14 min)
Limitations: Requires GPU for optimal performance, less interpretable than statistical models, needs larger datasets
- Python 3.8 or higher
- NVIDIA GPU with CUDA support (optional but recommended for N-HiTS)
- 8GB+ RAM
- MLflow for experiment tracking
- Clone the repository:

      git clone https://github.com/dream-19/Time_Series_PJME_hourly_consumption.git
      cd Time_Series_PJME_hourly_consumption

- Install dependencies:

      pip install -r requirements.txt

This project is licensed under the MIT License - see the LICENSE file for details.