Bachelor Thesis Project by Malte Matthey
A modular research framework developed to systematically investigate the practical feasibility of Deep Reinforcement Learning (DRL) for Home Energy Management Systems (HEMS). This project evaluates multiple DRL algorithms (DQN, SAC, TD3, PPO) against classical rule-based controllers using synthesized photovoltaic (PV) data, imperfect weather forecasts, and dynamic day-ahead electricity prices.
The full text includes the mathematical MDP formulation, the custom environment architecture, and a critical evaluation of the agents.
📥 Download the Bachelor Thesis (PDF)
To cleanly separate the data lifecycle from the training logic, the project architecture is divided into three distinct modular repositories. As illustrated in the deployment architecture below, the data flows sequentially from local preprocessing through a central data management service to the RL framework executed on an HPC cluster. The source code for each module can be found in the list below.
1. Data Preprocessing (Data Engineering)
The entry point of the pipeline, responsible for parsing and cleaning heterogeneous, file-based raw data on household consumption and PV generation. It features a standardized, object-oriented workflow for parsing varying data formats (e.g., HDF5, CSV), interpolating missing values, correcting daylight saving time artifacts, and enforcing strict JSON schema-based validation prior to downstream ingestion.
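The standardized workflow described above could be sketched roughly as follows. This is an illustrative assumption, not the thesis code: the class names (`RawParser`, `CsvParser`), the interpolation gap limit, and the column layout are all hypothetical, but the structure shows how a shared cleaning step can be reused across format-specific parsers.

```python
# Hypothetical sketch of the object-oriented parsing workflow.
# Class names and parameters are illustrative, not from the thesis.
from abc import ABC, abstractmethod

import pandas as pd


class RawParser(ABC):
    """Common interface so CSV/HDF5 sources share one cleaning pipeline."""

    @abstractmethod
    def read(self, path: str) -> pd.DataFrame:
        """Format-specific loading, implemented per subclass."""

    def clean(self, df: pd.DataFrame) -> pd.DataFrame:
        """Shared cleaning: normalize timestamps, then fill short gaps."""
        df = df.copy()
        # Converting to UTC removes duplicated/missing local hours around
        # daylight saving time transitions.
        df.index = pd.to_datetime(df.index, utc=True)
        df = df.sort_index()
        # Time-weighted interpolation for short gaps only (here: <= 4 steps,
        # an assumed limit) so long outages are not silently invented.
        return df.interpolate(method="time", limit=4)


class CsvParser(RawParser):
    def read(self, path: str) -> pd.DataFrame:
        return pd.read_csv(path, index_col=0)
```

An HDF5 parser would subclass `RawParser` the same way and override only `read`, keeping the interpolation and timezone handling identical across sources.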
2. Data Management Service (Backend Engineering)
A containerized backend service that orchestrates the entire data unification and synthesis pipeline, acting as the single source of truth for the experimental datasets. It is built with FastAPI on top of PostgreSQL, extended with TimescaleDB for high-performance time-series management. The service handles automated API fetching with Backblaze B2 caching, multi-stage PV generation simulation (using pvlib), synthetic imperfect weather forecast generation via quantile regression (lightgbm), and dynamic LOCF (Last-Observation-Carried-Forward) SQL query building.
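To make the LOCF query building concrete, here is a minimal sketch of how such a query could be assembled for TimescaleDB. `time_bucket_gapfill()` and `locf()` are real TimescaleDB hyperfunctions; the table name, column names, and the exact aggregation used (`avg`) are assumptions for illustration and not taken from the thesis code.

```python
def build_locf_query(table: str, columns: list[str],
                     bucket: str = "1 hour") -> str:
    """Assemble a TimescaleDB gap-filling query in which each requested
    column carries its last observation forward across empty buckets.

    All identifiers here are illustrative placeholders; :start and :end
    are bound parameters supplied at execution time.
    """
    # locf() fills buckets that have no rows with the previous value,
    # which matches the LOCF semantics described above.
    select_cols = ",\n       ".join(
        f"locf(avg({col})) AS {col}" for col in columns
    )
    return (
        f"SELECT time_bucket_gapfill('{bucket}', ts) AS bucket,\n"
        f"       {select_cols}\n"
        f"FROM {table}\n"
        f"WHERE ts BETWEEN :start AND :end\n"
        f"GROUP BY bucket\n"
        f"ORDER BY bucket"
    )
```

Building the statement dynamically lets one endpoint serve arbitrary column subsets and bucket widths while the gap-filling semantics stay in a single place.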
3. RL Agent Framework (Machine Learning & MLOps)
The core research environment used for training, hyperparameter tuning, and evaluating the agents. It includes the implementation of custom Gymnasium environments with efficient in-memory data sourcing and vectorization. The framework provides a declarative configuration system, time-aware k-fold cross-validation, automated HPO sweeps with a hybrid early-stopping strategy, and seamless integration with a High-Performance Computing (HPC) cluster via Slurm.
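The time-aware k-fold cross-validation mentioned above can be sketched as forward chaining, similar in spirit to scikit-learn's `TimeSeriesSplit`: each fold validates on a window that lies strictly after its training data, so the agent is never evaluated on time steps it has already seen. The exact splitting scheme used in the thesis is not reproduced here; this is a minimal illustrative version.

```python
def time_aware_kfold(n_samples: int, k: int):
    """Yield (train, validation) index lists via forward chaining.

    Illustrative sketch, not the thesis implementation: the data is cut
    into k+1 equal windows, and fold i trains on the first i windows and
    validates on window i+1, so validation always lies in the future.
    """
    fold = n_samples // (k + 1)
    for i in range(1, k + 1):
        train = list(range(0, i * fold))
        val = list(range(i * fold, min((i + 1) * fold, n_samples)))
        yield train, val
```

Unlike shuffled k-fold, this scheme respects temporal ordering, which matters for HEMS data where consumption, PV output, and prices are strongly autocorrelated.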
Technology Stack
- Machine Learning: Stable Baselines3, sb3-contrib, Gymnasium, scikit-learn
- Data Processing & Simulation: Pandas, NumPy, pvlib, lightgbm, Apache Parquet
- Backend & Database: FastAPI, PostgreSQL, TimescaleDB, SQLAlchemy Core, asyncpg
- MLOps & Infrastructure: Weights & Biases (wandb), Docker, Docker Compose, GitLab CI/CD, Slurm Workload Manager, Backblaze B2

