The Football Fantasy App is an end-to-end football analytics pipeline that combines data scraping, database engineering, machine learning, backend development, and frontend design. Historical UEFA Champions League (UCL) data (2017–2024) was scraped and cleaned, stored in a MySQL database, and served through a Spring Boot REST API. A Random Forest Classifier was trained with engineered features such as rolling 5-game form statistics to predict match outcomes. The React.js frontend, integrated with backend APIs, displays structured standings and results for the 2024–25 UCL season, making the app a full-stack ML-powered platform from raw data to insights.
- Data Acquisition
  - `scraping.ipynb` was used to scrape team standings and player stats from FBref using Selenium (Python) (2024–25).
  - Pulled structured match results with worldfootballR (R package) (2017–24).
- Data Cleaning
  - `clean_data.py` standardizes team names, fixes UTF-8 encoding, removes country prefixes, and ensures dataset consistency.
  - Cleaned outputs are saved back into `/data` for downstream use.
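The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of `clean_data.py`: the alias table, the prefix pattern (a 2-3 letter lowercase country code such as `eng `), and the function names are all assumptions made for the example, and the encoding repair covers only the common case of UTF-8 text that was mis-decoded as Latin-1.

```python
import re
import unicodedata

# Hypothetical alias table; the real mapping used by the project is not shown here.
TEAM_ALIASES = {
    "Paris S-G": "Paris Saint-Germain",
}

# Assumed prefix format: a 2-3 letter lowercase country code, then whitespace.
COUNTRY_PREFIX = re.compile(r"^[a-z]{2,3}\s+")


def fix_mojibake(text: str) -> str:
    """Repair UTF-8 text that was mis-decoded as Latin-1; otherwise return it unchanged."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Text was already correct, or is not Latin-1 mojibake.
        return text


def clean_team_name(raw: str) -> str:
    """Normalize one team name: encoding, whitespace, country prefix, aliases."""
    name = unicodedata.normalize("NFC", fix_mojibake(raw))
    name = " ".join(name.split())        # collapse repeated or non-breaking spaces
    name = COUNTRY_PREFIX.sub("", name)  # drop a leading country code, if any
    return TEAM_ALIASES.get(name, name)
```

Applying one function to every team column before writing the CSVs keeps names identical across the scraped and worldfootballR sources, which matters later when rows are joined or grouped by team.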
- Data Storage
  - Designed a MySQL schema in `import_data.sql` for storing UCL player statistics.
  - Used `LOAD DATA LOCAL INFILE` for bulk CSV import, with `NULLIF()` to handle missing values and `IGNORE` clauses to skip unnecessary columns.
  - Configured session modes (disabled strict mode) to ensure smooth ingestion while preserving all rows.
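To make the import step more concrete, here is a small Python helper that assembles a `LOAD DATA LOCAL INFILE` statement of the kind described above. It is only a sketch: the table name, column names, and file path in the usage line are hypothetical, and it assumes a comma-separated file with one header row. Unwanted columns are read into a throwaway user variable (`@dummy`) and empty strings are converted with `NULLIF()`; whether `import_data.sql` does it exactly this way is not confirmed by this README.

```python
def build_load_statement(csv_path, table, columns, skipped=(), nullable=()):
    """Return a LOAD DATA LOCAL INFILE statement for a CSV with one header row.

    columns  -- CSV column names, in file order
    skipped  -- columns to discard by reading them into @dummy
    nullable -- columns whose empty strings should be stored as NULL
    """
    targets, assignments = [], []
    for col in columns:
        if col in skipped:
            targets.append("@dummy")
        elif col in nullable:
            # Read into a user variable first, then map '' to NULL in SET.
            targets.append(f"@{col}")
            assignments.append(f"{col} = NULLIF(@{col}, '')")
        else:
            targets.append(col)

    sql = (
        f"LOAD DATA LOCAL INFILE '{csv_path}'\n"
        f"INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
        "LINES TERMINATED BY '\\n'\n"
        "IGNORE 1 LINES\n"
        f"({', '.join(targets)})"
    )
    if assignments:
        sql += "\nSET " + ", ".join(assignments)
    return sql + ";"


# Hypothetical usage: file, table, and column names are illustrative only.
statement = build_load_statement(
    "data/players.csv", "player_stats",
    ["rk", "player", "goals"], skipped={"rk"}, nullable={"goals"},
)
```

The helper does no quoting or escaping of identifiers, so it is suitable only for trusted, hand-written column lists. Strict mode can be relaxed for the session beforehand with `SET SESSION sql_mode = '';`, which matches the "disabled strict mode" step described above.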
- Feature Engineering & ML
  - `prediction.ipynb` encodes categorical features, builds rolling 5-game form stats, and trains a Random Forest Classifier.
  - Accuracy improved from ~0.48 (baseline) to ~0.55–0.60 (with form features).
  - Training data: 2017–24 seasons.
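The rolling 5-game form features can be illustrated with a short standard-library sketch. The record layout (`team`, `goals_for`, `goals_against`), the output column names, and the choice to average only a team's previous games are assumptions made for this example; the notebook's actual columns and implementation are not shown here.

```python
from collections import defaultdict, deque


def add_rolling_form(matches, window=5):
    """Attach rolling form averages to each match record.

    matches -- list of dicts sorted by date, each with
               "team", "goals_for", and "goals_against"
    Returns new dicts with "form_gf" and "form_ga": the team's average goals
    scored and conceded over its previous `window` games, or None until the
    team has played that many games.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    rows = []
    for match in matches:
        past = history[match["team"]]
        row = dict(match)
        if len(past) == window:
            row["form_gf"] = sum(gf for gf, _ in past) / window
            row["form_ga"] = sum(ga for _, ga in past) / window
        else:
            row["form_gf"] = None
            row["form_ga"] = None
        rows.append(row)
        # Record the current game only after computing its features,
        # so a match never contributes to its own form values.
        past.append((match["goals_for"], match["goals_against"]))
    return rows


games = [
    {"team": "Arsenal", "goals_for": g, "goals_against": 1}
    for g in (1, 2, 3, 4, 5, 6)
]
featured = add_rolling_form(games)
print(featured[5]["form_gf"])  # 3.0, the mean of the first five games
```

Excluding the current match from its own window matters for the classifier: if the result being predicted leaks into the features, the reported accuracy would be inflated.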
- Backend (Spring Boot + MySQL)
  - `src` exposes REST APIs to serve match results and model predictions.
  - APIs interact directly with the MySQL database.
- Frontend (React.js)
  - `Frontend/` contains a responsive React app that displays UCL 2024–25 standings and results.
  - Integrated with backend APIs, fixed ranking issues, and optimized for responsiveness.