This is the Master Thesis Repo for my MSc Data Science for Public Policy degree at Hertie School, Berlin

adityanarayan-rai/bicycle-predictions

Forecasting Cycling Volume Measurements: A Cross-city Comparative Analysis Using Machine Learning and Deep Learning Techniques

This repository complements the master's thesis submitted in partial fulfilment of the requirements of the Hertie School for the degree of Master of Science in Data Science for Public Policy.

NOTE: This repository is still a work in progress.

Abstract

Cycling is an increasingly vital component of sustainable urban mobility strategies, yet predicting bicycle volumes remains a complex task due to high variability across spatial, temporal, and environmental dimensions. This study develops and applies a harmonized machine learning and deep learning framework to forecast daily bicycle volumes across two distinct urban contexts: Berlin and New York City. Using long-term automated bicycle counts integrated with weather, land use, traffic, infrastructure, and temporal data, models including decision trees, ensemble methods, and neural networks were evaluated under two experimental designs: station-specific temporal holdout and leave-one-group-out cross-validation (LOGO-CV). Feature selection and hyperparameter tuning were systematically applied to optimize performance. Results demonstrated that machine learning and deep learning models significantly outperformed baseline time series approaches in both cities under temporal holdout, with Decision Tree models achieving the best overall results. However, under LOGO-CV, predictive accuracy declined notably, especially in New York, reflecting the greater spatial heterogeneity of cycling behaviour. Feature importance analysis revealed that infrastructure and spatial variables were dominant predictors in Berlin, whereas localized land cover characteristics played a larger role in New York. The findings highlight the challenges of spatial generalization and underline the importance of urban context in modelling bicycle volumes. Overall, this study contributes to the growing body of research on predictive analytics for active transportation and provides practical insights for enhancing bicycle volume modelling in diverse urban environments.
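To make the two experimental designs named in the abstract concrete, the following is a minimal sketch of how a station-specific temporal holdout and a leave-one-group-out split could be built. It is not the thesis code. The record fields (`station`, `date`, `count`), the function names, the 20% holdout share, and the toy data are all assumptions made for illustration, and dates are assumed to be ISO-formatted strings so that they sort chronologically.

```python
from collections import defaultdict


def temporal_holdout(records, test_fraction=0.2):
    """Hold out the most recent share of each station's days (hypothetical helper)."""
    by_station = defaultdict(list)
    for rec in records:
        by_station[rec["station"]].append(rec)

    train, test = [], []
    for rows in by_station.values():
        # ISO date strings sort chronologically (assumption about the data format).
        rows = sorted(rows, key=lambda r: r["date"])
        n_test = max(1, round(len(rows) * test_fraction))
        train.extend(rows[:-n_test])
        test.extend(rows[-n_test:])
    return train, test


def leave_one_group_out(records, group_key="station"):
    """Yield one fold per group: that group is the test set, all others train."""
    groups = sorted({rec[group_key] for rec in records})
    for held_out in groups:
        train = [r for r in records if r[group_key] != held_out]
        test = [r for r in records if r[group_key] == held_out]
        yield held_out, train, test


# Toy data: three made-up counting stations with ten daily counts each.
records = [
    {"station": s, "date": f"2024-01-{d:02d}", "count": 100 * i + d}
    for i, s in enumerate(["A", "B", "C"], start=1)
    for d in range(1, 11)
]

train, test = temporal_holdout(records, test_fraction=0.2)
folds = list(leave_one_group_out(records))

print(len(train), len(test))
for held_out, fold_train, fold_test in folds:
    print(held_out, len(fold_train), len(fold_test))
```

In the temporal holdout every station appears in both training and test data, so a model is only asked to extrapolate forward in time. In the leave-one-group-out design the held-out station is never seen during training, which is the harder spatial generalization setting under which the abstract reports lower accuracy.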

Poster

Thesis Poster
