This repo archives the code I wrote for a machine learning competition that was part of a seminar at the University of Siegen.
The seminar was divided into four milestones: Data Exploration, Model Building, Data Engineering, and Exploitation.
- The first milestone (Data Exploration) was done entirely with plain Python files in this repo. They can be found in src/data_exploration/.
- For the second milestone (Model Building), I started out with some simple models in src/model_building/, but then switched to interactive Jupyter Notebooks, which can be found in notebooks/.
- The third and fourth milestones (Data Engineering and Exploitation) were done in notebooks as well, but most of the work for these milestones was done in Windows Subsystem for Linux (WSL), for which I have a separate repo that you can find HERE.
Each folder has its own README, where you can find more details on the folder's contents and the results obtained.
To simplify the data loading and preprocessing tasks that I use frequently, I created a small library, which the newer notebooks need in order to run. You can find more information on it in its own repo.