Project developed for the Information Retrieval 24-25 course at the University of Pisa.
The project is a simple partial reimplementation of XGBoost in Python using the NumPy library. The main goal was to better understand the reasoning behind the original work by translating its ideas without focusing too much on low-level optimisation. Where possible, some efficiency optimisations have been considered, mainly for the gradient-boosted trees, which are stored through a succinct representation.
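As a rough illustration of what an array-based tree layout looks like (this is a hypothetical sketch, not the project's actual code), a binary decision tree can be stored in flat NumPy arrays, with children located implicitly at indices `2i+1` and `2i+2`:

```python
import numpy as np

# Hypothetical sketch of a flat, array-based tree layout (not the
# project's actual representation). Node i's children live implicitly
# at 2*i+1 (left) and 2*i+2 (right); feature == -1 marks a leaf.
feature = np.array([0, 1, -1, -1, -1])          # split feature per node
threshold = np.array([0.5, 0.3, 0.0, 0.0, 0.0])  # split threshold per node
value = np.array([0.0, 0.0, -1.0, 0.2, 0.9])     # output stored at leaves

def predict_one(x):
    """Route a single sample down the tree and return the leaf value."""
    i = 0
    while feature[i] != -1:
        i = 2 * i + 1 if x[feature[i]] <= threshold[i] else 2 * i + 2
    return value[i]

print(predict_one(np.array([0.2, 0.1])))  # goes left, left -> leaf value 0.2
```

Layouts like this avoid Python objects and pointers entirely, which is what makes vectorised NumPy traversal and compact storage possible.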
A more in-depth description is available in the project presentation, project-presentation.pdf.
To verify the correctness of the implementation, the project has been tested on the Criteo and Higgs datasets, comparing the obtained results with those of the original library. These tests are available in the notebook test.ipynb.
