Project: GottaGoFast

This is the project webpage for MSCS 5610 Data Mining (Spring 2019) Team GottaGoFast.

Context & Acknowledgement

For this project we are exploring the dataset from Formula 1 racing available from http://ergast.com/mrd/. The Ergast Developer API is an experimental web service which provides a historical record of motor racing data for non-commercial purposes. The API provides data for the Formula One series, from the beginning of the world championships in 1950. The data was originally gathered and published to the public domain by Chris Newell. Formula One (also Formula 1 or F1 and officially the FIA Formula One World Championship) is the highest class of single-seat auto racing that is sanctioned by the Fédération Internationale de l'Automobile (FIA). The FIA Formula One World Championship has been one of the premier forms of racing around the world since its inaugural season in 1950.

Content

This dataset covers the 1950 season through the 2018 season. It consists of tables describing race results, constructor results, constructors, race drivers, lap times, pit stops, qualification results, and more, as per the schema provided at http://ergast.com/schemas/f1db_schema.txt

Outcomes

As the project proceeds, this page will be updated. The repository contains the original data download, and additional folders cover the various stages of the data mining process as applied to this data set.

Exploratory Data Analysis

The exploratory data analysis Jupyter notebook can be seen under the "Reports and Reading" folder or at: https://github.com/dvermagithub/GottaGoFast/blob/master/Reports%20and%20Reading/F1%20Exploratory%20Analysis%20Full%20Data.ipynb

Midway Progress Report

The midway progress report is located under the "Reports and Reading" folder or at: https://github.com/dvermagithub/GottaGoFast/blob/master/Reports%20and%20Reading/Midway_Report_GottaGoFast-41409.docx

Classification and Clustering Analysis

Classification analysis, which included Linear Regression, Decision Tree (entropy), Decision Tree (Gini), Naive Bayes, MLP, and KNN, was performed on the data. Clustering analysis was also performed. The results are located as follows:

Classification using various algorithms (by Deepak) - https://github.com/dvermagithub/GottaGoFast/blob/master/04-%20Clustering%20and%20Classification%20Analysis/F1%20-%20Supervised%20Learning.ipynb

KNN Classification (by Lezeh) - https://github.com/dvermagithub/GottaGoFast/blob/master/04-%20Clustering%20and%20Classification%20Analysis/KNN%20Classification%20Model%20with%20results_full.ipynb

Clustering (by Zack) - https://github.com/dvermagithub/GottaGoFast/blob/master/04-%20Clustering%20and%20Classification%20Analysis/F1%20Clustering%20Analysis%20with%20results_full_4v3.ipynb
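
The two decision tree variants listed above differ only in the impurity measure used to score candidate splits. As a rough illustration, the following standalone sketch computes both measures for a toy split. It is not taken from the notebooks: the function names and the "podium" / "no_podium" labels are hypothetical and made up for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty sequence of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_score(impurity, left, right):
    """Size-weighted impurity of a two-way split; lower means purer children."""
    n = len(left) + len(right)
    return (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)


# Made-up toy labels (an assumption for illustration, not project data).
parent = ["podium"] * 4 + ["no_podium"] * 4
left = ["podium"] * 3 + ["no_podium"] * 1
right = ["podium"] * 1 + ["no_podium"] * 3

# Impurity reduction achieved by the split under each criterion.
entropy_gain = entropy(parent) - split_score(entropy, left, right)
gini_gain = gini(parent) - split_score(gini, left, right)
print(f"entropy reduction: {entropy_gain:.4f}")
print(f"gini reduction:    {gini_gain:.4f}")
```

Both criteria reward the same split here. In practice they often select similar trees, which is one reason to run both and compare results.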

Final Presentation & Report

The final presentation, which outlines the observations and conclusions, is included in the folder named "05- Final Presentation and Report" along with the final project report.

About

Exploratory and Machine Learning Analysis of F1
