matthew-e-thomas/Spark_ML_DS5559

Spark_ML_DS5559

Machine learning project with PySpark for UVa Data Science Big Data Class

We are using the Lending Club dataset on Kaggle: https://www.kaggle.com/wordsforthewise/lending-club

Our goal is to use PySpark to clean the data, then use its ML library for feature selection and hyperparameter tuning, and finally to compare several machine learning algorithms at predicting whether a borrower will default on their loan.
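
As a rough illustration of that workflow, here is a minimal sketch, not the project's actual code. The column names (`loan_status`, `loan_amnt`, `int_rate`, `annual_inc`, `dti`), the choice of which statuses count as a default, the two candidate models, the parameter grids, and the helper names are all assumptions made for the example. Feature selection is left out to keep it short.

```python
# Assumed label definition and feature columns; adjust to the real schema.
DEFAULT_STATUSES = {"Charged Off", "Default"}
PAID_STATUSES = {"Fully Paid"}
NUMERIC_FEATURES = ["loan_amnt", "int_rate", "annual_inc", "dti"]


def prepare(df):
    """Keep resolved loans, add a binary label, and drop rows with missing features."""
    # PySpark imports are deferred so this module can be imported without Spark.
    from pyspark.sql import functions as F

    resolved = sorted(DEFAULT_STATUSES | PAID_STATUSES)
    is_default = F.col("loan_status").isin(sorted(DEFAULT_STATUSES))
    return (
        df.filter(F.col("loan_status").isin(resolved))
        .withColumn("label", is_default.cast("double"))
        .select("label", *[F.col(c).cast("double").alias(c) for c in NUMERIC_FEATURES])
        .dropna()
    )


def build_cross_validator(estimator, param_grid, folds=3):
    """Wrap an estimator in a feature-assembly pipeline tuned by k-fold cross-validation."""
    from pyspark.ml import Pipeline
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.tuning import CrossValidator

    assembler = VectorAssembler(inputCols=NUMERIC_FEATURES, outputCol="features")
    pipeline = Pipeline(stages=[assembler, estimator])
    evaluator = BinaryClassificationEvaluator(labelCol="label", metricName="areaUnderROC")
    return CrossValidator(
        estimator=pipeline,
        estimatorParamMaps=param_grid,
        evaluator=evaluator,
        numFolds=folds,
    )


def compare_models(train, test):
    """Tune each candidate on `train` and return its area under ROC on `test`."""
    from pyspark.ml.classification import LogisticRegression, RandomForestClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.tuning import ParamGridBuilder

    lr = LogisticRegression(labelCol="label", featuresCol="features")
    rf = RandomForestClassifier(labelCol="label", featuresCol="features")
    candidates = {
        "logistic_regression": (lr, ParamGridBuilder().addGrid(lr.regParam, [0.01, 0.1]).build()),
        "random_forest": (rf, ParamGridBuilder().addGrid(rf.numTrees, [20, 50]).build()),
    }
    evaluator = BinaryClassificationEvaluator(labelCol="label", metricName="areaUnderROC")
    scores = {}
    for name, (estimator, grid) in candidates.items():
        model = build_cross_validator(estimator, grid).fit(train)
        scores[name] = evaluator.evaluate(model.transform(test))
    return scores


def main(csv_path):
    """Run the sketch end to end; `csv_path` is a local copy of the Kaggle CSV."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("lending-club-sketch").getOrCreate()
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)
    train, test = prepare(raw).randomSplit([0.8, 0.2], seed=42)
    print(compare_models(train, test))
    spark.stop()
```

Calling `main` with the path to a downloaded copy of the dataset would print one area-under-ROC score per candidate model. Loans that are still in progress are dropped in this sketch because their final outcome is unknown; whether that matches the project's own label definition is an assumption.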

Contributors:

Max McGaw https://github.com/mmcgaw182

Will Carruthers https://github.com/wcarruthers

Liam Mulcahy https://github.com/liamtmul

Matt Thomas
