See the report in cse258_report_final.pdf for a detailed understanding of this project.
COMPLETED
- Test the hot hand hypothesis: does a player’s shooting percentage improve as they make more and more consecutive shots? Generated plots for the 17 players with the best hot hands.
- Test the clutch gene hypothesis: are some players significantly better than others at scoring, in terms of points and efficiency, in end-of-game situations?
- Deep learning model testing with various features
- Standard dataset visualisation
- Shot prediction task - Will the player make his shot, based on some features? (Deep learning model + logistic regression model)
- Shot prediction task - Further evaluation using various metrics, with explanations of which metrics were better
- Recommendation task - Who is the best defender against a certain player? Choose a player, check that player's stats against each defender, and suggest the best defenders based on position.
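
As a rough illustration of the hot hand test listed above, the sketch below computes a player's make rate conditioned on how many consecutive shots he had just made. It is not the code used for the report: the function name `make_rate_by_streak`, the `max_streak` cap, and the example sequence are all assumptions made for illustration, and the input is assumed to be one player's shots from a single game in chronological order.

```python
from collections import defaultdict


def make_rate_by_streak(shots, max_streak=3):
    """Return {streak_length: (makes, attempts, make_rate)}.

    `shots` lists one player's attempts in order, 1 for a make and 0 for a miss.
    Streaks longer than `max_streak` are pooled into the `max_streak` bucket.
    """
    makes = defaultdict(int)
    attempts = defaultdict(int)
    streak = 0  # consecutive makes immediately before the current shot
    for made in shots:
        key = min(streak, max_streak)
        attempts[key] += 1
        makes[key] += made
        streak = streak + 1 if made else 0
    return {k: (makes[k], attempts[k], makes[k] / attempts[k]) for k in sorted(attempts)}


# Hypothetical shot sequence, not taken from the dataset.
example = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
rates = make_rate_by_streak(example)
for streak, (made, tried, rate) in rates.items():
    print(f"after {streak} straight makes: {made}/{tried} = {rate:.2f}")
```

A make rate that rises with the streak length would be consistent with a hot hand, although with so few attempts per bucket any real analysis would also need to account for sample size.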
DATA
- shot_logs_assignment.csv contains the latest dataset, with the data used for all tasks.
- We started by creating a folder and downloading the two source datasets into it.
- Players_1.csv - All personal information about each player: name, age, college, etc.
- Shot_logs.csv - Main dataset, containing every shot attempted over the course of the 2014-15 NBA season (last two months missing).
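
One way to combine the player information with the shot logs is a join on the player's name. The sketch below uses small in-memory stand-ins for the two files; the column names (`player_name`, `age`, `college`, `shot_dist`, `shot_made`) and the rows are hypothetical and are not taken from the actual CSVs.

```python
import csv
import io

# Tiny in-memory stand-ins for the two files; column names and rows are hypothetical.
players_csv = io.StringIO(
    "player_name,age,college\n"
    "player a,27,College X\n"
    "player b,31,College Y\n"
)
shots_csv = io.StringIO(
    "player_name,shot_dist,shot_made\n"
    "player a,4.5,1\n"
    "player b,23.0,0\n"
    "player c,15.2,1\n"
)

# Index player records by name so each shot can look up its shooter.
players = {row["player_name"]: row for row in csv.DictReader(players_csv)}

merged = []
unmatched = set()
for shot in csv.DictReader(shots_csv):
    info = players.get(shot["player_name"])
    if info is None:
        unmatched.add(shot["player_name"])  # shooter has no personal-info row
        continue
    merged.append({**shot, "age": int(info["age"]), "college": info["college"]})

print(len(merged), "shots matched; unmatched players:", sorted(unmatched))
```

Tracking the unmatched names makes it easy to spot players who appear in one file but not the other before any features are built.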
USEFUL STATISTICS REGARDING THE DATA
- Only 281 offensive players in total
- Some major players have no corresponding shot logs. Games run from November through March.
REPORT
- Initial Hypotheses: Assumptions about the NBA regarding players that we plan to test
- Literature Review: Need to find papers discussing this (unlikely), or other more professional blog posts/Kagglers
- Data collection and cleaning: Explanation of the two datasets, and of what we do to combine them and make certain columns more easily accessible as features
- Plots/Analysis of the data
- Models made using the data
- Evaluations of the model using various metrics
- Conclusions of hypotheses
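
For the model evaluation item above, a minimal sketch of some common binary classification metrics for the shot prediction task follows. The metrics actually used are described in the report; accuracy, precision, recall, and F1 are chosen here only as typical examples, and the labels and predictions are made up.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels (1 = shot made)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels and model predictions, not results from the report.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can be misleading when makes and misses are imbalanced, which is one reason to compare several metrics side by side.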