Student Performance Prediction

End to End Machine Learning Project

Introduction to the Data:

The goal is to predict the math score of students (regression analysis). The dataset consists of 8 columns and 1000 rows.

It has 3 numerical features: ['math_score', 'reading_score', 'writing_score']

It has 5 categorical features: ['gender', 'race_ethnicity', 'parental_level_of_education', 'lunch', 'test_preparation_course']

There are 7 independent variables:

gender: sex of the student -> male/female
race_ethnicity: ethnicity group of the student -> (Group A, B, C, D, E)
parental_level_of_education: parents' highest level of education -> (bachelor's degree, some college, master's degree, associate's degree, high school)
lunch: type of lunch the student had before the test (standard or free/reduced)
test_preparation_course: whether the test preparation course was completed before the test
reading_score: score obtained by the student in reading
writing_score: score obtained by the student in writing

Target Variable:

math_score: score obtained by the student in math
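
As a minimal sketch of how the columns listed above divide into independent variables and target, the snippet below uses pandas. The helper name `split_features_target` and the two sample rows are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Column names as listed in this README.
NUMERIC_FEATURES = ["reading_score", "writing_score"]
CATEGORICAL_FEATURES = [
    "gender",
    "race_ethnicity",
    "parental_level_of_education",
    "lunch",
    "test_preparation_course",
]
TARGET = "math_score"


def split_features_target(df):
    """Return (X, y): the 7 independent variables and the math_score target."""
    X = df[CATEGORICAL_FEATURES + NUMERIC_FEATURES]
    y = df[TARGET]
    return X, y


# Two hypothetical rows, only to show the expected layout of the data.
sample = pd.DataFrame({
    "gender": ["female", "male"],
    "race_ethnicity": ["group B", "group C"],
    "parental_level_of_education": ["bachelor's degree", "some college"],
    "lunch": ["standard", "free/reduced"],
    "test_preparation_course": ["none", "completed"],
    "math_score": [72, 47],
    "reading_score": [72, 57],
    "writing_score": [74, 44],
})

X, y = split_features_target(sample)
print(X.shape)      # 2 rows, 7 feature columns
print(y.tolist())   # the math scores of the two sample rows
```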

Data Source Link: https://www.kaggle.com/datasets/spscientist/students-performance-in-exams?datasetId=74977

Screenshot of UI

Approach for the project

Data Ingestion:
- In the data ingestion phase, the data is first read from a CSV file.
- The data is then split into training and testing sets, and each set is saved as a CSV file.
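
The ingestion step could look roughly like the sketch below. The function name `ingest`, the output file names, the 80/20 split ratio, and the random seed are all assumptions made for illustration; the project's actual values may differ. The demo writes a tiny made-up CSV to a temporary directory so that it runs on its own.

```python
import os
import tempfile

import pandas as pd
from sklearn.model_selection import train_test_split


def ingest(raw_csv_path, out_dir, test_size=0.2, random_state=42):
    """Read the raw CSV, split it, and save train/test CSV files."""
    df = pd.read_csv(raw_csv_path)
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state
    )
    os.makedirs(out_dir, exist_ok=True)
    train_path = os.path.join(out_dir, "train.csv")
    test_path = os.path.join(out_dir, "test.csv")
    train_df.to_csv(train_path, index=False)
    test_df.to_csv(test_path, index=False)
    return train_path, test_path


# Demo on a made-up 10-row file (not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "raw.csv")
    pd.DataFrame(
        {"reading_score": range(10), "math_score": range(10)}
    ).to_csv(raw_path, index=False)

    train_path, test_path = ingest(raw_path, os.path.join(tmp, "artifacts"))
    n_train = len(pd.read_csv(train_path))
    n_test = len(pd.read_csv(test_path))

print(n_train, n_test)
```

Fixing `random_state` keeps the split reproducible between runs, which makes later model comparisons easier to interpret.
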
Data Transformation:
- In this phase, a ColumnTransformer pipeline is created.
- For numeric variables, SimpleImputer is first applied with the median strategy, then standard scaling is performed on the numeric data.
- For categorical variables, SimpleImputer is applied with the most-frequent strategy, then OneHotEncoder is applied, and finally the data is scaled with StandardScaler.
- This preprocessor is saved as a pickle file.
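
A preprocessor following the steps above might be assembled as in this sketch. The function name `build_preprocessor`, the `handle_unknown="ignore"` setting, and `with_mean=False` on the categorical scaler are assumptions: one-hot output is sparse by default, and scikit-learn's StandardScaler cannot center sparse input. The toy frame is made up and uses only a subset of the columns.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Median-impute and scale numerics; impute, one-hot encode and scale categoricals."""
    num_pipeline = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    cat_pipeline = Pipeline([
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("one_hot", OneHotEncoder(handle_unknown="ignore")),
        # Assumption: no centering, because the one-hot output is sparse.
        ("scaler", StandardScaler(with_mean=False)),
    ])
    return ColumnTransformer([
        ("num", num_pipeline, numeric_cols),
        ("cat", cat_pipeline, categorical_cols),
    ])


# Made-up rows with a few missing values to exercise the imputers.
toy = pd.DataFrame({
    "reading_score": [72.0, np.nan, 90.0, 57.0],
    "writing_score": [74.0, 88.0, np.nan, 44.0],
    "gender": ["female", "male", np.nan, "male"],
    "lunch": ["standard", "free/reduced", "standard", "standard"],
})

preprocessor = build_preprocessor(
    ["reading_score", "writing_score"], ["gender", "lunch"]
)
features = preprocessor.fit_transform(toy)
print(features.shape)

# Round-trip through pickle, as the saved preprocessor would be reloaded later.
restored = pickle.loads(pickle.dumps(preprocessor))
print(restored.transform(toy).shape)
```
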
Model Training:
- In this phase, the base models are tested. The best model found was Linear Regression.
- Hyperparameter tuning is then performed, and Linear Regression remains the best model.
- This model is saved as a pickle file.
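
One way to compare tuned candidates and keep the best one is sketched below. The candidate list, the parameter grids, the use of GridSearchCV, and R² as the selection metric are assumptions for illustration; this README does not state which models or grids the project uses. The data is synthetic.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Hypothetical candidates and grids, chosen only for this example.
CANDIDATES = {
    "LinearRegression": (LinearRegression(), {"fit_intercept": [True, False]}),
    "Ridge": (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
    "DecisionTree": (
        DecisionTreeRegressor(random_state=0),
        {"max_depth": [3, 5, None]},
    ),
}


def select_best_model(X_train, y_train, X_test, y_test):
    """Tune each candidate with cross-validation, then rank by test-set R^2."""
    scores, fitted = {}, {}
    for name, (model, grid) in CANDIDATES.items():
        search = GridSearchCV(model, grid, cv=3, scoring="r2")
        search.fit(X_train, y_train)
        fitted[name] = search.best_estimator_
        scores[name] = r2_score(y_test, fitted[name].predict(X_test))
    best_name = max(scores, key=scores.get)
    return best_name, fitted[best_name], scores


# Synthetic stand-in data: a noisy linear relation between two scores and a target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 2))
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(0, 5, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

best_name, best_model, scores = select_best_model(X_train, y_train, X_test, y_test)
print(best_name, round(scores[best_name], 3))

# Serialize the winner, as the training step saves its model as a pickle file.
model_bytes = pickle.dumps(best_model)
```
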
Prediction Pipeline:
- This pipeline converts the given data into a DataFrame and provides functions that load the pickle files and predict the final result in Python.
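
The prediction pipeline could be sketched as follows. The names `StudentRecord`, `load_object`, and `predict_math_score`, as well as the pickle file names, are hypothetical. To keep the example runnable on its own, it first trains stand-in artifacts on four made-up rows; these do not represent the project's trained model.

```python
import os
import pickle
import tempfile
from dataclasses import asdict, dataclass

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder, StandardScaler

CATEGORICAL = ["gender", "race_ethnicity", "parental_level_of_education",
               "lunch", "test_preparation_course"]
NUMERIC = ["reading_score", "writing_score"]


@dataclass
class StudentRecord:
    gender: str
    race_ethnicity: str
    parental_level_of_education: str
    lunch: str
    test_preparation_course: str
    reading_score: float
    writing_score: float

    def to_frame(self):
        """Convert the record into a one-row DataFrame."""
        return pd.DataFrame([asdict(self)])


def load_object(path):
    """Load a pickled object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_math_score(record, preprocessor_path, model_path):
    """Load both artifacts, transform the record, and return the prediction."""
    preprocessor = load_object(preprocessor_path)
    model = load_object(model_path)
    features = preprocessor.transform(record.to_frame())
    return float(model.predict(features)[0])


# Stand-in artifacts trained on made-up rows, so the sketch is self-contained.
train = pd.DataFrame({
    "gender": ["female", "male", "female", "male"],
    "race_ethnicity": ["group A", "group B", "group C", "group D"],
    "parental_level_of_education": ["high school", "some college",
                                    "master's degree", "high school"],
    "lunch": ["standard", "free/reduced", "standard", "standard"],
    "test_preparation_course": ["none", "completed", "none", "completed"],
    "reading_score": [60.0, 45.0, 90.0, 70.0],
    "writing_score": [62.0, 40.0, 92.0, 68.0],
})
math_scores = [58.0, 42.0, 88.0, 73.0]

pre = ColumnTransformer([
    ("num", StandardScaler(), NUMERIC),
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
])
model = LinearRegression().fit(pre.fit_transform(train), math_scores)

record = StudentRecord("female", "group B", "bachelor's degree",
                       "standard", "none", 72.0, 74.0)

with tempfile.TemporaryDirectory() as tmp:
    pre_path = os.path.join(tmp, "preprocessor.pkl")
    model_path = os.path.join(tmp, "model.pkl")
    for path, obj in ((pre_path, pre), (model_path, model)):
        with open(path, "wb") as f:
            pickle.dump(obj, f)
    predicted = predict_math_score(record, pre_path, model_path)

print(predicted)
```
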
Flask App Creation:
- A Flask app with a user interface is created to predict a student's math score inside a web application.
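
A stripped-down version of such an app is sketched below. It exposes a JSON endpoint instead of the HTML user interface, and the `/predict` route, the field validation, and the placeholder `predict_math_score` function (which simply averages the two scores) are assumptions made so the example runs without the trained artifacts.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

REQUIRED_FIELDS = [
    "gender",
    "race_ethnicity",
    "parental_level_of_education",
    "lunch",
    "test_preparation_course",
    "reading_score",
    "writing_score",
]


def predict_math_score(payload):
    """Placeholder for the real prediction pipeline (hypothetical).

    It averages the reading and writing scores so the route has something
    to return; a real app would call the saved preprocessor and model.
    """
    return (float(payload["reading_score"]) + float(payload["writing_score"])) / 2


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return jsonify({"error": "missing fields", "fields": missing}), 400
    return jsonify({"math_score": predict_math_score(payload)})
```

The route can be exercised without starting a server through Flask's built-in test client, for example `app.test_client().post("/predict", json={...})`.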

Exploratory Data Analysis Notebook

Link : EDA Notebook

Model Training Approach Notebook

Link : Model Training Notebook
