The Student Performance Predictor is an end-to-end machine learning web application that predicts a student’s academic performance based on demographic and academic input features.
This project demonstrates:
- Full ML pipeline development
- Model training & evaluation
- Web deployment using Flask
- Data preprocessing & feature engineering
- Multiple regression models evaluated
- Final model: Linear Regression with hyperparameter tuning
- Interactive web interface (Flask)
- Real-time predictions
- Deployed on AWS Elastic Beanstalk
- Automated deployment via CI/CD pipeline
The dataset includes student attributes such as gender, race/ethnicity, parental education, lunch type, test preparation course, and reading and writing scores.
Exploratory data analysis:
- Identified feature relationships
- Checked distributions and correlations
Preprocessing steps:
- Encoding categorical variables
- Feature scaling
- Pipeline creation
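The preprocessing steps above could be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual code: the column names, the choice of one-hot encoding and standard scaling, and the three sample rows are all assumptions made for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset may label them differently.
CATEGORICAL = [
    "gender",
    "race_ethnicity",
    "parental_education",
    "lunch",
    "test_preparation_course",
]
NUMERIC = ["reading_score", "writing_score"]


def build_preprocessor():
    """Combine categorical encoding and numeric scaling into one transformer."""
    categorical = Pipeline([
        # Categories unseen during fitting are encoded as all zeros.
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    numeric = Pipeline([
        ("scaler", StandardScaler()),
    ])
    return ColumnTransformer([
        ("cat", categorical, CATEGORICAL),
        ("num", numeric, NUMERIC),
    ])


# Made-up rows for illustration only; they are not taken from the dataset.
sample = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "race_ethnicity": ["group A", "group B", "group C"],
    "parental_education": ["some college", "high school", "some college"],
    "lunch": ["standard", "free/reduced", "standard"],
    "test_preparation_course": ["none", "completed", "none"],
    "reading_score": [72.0, 65.0, 90.0],
    "writing_score": [74.0, 60.0, 88.0],
})

preprocessor = build_preprocessor()
features = preprocessor.fit_transform(sample)
print(features.shape)
```

Wrapping both branches in a single `ColumnTransformer` means the same fitted object can be saved once and reused at prediction time, so training and inference apply identical transformations.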
Models evaluated:
- Linear Regression
- Ridge
- Lasso
- Decision Tree
- Random Forest
- CatBoost
Best model: Linear Regression with hyperparameter tuning
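A comparison like the one above can be sketched as a loop that fits each candidate and scores it on a held-out split. The example below is an assumption-laden illustration: it uses synthetic data from `make_regression` rather than the student dataset, R² as the metric (the source does not name one), default-ish hyperparameters, and it leaves out CatBoost to stay within scikit-learn.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data; not the student performance dataset.
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate models; the hyperparameter values here are illustrative.
models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=1.0),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit every model and return its R^2 score on the test split."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


scores = evaluate(models, X_train, y_train, X_test, y_test)
best_name = max(scores, key=scores.get)

# Print models from best to worst test score.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
print("Best:", best_name)
```

Scoring on a held-out split rather than the training data keeps flexible models such as the decision tree from looking better than they generalize.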
Artifacts stored in: artifacts/
- model.pkl
- preprocessor.pkl
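At prediction time the two artifacts would typically be loaded and chained: the preprocessor transforms the raw input, then the model predicts from the result. The sketch below assumes the `.pkl` files were written with Python's standard `pickle` module (they could equally have been written with `joblib` or `dill`); the helper names `save_object`, `load_object`, and `predict_with_artifacts` are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(path, obj):
    """Pickle obj to path, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


def load_object(path):
    """Load a pickled object. Only unpickle files from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict_with_artifacts(rows, artifacts_dir="artifacts"):
    """Transform rows with the saved preprocessor, then predict with the model.

    Assumes both artifacts expose the scikit-learn style
    transform() and predict() methods.
    """
    artifacts_dir = Path(artifacts_dir)
    preprocessor = load_object(artifacts_dir / "preprocessor.pkl")
    model = load_object(artifacts_dir / "model.pkl")
    return model.predict(preprocessor.transform(rows))


# Round-trip demo using a stand-in object in a temporary directory,
# so the example does not depend on the real artifacts being present.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "artifacts" / "demo.pkl"
    save_object(demo_path, {"note": "stand-in object"})
    restored = load_object(demo_path)

print(restored)
```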
PROJECT/
- artifacts/
- notebook/
- src/
- templates/
- static/
- application.py
- requirements.txt
- setup.py
    git clone <repository-url>
    cd student-performance-predictor
    python -m venv venv
    venv\Scripts\activate
    pip install -r requirements.txt
    python application.py
Open: http://localhost:5000
The application is deployed on AWS Elastic Beanstalk through a CI/CD pipeline.
Input features:
- Gender
- Race/Ethnicity
- Parental education
- Lunch
- Test preparation
- Reading score
- Writing score
Output: predicted math score
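The mapping from these inputs to a predicted math score could be exposed through a Flask route along the following lines. This is a sketch under several assumptions: the `/predict` path, the form field names, and the JSON response are hypothetical (the real application may render an HTML template instead), and `placeholder_predict` is a stand-in that simply averages the two scores where the real application would call the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical form field names for the seven inputs listed above.
CATEGORICAL_FIELDS = [
    "gender",
    "race_ethnicity",
    "parental_education",
    "lunch",
    "test_preparation_course",
]
NUMERIC_FIELDS = ["reading_score", "writing_score"]


def placeholder_predict(features):
    """Stand-in for the trained model: averages the two input scores."""
    return (features["reading_score"] + features["writing_score"]) / 2


@app.route("/predict", methods=["POST"])
def predict():
    form = request.form
    # Reject requests that omit any expected field.
    missing = [
        name for name in CATEGORICAL_FIELDS + NUMERIC_FIELDS if name not in form
    ]
    if missing:
        return jsonify(error=f"missing fields: {missing}"), 400
    try:
        features = {name: form[name] for name in CATEGORICAL_FIELDS}
        features.update({name: float(form[name]) for name in NUMERIC_FIELDS})
    except ValueError:
        return jsonify(error="scores must be numeric"), 400
    return jsonify(predicted_math_score=placeholder_predict(features))
```

Validating the form before calling the model keeps malformed requests from surfacing as server errors inside the prediction code.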
Tech stack:
- Python
- Pandas, NumPy
- Scikit-learn
- CatBoost
- Flask
- HTML, CSS
- AWS
Possible future improvements:
- Advanced models
- Better UI
- Docker deployment