This project aims to predict the math score of students based on various academic and demographic features. It showcases a full machine learning workflow, from data preprocessing and EDA to feature engineering and model evaluation.

| Column | Description |
|---|---|
| gender | Student's gender (male / female) |
| race_ethnicity | Student group based on race/ethnicity |
| parental_level_of_education | Parent's highest education level |
| lunch | Type of lunch (standard / free/reduced) |
| test_preparation_course | Completion of test preparation course |
| reading_score | Reading test score |
| writing_score | Writing test score |
| math_score | Target variable - score in mathematics |

| gender | race_ethnicity | parental_level_of_education | lunch | test_preparation_course | reading_score | writing_score | math_score |
|---|---|---|---|---|---|---|---|
| female | group B | bachelor's degree | standard | none | 72 | 74 | 72 |
| female | group C | some college | standard | completed | 90 | 88 | 69 |
| female | group B | master's degree | standard | none | 95 | 93 | 90 |
| male | group A | associate's degree | free/reduced | none | 57 | 44 | 47 |
| male | group C | some college | standard | none | 78 | 75 | 76 |
- 📊 Conduct Exploratory Data Analysis (EDA) to understand distributions and correlations.
- 🧹 Preprocess data with encoding, scaling, and transformation techniques.
- 🧠 Train and evaluate multiple regression models to predict `math_score`.
- 📈 Evaluate models using R², MAE, and RMSE metrics.
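
For reference, the three metrics named above can be computed directly from their definitions. The sketch below is a minimal NumPy version, not the notebook's own code; the function name `regression_metrics` and the predicted values are made up for illustration, while the true values are the `math_score` column of the sample rows.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R², MAE, and RMSE for two equal-length sequences (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mae = np.mean(np.abs(residuals))          # mean absolute error
    rmse = np.sqrt(np.mean(residuals ** 2))   # root mean squared error
    ss_res = np.sum(residuals ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                # coefficient of determination
    return {"r2": float(r2), "mae": float(mae), "rmse": float(rmse)}


# True math scores from the sample rows; the predictions are invented.
scores = regression_metrics([72, 69, 90, 47, 76], [70, 72, 85, 50, 76])
print(scores)
```

MAE and RMSE are in score points (lower is better), while R² is unitless and equals 1 for a perfect fit.
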
- Categorical Encoding: Label & One-Hot Encoding
- Feature Engineering: Derived features & normalization
- Multiple Regression Models: Tried various ML regressors
- Evaluation Metrics: R² Score, MAE, RMSE
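
The techniques listed above could be combined along the following lines. This is a sketch under assumptions, not the notebook's actual pipeline: the choice of `LinearRegression`, `Ridge`, and `RandomForestRegressor` is illustrative (the README does not name the regressors that were tried), and the five sample rows stand in for the full dataset, so the scores are training-set scores only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# The five sample rows from the table above; a real run would load the full data.
df = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "race_ethnicity": ["group B", "group C", "group B", "group A", "group C"],
    "parental_level_of_education": [
        "bachelor's degree", "some college", "master's degree",
        "associate's degree", "some college",
    ],
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "test_preparation_course": ["none", "completed", "none", "none", "none"],
    "reading_score": [72, 90, 95, 57, 78],
    "writing_score": [74, 88, 93, 44, 75],
    "math_score": [72, 69, 90, 47, 76],
})

X = df.drop(columns="math_score")
y = df["math_score"]

numeric = ["reading_score", "writing_score"]
categorical = [c for c in X.columns if c not in numeric]


def make_preprocessor():
    # One-hot encode categorical columns and standardize the numeric ones.
    return ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("num", StandardScaler(), numeric),
    ])


# Illustrative model choices (an assumption, not taken from the notebook).
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

results = {}
for name, model in models.items():
    pipe = Pipeline([("prep", make_preprocessor()), ("model", model)])
    pipe.fit(X, y)
    pred = pipe.predict(X)  # training-set predictions; too few rows for a split
    results[name] = {
        "r2": float(r2_score(y, pred)),
        "mae": float(mean_absolute_error(y, pred)),
        "rmse": float(np.sqrt(mean_squared_error(y, pred))),
    }

print(pd.DataFrame(results).T)
```

On the full dataset, the data would normally be split first (for example with `sklearn.model_selection.train_test_split`) so that the metrics are reported on rows the model has not seen.
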
- 📚 Students who completed the test preparation course generally scored higher.
- ✍️ Reading and writing scores are strongly correlated with math scores.
- 🎓 Parental education and lunch type influence student performance.
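
Findings like these typically come from group means and a correlation matrix. The sketch below shows the two pandas calls involved, using only the five sample rows from the table above; with so few rows the numbers are not expected to reproduce the findings, which are based on the full dataset.

```python
import pandas as pd

# Subset of columns from the five sample rows shown earlier.
df = pd.DataFrame({
    "test_preparation_course": ["none", "completed", "none", "none", "none"],
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "reading_score": [72, 90, 95, 57, 78],
    "writing_score": [74, 88, 93, 44, 75],
    "math_score": [72, 69, 90, 47, 76],
})

# Average math score per group for a categorical feature.
prep_means = df.groupby("test_preparation_course")["math_score"].mean()
lunch_means = df.groupby("lunch")["math_score"].mean()

# Pearson correlation between the three score columns.
score_corr = df[["reading_score", "writing_score", "math_score"]].corr()

print(prep_means, lunch_means, score_corr, sep="\n\n")
```

The same `groupby` call works for `parental_level_of_education` or any other categorical column.
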
- Python (Jupyter Notebook)
- Pandas, NumPy – Data manipulation
- Matplotlib, Seaborn – Visualizations
- Scikit-learn – Machine learning
- Flask – Web app deployment (optional)
- Clone the repository or download the notebook file.
- Install dependencies:
  `pip install -r requirements.txt`
- Run the notebook step-by-step in Jupyter.
- Optional: Start Flask app using:
  `python app.py`