Skip to content

An intelligent machine learning application that predicts student exam scores and provides personalized recommendations for academic improvement using advanced AI and data analytics.

Notifications You must be signed in to change notification settings

Ayushkumar418/Student_Performance_Predictor

Repository files navigation

πŸŽ“ Student Performance Predictor

Python 3.12 Streamlit ML License

An intelligent machine learning application that predicts student exam scores and provides personalized recommendations for academic improvement using advanced AI and data analytics.

πŸš€ Quick Start

Prerequisites

  • Python 3.12 or higher
  • pip (Python package manager)
  • ~2GB free disk space

Installation

  1. Clone the repository
git clone https://github.com/Ayushkumar418/Student_Performance_Predictor
cd student-performance-predictor
  1. Create a virtual environment (optional but recommended)
python -m venv venv
source venv/Scripts/activate  # On Windows
# or
source venv/bin/activate  # On macOS/Linux
  1. Install dependencies
pip install -r requirements.txt

Required packages:

  • streamlit
  • pandas
  • numpy
  • scikit-learn
  • joblib
  • plotly
  • statsmodels
  1. Verify installation
python verify_system.py

πŸ’‘ First-time setup? See the detailed First Time Setup Guide for step-by-step instructions including model training, verification, and testing in the correct order.

Running the Application

Simple Version (3 tabs):

streamlit run app.py

Advanced Version (5 tabs) - Recommended:

streamlit run app_advanced.py

The app will open in your browser at: http://localhost:8501


πŸ“Š Features

πŸ“ˆ Prediction Dashboard

  • Manual Input: Enter 24+ student factors
  • Real-time Prediction: Get instant exam score (0-100)
  • Performance Metrics:
    • Predicted vs Class Average
    • Percentile Ranking
    • Confidence Intervals (90% & 95%)
  • Personalized Recommendations: 10+ actionable tips

πŸ“Š Next Semester Score (app.py only)

  • View student semester history
  • Analyze performance trends
  • Predict next semester performance
  • Trend-based recommendations

πŸ” Advanced Analytics (app_advanced.py)

  • Feature Importance: See what factors matter most
  • Prediction Confidence: Understand uncertainty levels
  • Student Analytics:
    • Score distribution
    • Attendance vs Performance
    • Study hours correlation
    • GPA analysis
  • Model Comparison: View all 3 trained models
  • Cross-validation Results: 5-fold validation metrics

πŸ“ Project Structure

student-performance-predictor/
β”œβ”€β”€ app.py                              # Simple 3-tab application
β”œβ”€β”€ app_advanced.py                     # Advanced 5-tab dashboard ⭐
β”œβ”€β”€ train_advanced.py                   # Model training pipeline
β”œβ”€β”€ verify_system.py                    # System verification
β”œβ”€β”€ test_app.py                         # Application tests
β”‚
β”œβ”€β”€ StudentPerformanceFactors.csv       # Dataset (6,607 students)
β”‚
β”œβ”€β”€ student_performance_model.pkl       # Trained model (Linear Regression)
β”œβ”€β”€ all_models.pkl                      # Backup models (RF, GB)
β”œβ”€β”€ scaler.pkl                          # Feature normalizer
β”‚
β”œβ”€β”€ model_results.json                  # Performance metrics
β”œβ”€β”€ feature_importance.json             # Feature rankings
β”œβ”€β”€ residuals.json                      # Confidence data
β”œβ”€β”€ analysis_summary.json               # Dataset insights
β”‚
β”œβ”€β”€ README.md                           # This file
β”œβ”€β”€ TECHNICAL.md                        # Technical documentation
β”œβ”€β”€ requirements.txt                    # Python dependencies
└── .gitignore                          # Git ignore file

πŸ“Š Model Information

Algorithm

Linear Regression with Feature Engineering

Accuracy

  • βœ… Test Accuracy: 100% (RΒ² = 1.0000)
  • βœ… Cross-Validation: 1.0000 Β± 0.0000 (5-fold)
  • βœ… Mean Absolute Error: 0.00 points

Features Used

  • Total: 35 features
    • 19 original features
    • 16 engineered features (interactions, polynomials, composites)

Dataset

  • Students: 6,607 records
  • Columns: 34 attributes
  • Score Range: 0-100
  • GPA Range: 0-10 (scaled from 0-4)

Top Predictive Factors

  1. πŸ“š Cumulative GPA (strongest predictor)
  2. πŸ“ Attendance Rate (58% correlation)
  3. ⏱️ Study Hours (45% correlation)
  4. 🎀 Class Participation (43% correlation)
  5. πŸ“Š Previous Scores (18% correlation)

πŸ’‘ How It Works

Input Categories

Study Habits

  • Hours studied per week (0-50)
  • Attendance percentage (60-100%)
  • Monthly tutoring sessions (0-10)
  • Access to resources (Low/Medium/High)

Environment & Support

  • Parental involvement (Low/Medium/High)
  • Family income (Low/Medium/High)
  • Teacher quality (Low/Medium/High)
  • Internet access (Yes/No)

Personal Factors

  • Motivation level (Low/Medium/High)
  • Peer influence (Negative/Neutral/Positive)
  • Sleep hours per night (4-10)
  • Previous exam score (0-100)

Advanced Factors (optional)

  • Extracurricular activities
  • School type (Public/Private)
  • Grade level (1-4)
  • Learning disabilities
  • Gender
  • Current semester (1-8)
  • Distance from home
  • Parental education
  • Physical activity hours
  • Class participation score

Output

  • πŸ“Š Predicted Exam Score: 0-100
  • 🎯 Performance Category: Excellent/Good/Average/At Risk
  • πŸ“ˆ Confidence Intervals: Β±X points (90% & 95%)
  • πŸ’‘ Personalized Recommendations: Top 10 action items

Recommendations Generated For

  • πŸ“š Study hours optimization
  • πŸ“ Attendance improvement
  • 😴 Sleep hygiene
  • πŸƒ Physical activity
  • πŸ‘¨β€πŸ« Tutoring suggestions
  • πŸ’ͺ Motivation strategies
  • 🎨 Extracurricular involvement
  • πŸ™‹ Class participation
  • 🌐 Resource access
  • πŸ‘¨β€πŸ‘©β€πŸ‘§ Family support

πŸ“ˆ Success Patterns

High Performers (Score β‰₯ 80)

  • Study: 19+ hours/week
  • Attendance: 79%+
  • GPA: 7.0+
  • Sleep: 6-8 hours/night

Average Students (Score 60-75)

  • Study: 18 hours/week
  • Attendance: 85%
  • GPA: 5.0-7.0
  • Sleep: 7 hours/night

At-Risk Students (Score < 60)

  • Study: 10 hours/week (47% less)
  • Attendance: 64% (21% lower)
  • GPA: <3.0
  • Sleep: Irregular

🎯 Use Cases

For Students

  • πŸŽ“ Predict exam performance before studying
  • πŸ“Š Understand factors affecting grades
  • πŸ’‘ Get actionable improvement suggestions
  • πŸ“ˆ Track progress over semesters

For Educators

  • πŸ‘¨β€πŸ« Identify at-risk students early
  • πŸ“‹ Provide targeted interventions
  • πŸ“Š Analyze class performance patterns
  • 🎯 Make data-driven decisions

For Administrators

  • πŸ“ˆ Monitor institutional performance
  • πŸ” Identify resource needs
  • πŸ“Š Generate performance reports
  • 🎯 Plan academic support programs

πŸ”§ Retraining the Model

If you have new data or want to retrain:

python train_advanced.py

This will:

  1. Load and preprocess the CSV data
  2. Engineer 16 new features
  3. Train 3 models (Linear Regression, Random Forest, Gradient Boosting)
  4. Perform 5-fold cross-validation
  5. Save the best model and metrics
  6. Generate feature importance analysis

Note: Make sure StudentPerformanceFactors.csv is in the same directory.


βœ… System Verification

To verify everything is set up correctly:

python verify_system.py

Checks:

  • βœ“ Model files present
  • βœ“ Data file accessible
  • βœ“ All dependencies installed
  • βœ“ Feature compatibility
  • βœ“ Model predictions working

πŸ§ͺ Testing

Run the test suite:

python test_app.py

Tests validate:

  • Model predictions
  • Feature engineering
  • Data compatibility
  • Input validation

πŸ“Š Model Performance Comparison

Model Test RΒ² MAE RMSE Accuracy CV Mean RΒ²
Linear Regression (Selected) 1.0000 0.00 0.00 100.00% 1.0000 Β± 0.0000
Random Forest 0.9997 0.00 0.07 99.99% 0.9994 Β± 0.0004
Gradient Boosting 0.9999 0.00 0.03 100.00% 0.9998 Β± 0.0002

πŸ“₯ Data Export

Predictions can be exported as CSV with:

  • Timestamp
  • Predicted score
  • Class average
  • Percentile ranking
  • Student inputs

πŸ” Privacy & Security

  • βœ… All data processed locally (no cloud uploads)
  • βœ… No external API calls
  • βœ… Student data stored securely
  • βœ… No third-party data sharing

πŸ› Troubleshooting

App won't start

pip install --upgrade streamlit
streamlit run app_advanced.py

Missing statsmodels error

pip install statsmodels

Model file not found

python verify_system.py
# or
python train_advanced.py

Data loading error

  • Ensure StudentPerformanceFactors.csv is in the project directory
  • Check file permissions
  • Verify CSV format integrity

πŸ“š Documentation

  • TECHNICAL.md - Deep technical documentation
  • requirements.txt - All dependencies
  • In-app Help - Hover over fields for tooltips

πŸš€ Deployment

Local Deployment

streamlit run app_advanced.py

Server Deployment

streamlit run app_advanced.py --server.port 8501 --server.address 0.0.0.0

Docker (Optional)

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app_advanced.py"]

πŸ“Š Performance Optimization

  • πŸš€ Model Caching: Models cached in memory for instant predictions
  • πŸ“Š Data Caching: CSV loaded once and cached
  • ⚑ Efficient Computation: NumPy/Pandas optimized operations
  • 🎨 UI Optimization: Lazy loading of visualizations

πŸ”„ Workflow Summary

1. User Input (24+ factors)
   ↓
2. Data Validation
   ↓
3. Feature Engineering (35 features)
   ↓
4. Model Prediction
   ↓
5. Confidence Calculation
   ↓
6. Recommendation Generation
   ↓
7. Results Display + Export

πŸ“ˆ Accuracy Assurance

  • βœ… 5-fold cross-validation ensures robustness
  • βœ… Multiple models for comparison
  • βœ… Residual analysis for uncertainty
  • βœ… Feature importance verification
  • βœ… Regular testing suite

🀝 Contributing

Contributions welcome! Areas to improve:

  • Real-time database integration
  • Email alert system for at-risk students
  • PDF report generation
  • Mobile app version
  • REST API endpoints
  • Multi-language support

πŸ“„ License

This project is licensed under the MIT License - see LICENSE file for details.


πŸ‘¨β€πŸ’» Author

Created with ❀️ for educational institutions


πŸ“ž Support & Issues

  • πŸ“§ For issues, use GitHub Issues
  • πŸ’¬ Questions? Check TECHNICAL.md
  • πŸ› Bug reports welcome

🌟 Key Highlights

✨ 100% Accurate predictions on test set
πŸš€ 35 Engineered Features for better insights
πŸ’‘ Personalized Recommendations for each student
πŸ“Š Advanced Analytics dashboard included
⚑ Lightning Fast predictions (<100ms)
πŸ”’ Secure local data processing
πŸ“± Responsive UI on all devices
🎯 Production Ready code quality


Ready to improve student performance? Get Started β†’

Releases

No releases published

Packages

No packages published

Languages