Domain: Education Analytics & Data Mining
Objective: Advanced Data Mining and Business Intelligence analysis of student placement data with predictive modeling
Dataset: 10,000 student records with 12 core features plus 6 engineered features
Overall Placement Rate: 42.0%
Technology Stack: Python, Scikit-learn, Streamlit, Pandas, NumPy
Project Demo Video: Watch Placelytics in Action
Complete project overview and detailed analysis available in Placelytics - Report
- Logistic Regression: 80.15% accuracy, 86.95% AUC (Best Overall)
- Random Forest: 77.90% accuracy, 85.20% AUC
- Gradient Boosting: 79.65% accuracy, 86.65% AUC
- Ensemble Model: Weighted combination for enhanced predictions
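The weighted combination above can be sketched as a weighted average of each model's predicted placement probability. The weights used here (proportional to the AUC values listed above) and the per-model probabilities are assumptions for illustration; this README does not state the project's actual weighting scheme.

```python
# Minimal sketch of a weighted soft-voting ensemble.
# ASSUMPTION: weights are proportional to the AUC values listed above;
# the project's real weights are not documented in this README.

MODEL_AUC = {
    "logistic_regression": 0.8695,
    "random_forest": 0.8520,
    "gradient_boosting": 0.8665,
}


def ensemble_probability(probabilities: dict[str, float],
                         weights: dict[str, float] = MODEL_AUC) -> float:
    """Return the weighted average of per-model placement probabilities."""
    total_weight = sum(weights[name] for name in probabilities)
    weighted_sum = sum(weights[name] * p for name, p in probabilities.items())
    return weighted_sum / total_weight


# Hypothetical per-model outputs for a single student.
example = {
    "logistic_regression": 0.72,
    "random_forest": 0.64,
    "gradient_boosting": 0.70,
}
combined = ensemble_probability(example)
print(f"Ensemble placement probability: {combined:.3f}")
```

Because the result is a convex combination, it always lies between the lowest and highest individual model probability.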
- Competency Score (19.08%) - Engineered feature combining aptitude and soft skills
- HSC Marks (17.00%) - Higher secondary education performance
- Academic Index (14.57%) - Weighted academic performance metric
- Aptitude Test Score (10.73%) - Technical assessment results
- SSC Marks (9.53%) - Secondary education foundation
- Academic Excellence Impact: Students with CGPA ≥ 8.0 have 68.2% placement rate
- Experience Multiplier: Students with ≥ 2 internships achieve 69.7% placement rate
- Skill Development: Students with aptitude ≥ 85 have 75.6% placement rate
- Training Effectiveness: Placement training increases success by 1.4x
- Risk Identification: 4,657 students (46.6%) identified as at-risk
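Figures such as the subgroup placement rates and the 1.4x training effect above are ratios of simple counts. The sketch below shows one way to compute them; the records and field names are invented for illustration and are not taken from placementdata.csv.

```python
# Sketch: placement rate for a subgroup and the lift between two groups.
# ASSUMPTION: records and field names are illustrative, not project data.

def placement_rate(records, predicate=lambda r: True):
    """Share of placed students among the records matching the predicate."""
    selected = [r for r in records if predicate(r)]
    if not selected:
        raise ValueError("no records match the predicate")
    return sum(r["placed"] for r in selected) / len(selected)


students = [
    {"cgpa": 8.4, "training": True, "placed": 1},
    {"cgpa": 8.1, "training": True, "placed": 1},
    {"cgpa": 7.2, "training": True, "placed": 0},
    {"cgpa": 8.6, "training": False, "placed": 1},
    {"cgpa": 7.0, "training": False, "placed": 0},
    {"cgpa": 6.8, "training": False, "placed": 0},
]

high_cgpa_rate = placement_rate(students, lambda r: r["cgpa"] >= 8.0)
trained_rate = placement_rate(students, lambda r: r["training"])
untrained_rate = placement_rate(students, lambda r: not r["training"])

# Lift: how many times more often trained students are placed.
lift = trained_rate / untrained_rate
print(f"CGPA >= 8.0 placement rate: {high_cgpa_rate:.1%}")
print(f"Training lift: {lift:.1f}x")
```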
Complete business intelligence analysis and insights available in Placelytics - Report
- Cluster 0 - Moderate Performers: 2,767 students, 49.2% placement rate
- Cluster 1 - Developing Students: 1,794 students, 5.7% placement rate
- Cluster 2 - Average Achievers: 2,640 students, 16.6% placement rate
- Cluster 3 - High Achievers: 2,799 students, 82.0% placement rate
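As a quick consistency check, the size-weighted average of the four cluster placement rates should reproduce the overall placement rate reported at the top of this document. The sketch below uses only the numbers listed above.

```python
# Consistency check: cluster sizes and rates versus the overall rate.
clusters = {
    "Moderate Performers": (2767, 0.492),
    "Developing Students": (1794, 0.057),
    "Average Achievers": (2640, 0.166),
    "High Achievers": (2799, 0.820),
}

total_students = sum(size for size, _ in clusters.values())
placed_estimate = sum(size * rate for size, rate in clusters.values())
overall_rate = placed_estimate / total_students

print(total_students)          # 10000
print(f"{overall_rate:.1%}")   # 42.0%
```

The clusters cover all 10,000 records, and their weighted average matches the 42.0% overall placement rate.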
- Core Processing: pandas, numpy, scipy
- Machine Learning: scikit-learn (multiple algorithms)
- Visualization: matplotlib, seaborn, plotly
- Dashboard: streamlit
- Statistical Analysis: scipy.stats
- Feature Engineering: Academic Index, Experience Score, Competency Score
- Clustering Analysis: K-Means student segmentation
- Statistical Testing: Chi-square tests for feature association
- Dimensionality Reduction: PCA analysis
- Risk Analytics: Multi-factor risk scoring
- Correlation Mining: Feature relationship analysis
- Performance Tiers: Student classification system
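To make the three engineered features named above more concrete, here is a minimal sketch of how such composite scores could be built. Every weight, scale, cap, and field name is a hypothetical choice for illustration; the formulas actually used in advanced_dmbi_analysis.py are not given in this README.

```python
# Sketch of composite feature engineering.
# ASSUMPTION: all weights, scales, and field names below are hypothetical.
from dataclasses import dataclass


@dataclass
class Student:
    cgpa: float         # assumed 0-10 scale
    hsc_marks: float    # assumed 0-100
    ssc_marks: float    # assumed 0-100
    internships: int
    aptitude: float     # assumed 0-100
    soft_skills: float  # assumed 0-5 rating


def academic_index(s: Student) -> float:
    """Weighted blend of CGPA (rescaled to 0-100), HSC and SSC marks."""
    return 0.5 * (s.cgpa * 10) + 0.3 * s.hsc_marks + 0.2 * s.ssc_marks


def experience_score(s: Student, cap: int = 3) -> float:
    """Internship count mapped to 0-100, saturating at `cap` internships."""
    return min(s.internships, cap) / cap * 100


def competency_score(s: Student) -> float:
    """Blend of aptitude and soft skills (rescaled to 0-100)."""
    return 0.6 * s.aptitude + 0.4 * (s.soft_skills / 5 * 100)


student = Student(cgpa=8.2, hsc_marks=80, ssc_marks=75,
                  internships=2, aptitude=85, soft_skills=4.5)
print(f"Academic Index:   {academic_index(student):.1f}")
print(f"Experience Score: {experience_score(student):.1f}")
print(f"Competency Score: {competency_score(student):.1f}")
```

Rescaling every input to a common 0-100 range before blending keeps any single raw scale from dominating the composite.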
Detailed technical methodology available in Placelytics - Report
- Data Quality Assessment (zero missing values)
- Categorical encoding (Label Encoder)
- Feature engineering (6 derived attributes)
- Statistical analysis and hypothesis testing
- Clustering and segmentation
- Model training with cross-validation
- Ensemble prediction system
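The hypothesis-testing step in the pipeline above relies on chi-square tests for feature association. The sketch below computes the Pearson chi-square statistic by hand for a 2x2 table so the arithmetic is visible; the counts are made up and are not project results. scipy.stats.chi2_contingency computes the same statistic, but it applies Yates' continuity correction to 2x2 tables by default, so pass correction=False to match this calculation.

```python
# Sketch: chi-square test of independence for a 2x2 contingency table.
# ASSUMPTION: the counts are invented for illustration, not project results.

def chi_square_statistic(table):
    """Pearson chi-square statistic without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    return statistic


# Rows: placement training yes / no. Columns: placed / not placed.
observed = [[300, 200],
            [150, 350]]

chi2 = chi_square_statistic(observed)

# Critical value of the chi-square distribution for 1 degree of freedom
# at the 5% significance level.
CRITICAL_VALUE_DF1_ALPHA_005 = 3.841
significant = chi2 > CRITICAL_VALUE_DF1_ALPHA_005
print(f"chi2 = {chi2:.2f}, significant at 5%: {significant}")
```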
This section showcases all the key visualizations and dashboard interfaces available in Placelytics:
Complete overview of the main dashboard showing KPI metrics, placement rate analysis by performance tiers, CGPA distribution pie chart, and stacked count visualizations for comprehensive placement insights.
Main predictive analytics dashboard displaying model performance comparison, feature importance analysis, and correlation heatmaps for comprehensive ML insights.
Detailed prediction results showing individual student placement probability with confidence levels, academic indices, performance tiers, and personalized recommendations.
Horizontal bar chart displaying the relative importance of different features in the ensemble model, highlighting which academic and experience factors most strongly predict placement success.
Interactive correlation matrix showing relationships between all numeric features, helping identify multicollinearity and feature interactions in the dataset.
Three-dimensional scatter plot visualization of K-means clustering results, showing how students are segmented based on Academic Index, Experience Score, and Competency Score.
Pie chart showing the distribution of placement rates across different CGPA categories, providing insights into academic performance impact on placement success.
Overlapping histogram comparing CGPA distributions between placed and not-placed students, revealing the academic performance patterns that influence placement outcomes.
Box plot analysis showing CGPA quartiles, medians, and outliers for both placed and not-placed student groups, highlighting the statistical differences in academic performance.
Line chart demonstrating how placement rates vary across different CGPA ranges, showing the clear correlation between academic performance and employment success.
Bar chart illustrating the strong positive correlation between internship experience and placement rates, emphasizing the importance of practical work experience.
Distribution chart showing the frequency of different risk scores across the student population, helping identify the proportion of students in various risk categories.
Line graph demonstrating the inverse relationship between risk scores and placement rates, validating the effectiveness of the risk assessment model in identifying at-risk students.
- placement_analysis.py - Basic ML analysis with 4 algorithms
- advanced_dmbi_analysis.py - Comprehensive DMBI implementation with 8 analysis phases
- College_Placement_Analysis.ipynb - Jupyter notebook with detailed analysis
- placementdata.csv - Dataset with 10,000 student records
- Placelytics - Report - Comprehensive project report with detailed analysis and findings

- dmbi_dashboard.py - Advanced Business Intelligence dashboard (Main Application)
- placement_dashboard.py - Basic prediction interface

- validate_predictions.py - Comprehensive model validation script
- quick_validation.py - Fast accuracy testing
- create_visualizations.py - Advanced visualization generation

- requirements.txt - Python package dependencies
- start_dashboard.sh - Easy startup script for the dashboard
- .venv/ - Python virtual environment with all dependencies
- .gitignore - Git ignore patterns for clean repository

- README.md - This comprehensive project documentation
- Placelytics - Report - Detailed technical report with complete analysis, methodology, and findings
- Python 3.10+ installed on your system
- Git installed for cloning the repository
- Terminal/Command Prompt access
git clone https://github.com/parthnarkar/Placelytics-DMBI.git
cd Placelytics-DMBI
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# On Linux/macOS:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install required packages
pip install -r requirements.txt
For Windows:
# Run the dashboard using the full path to streamlit
.venv\Scripts\streamlit.exe run dmbi_dashboard.py --server.port 8502
# Alternative: Activate virtual environment first, then run
.venv\Scripts\activate
streamlit run dmbi_dashboard.py --server.port 8502
For Linux/macOS:
# Make startup script executable
chmod +x start_dashboard.sh
# Run the dashboard (easiest method)
./start_dashboard.sh
# Alternative: Run directly with Streamlit
streamlit run dmbi_dashboard.py --server.port 8502
For Windows:
# Navigate to project directory
cd D:\Projects\Placelytics-DMBI
# Option 1: Direct run with full path
.venv\Scripts\streamlit.exe run dmbi_dashboard.py --server.port 8502
# Option 2: Activate environment first
.venv\Scripts\activate
streamlit run dmbi_dashboard.py --server.port 8502
For Linux/macOS:
# Navigate to project directory
cd /path/to/Placelytics-DMBI
# Run the startup script (easiest method)
./start_dashboard.sh
# Navigate to project directory
cd /path/to/Placelytics-DMBI
# Activate virtual environment
source .venv/bin/activate
# Install/update dependencies
pip install -r requirements.txt
# Run the DMBI dashboard
streamlit run dmbi_dashboard.py --server.port 8502
For Windows:
# Direct command using full path to streamlit
D:\Projects\Placelytics-DMBI\.venv\Scripts\streamlit.exe run dmbi_dashboard.py --server.port 8502
# With activated virtual environment
.venv\Scripts\activate
streamlit run dmbi_dashboard.py --server.port 8502
# Basic analysis script
python placement_analysis.py
# Advanced DMBI analysis
python advanced_dmbi_analysis.py
# Basic dashboard (different port)
.venv\Scripts\streamlit.exe run placement_dashboard.py --server.port 8501
# Validation testing
python quick_validation.py
For Linux/macOS:
# Direct command using the virtual environment's Python (no activation needed)
/path/to/Placelytics-DMBI/.venv/bin/python -m streamlit run dmbi_dashboard.py --server.port 8502
# Basic analysis script
python placement_analysis.py
# Advanced DMBI analysis
python advanced_dmbi_analysis.py
# Basic dashboard (different port)
streamlit run placement_dashboard.py --server.port 8501
# Validation testing
python quick_validation.py
- Executive Dashboard: KPIs and performance metrics
- Predictive Analytics: Real-time placement probability prediction
- Student Segmentation: Cluster analysis with 3D visualization
- Feature Analysis: Individual feature impact assessment
- Risk Analytics: At-risk student identification
- Trend Analysis: Business intelligence insights
- Main Placelytics Dashboard: http://localhost:8502
- Network Access: http://192.168.29.101:8502 (accessible from the local network; this IP address is specific to the author's machine and will differ on yours)
- Basic Dashboard: http://localhost:8501 (if running placement_dashboard.py)
Common Issues and Solutions:
Windows-specific:
- 'streamlit' is not recognized: Use the full path .venv\Scripts\streamlit.exe or activate the virtual environment first with .venv\Scripts\activate
- Path issues: Ensure you're using backslashes (\) for Windows paths
- Permission errors: Run Command Prompt as Administrator if needed
General Issues:
- Ensure you're in the correct project directory
- Check that placementdata.csv exists in the project root (should be in data/ folder)
- Verify virtual environment is activated:
  - Windows: .venv\Scripts\activate
  - Linux/macOS: source .venv/bin/activate
- Install/reinstall dependencies: pip install -r requirements.txt
- For Linux/macOS: Use the startup script for automated setup: ./start_dashboard.sh
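Several of the checks above can be automated before launching the dashboard. The following is a small hypothetical helper, not a script shipped with the repository; it uses only the standard library and looks for the dataset in both locations mentioned above.

```python
# Hypothetical pre-flight check (not part of the repository).
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """True when Python is running inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def find_dataset(project_root: Path, name: str = "placementdata.csv"):
    """Return the dataset path if found in the root or data/ folder, else None."""
    for candidate in (project_root / name, project_root / "data" / name):
        if candidate.is_file():
            return candidate
    return None


# Run from the project directory so the current folder is the project root.
root = Path.cwd()
print(f"Virtual environment active: {in_virtualenv()}")
dataset = find_dataset(root)
print(f"Dataset: {dataset if dataset else 'not found under ' + str(root)}")
```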
- Maintain CGPA above 8.0 for significantly higher placement chances
- Complete at least 2 internships for optimal experience score
- Achieve aptitude test scores above 85
- Develop soft skills rating to 4.5+
- Participate in placement training programs
- Engage in extracurricular activities
- Earn industry-relevant certifications
- Academic Focus: Emphasize HSC/SSC performance as foundation
- Mandatory Internships: Implement structured internship programs
- Skill Development: Enhance aptitude and soft skills training
- Early Intervention: Use risk analytics to identify struggling students
- Data-Driven Decisions: Leverage clustering insights for personalized guidance
- Performance Tracking: Monitor competency scores and academic indices
- Individual student placement probability
- Risk score calculation (0-5 scale)
- Performance tier classification
- Similar student comparison
- Recommendation engine for improvement
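One plausible way to obtain a risk score on a 0-5 scale is to count how many of five risk factors apply to a student. The sketch below assumes the factors mirror the student recommendations earlier in this document (CGPA below 8.0, fewer than 2 internships, aptitude below 85, soft skills below 4.5, no placement training). This is an assumption for illustration; the dashboard's actual scoring rules are not described in this README.

```python
# Sketch of a multi-factor risk score on a 0-5 scale.
# ASSUMPTION: factors and thresholds are borrowed from the recommendations
# in this document; the dashboard's real rules may differ.

def risk_score(cgpa: float, internships: int, aptitude: float,
               soft_skills: float, placement_training: bool) -> int:
    """Count the risk factors that apply (0 = lowest risk, 5 = highest)."""
    factors = [
        cgpa < 8.0,
        internships < 2,
        aptitude < 85,
        soft_skills < 4.5,
        not placement_training,
    ]
    return sum(factors)


strong_profile = risk_score(8.5, 2, 90, 4.6, True)
weak_profile = risk_score(6.5, 0, 60, 3.0, False)
print(strong_profile, weak_profile)
```

A count-based score is easy to explain to students and staff, at the cost of treating every factor as equally important.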
- Executive KPI dashboard
- Student segmentation analysis
- Feature correlation mining
- Statistical significance testing
- Trend analysis and forecasting
- Cross-Validation: 5-fold CV with stratification
- Test Scenarios: High, Average, and At-Risk student profiles
- Accuracy Verification: Predictions align with similar student outcomes
- Ensemble Reliability: Weighted model combination for robust predictions
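The 5-fold stratified cross-validation mentioned above can be sketched with scikit-learn as follows. The data is synthetic (generated with make_classification, with a class balance loosely resembling the 42% placement rate), and the preprocessing and hyperparameters are assumptions rather than the project's configuration.

```python
# Sketch: 5-fold stratified cross-validation with scikit-learn.
# ASSUMPTION: synthetic data and default-style settings, not the project's.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the student features and placement labels.
X, y = make_classification(n_samples=1000, n_features=12, n_informative=6,
                           weights=[0.58, 0.42], random_state=42)

# Scaling inside the pipeline is refit on each training fold,
# which avoids leaking information from the validation fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Stratification keeps the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")

print(f"AUC per fold: {scores.round(3)}")
print(f"Mean AUC: {scores.mean():.3f} (std {scores.std():.3f})")
```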
- Zero missing values in dataset
- Balanced train-test splits
- Statistical significance of feature associations
- Clustering validation with silhouette analysis
- Comprehensive error analysis and model diagnostics
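Silhouette analysis, listed above as the clustering validation method, compares how close each point is to its own cluster versus the nearest other cluster. The sketch below runs K-Means for several values of k on synthetic three-feature data and reports the silhouette score for each; the data and the range of k are assumptions for illustration.

```python
# Sketch: choosing k for K-Means with silhouette analysis.
# ASSUMPTION: synthetic data standing in for Academic Index,
# Experience Score, and Competency Score.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
centers = np.array([[50, 30, 55], [40, 10, 40], [60, 20, 50], [80, 60, 85]])
features = np.vstack([rng.normal(c, 5.0, size=(100, 3)) for c in centers])

# K-Means is distance based, so features are standardized first.
scaled = StandardScaler().fit_transform(features)

silhouette_by_k = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    silhouette_by_k[k] = silhouette_score(scaled, labels)
    print(f"k={k}: silhouette={silhouette_by_k[k]:.3f}")
```

Scores closer to 1 indicate compact, well-separated clusters, so the k with the highest score is a common starting point for segmentation.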
For detailed validation methodology and results, refer to Placelytics - Report
- Prediction Accuracy: 80.15% accuracy, reliable enough to support institutional decision-making
- Risk Identification: Early warning system for 46.6% at-risk students
- Resource Optimization: Focus interventions on high-impact factors
- Placement Improvement: Potential 15-20% increase in success rates
- Evidence-Based Decisions: Replace intuition with data-driven insights
- Personalized Education: Tailor support based on student clusters
- Competitive Advantage: Enhanced institutional placement statistics
- Industry Readiness: Better alignment with market demands
- Deep Learning Models: Neural networks for complex pattern recognition
- Real-time Integration: Live data feeds from academic systems
- Advanced Ensembles: Stacking and blending techniques
- Time Series Analysis: Temporal trends in placement patterns
- External Data Integration: Industry trends and economic indicators
- Predictive Dashboards: Real-time monitoring systems
- Mobile Applications: Student self-assessment tools
- API Development: Integration with institutional systems
- Advanced Visualization: Interactive 3D cluster analysis
- Automated Reporting: Scheduled business intelligence reports
- All dependencies properly configured
- Data file paths corrected
- Virtual environment optimized
- Dashboard fully functional
- Error-free execution verified
- Network accessibility confirmed
This comprehensive DMBI project successfully demonstrates advanced data mining and business intelligence techniques applied to educational analytics. The system provides actionable insights through sophisticated clustering analysis, predictive modeling, and risk assessment.
The combination of technical excellence (80.15% accuracy), business intelligence capabilities, and interactive visualization makes this a complete solution for educational institutions seeking to enhance their placement programs through data-driven decision making.
For complete project documentation, detailed methodology, results analysis, and technical specifications, please refer to Placelytics - Report
Watch the complete project demo: Placelytics in Action - YouTube Video
Project Status: COMPLETED SUCCESSFULLY & FULLY OPERATIONAL
Last Updated: October 23, 2025 - All issues resolved, system running smoothly
Built by Parth Narkar (@parth.builds)
