A comprehensive machine learning system for personalized career recommendations based on user skills, interests, and personality traits.
This project implements a multi-label classification system that predicts suitable careers with confidence scores, featuring:
- Advanced feature engineering with skill clustering
- Multi-label classification with Random Forest and XGBoost
- Intelligent confidence scoring with validation framework
- Production-ready FastAPI deployment
assig-ai-tut/
├── data/
│ └── synthetic_user_profiles_large.csv
├── notebooks/
│ ├── 01_data_engineering.ipynb
│ └── 02_model_development.ipynb
├── src/
│ ├── preprocessing.py
│ ├── feature_engineering.py
│ ├── model_trainer.py
│ ├── confidence_scorer.py
│ └── api.py
├── models/
│ └── career_recommender_v1.pkl
├── tests/
│ └── test_api.py
├── requirements.txt
└── README.md
pip install -r requirements.txtjupyter notebook notebooks/01_data_engineering.ipynb
jupyter notebook notebooks/02_model_development.ipynbuvicorn src.api:app --reload --port 8000- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
pytest tests/ -v- Hamming Loss: < 0.15
- Precision@3: > 0.80
- Label Ranking Average Precision: > 0.85
import requests
response = requests.post(
"http://localhost:8000/predict",
json={
"skills": ["Python", "Communication"],
"interests": ["Technology", "Management"],
"personality": {
"analytical": 0.8,
"creative": 0.4,
"social": 0.7
},
"education": "Bachelor",
"experience": 3
}
)
print(response.json())- Skill clustering (technical/soft skills)
- Interest profiling
- Personality trait normalization
- Class imbalance handling
- Feature importance analysis
- Multi-label classification
- Hyperparameter optimization
- Probability calibration
- Comprehensive evaluation metrics
- Error analysis
- Hybrid scoring (model + heuristics)
- Rule-based adjustments
- Validation framework
- Quality assurance metrics
- FastAPI implementation
- Input validation
- Model versioning
- Comprehensive testing
- OpenAPI documentation
Built as part of AI/ML Practice assignment demonstrating end-to-end ML system development.