This project is developed as part of the CodeAlpha Machine Learning Internship.
The objective is to build a Credit Scoring Model that predicts an individual's creditworthiness based on historical financial data.
Credit scoring plays a crucial role in determining whether a customer is eligible for a loan or credit.
This project leverages Machine Learning to classify customers as "Good Credit" or "Bad Credit",
helping financial institutions make informed decisions.
- codaalpha_credit_scoring_model.ipynb: Jupyter Notebook containing the full code (data preprocessing, modeling, evaluation).
- loan_data.csv: Dataset used for training and testing the model.
- README.md: Documentation for the project.
- Programming Language: Python
- Libraries & Tools:
  - pandas, numpy: Data handling
  - matplotlib, seaborn: Visualization
  - scikit-learn: Preprocessing, Modeling, Evaluation
- Data Preprocessing
  - Handle missing values
  - Encode categorical features
  - Feature scaling (Normalization/Standardization)
- Outlier Detection & Handling
  - IQR & Z-Score methods
- Feature Engineering
  - Feature Selection (Chi-Square test)
  - Dimensionality Reduction (PCA)
- Modeling
  - Logistic Regression
  - Decision Tree
  - Random Forest
- Evaluation Metrics
  - Accuracy
  - Precision, Recall, F1-Score
  - ROC curve and AUC
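
The preprocessing, outlier-handling, and feature-engineering steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the column names (`income`, `loan_amount`, `purpose`) are hypothetical rather than taken from loan_data.csv, and the helpers `clip_iqr` and `zscore_outliers` are names introduced here. Min-max scaling is used because the chi-square test requires non-negative features.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data; column names are hypothetical, not those of loan_data.csv.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "loan_amount": rng.normal(10_000, 3_000, n),
    "purpose": rng.choice(["car", "home", "education"], n),
})
df.loc[::25, "income"] = np.nan      # inject some missing values
df.loc[3, "loan_amount"] = 500_000   # inject one obvious outlier
y = rng.integers(0, 2, n)            # 1 = "Good Credit", 0 = "Bad Credit"

def zscore_outliers(s, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    z = (s - s.mean()) / s.std()
    return z.abs() > threshold

def clip_iqr(s, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

numeric = ["income", "loan_amount"]
categorical = ["purpose"]

# Detect outliers with z-scores, then handle them by IQR clipping.
outlier_mask = zscore_outliers(df["loan_amount"])
for col in numeric:
    df[col] = clip_iqr(df[col])

# Impute and scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer(
    [
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", MinMaxScaler()),  # keeps features non-negative for chi2
        ]), numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ],
    sparse_threshold=0,  # always return a dense array (PCA needs dense input)
)

# Chi-square feature selection followed by PCA.
features = Pipeline([
    ("preprocess", preprocess),
    ("select", SelectKBest(chi2, k=4)),
    ("pca", PCA(n_components=2)),
])
X = features.fit_transform(df, y)
print(X.shape)
```

The values `k=4`, `n_components=2`, and the z-score threshold of 3 are arbitrary choices for this sketch and would need tuning against the real dataset.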
- The best-performing model was the Random Forest Classifier, which achieved high accuracy and balanced precision and recall.
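
A comparison of the three models on the listed metrics might look like the sketch below. It runs on synthetic data from `make_classification`, so it does not reproduce the result reported above; the hyperparameters and the 75/25 split are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the credit dataset.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Reporting precision, recall, and ROC-AUC alongside accuracy matters for credit data, where the two classes are often imbalanced and accuracy alone can be misleading.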
- Clone this repository:
      git clone https://github.com/YourUsername/CodeAlpha_CreditScoring.git
      cd CodeAlpha_CreditScoring