Goal: Build a robust machine learning classifier that predicts whether a loan applicant should be Approved or Rejected.
Manual loan approval processes are time-consuming and prone to human error. By analyzing historical data—including applicant income, credit score, debt levels, and loan intent—this system automates the decision-making process. This solution aims to help financial institutions minimize default risks while accelerating the approval workflow for eligible candidates.
The dataset consists of financial and demographic information about loan applicants. It satisfies the assignment requirement of at least 12 features and more than 500 instances (total: 50,000 instances and 18 features, excluding the primary key and the target).
- Target Variable: `loan_status` (Binary: 0 = Rejected, 1 = Approved)
- Key Features:
  - Demographics: `age`, `occupation_status`, `years_employed`
  - Financial Health: `annual_income`, `savings_assets`, `current_debt`
  - Credit History: `credit_score`, `credit_history_years`, `defaults_on_file`, `delinquencies_last_2yrs`, `derogatory_marks`
  - Loan Details: `loan_amount`, `interest_rate`, `loan_intent`, `product_type`
  - Ratios: `debt_to_income_ratio`, `loan_to_income_ratio`, `payment_to_income_ratio`
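
The ratio features are presumably derived from the raw financial columns. The sketch below shows one plausible reading of two of them; the formulas are assumptions inferred from the column names, not definitions taken from the dataset, and `add_ratio_features` is a hypothetical helper. `payment_to_income_ratio` is left out because the payment term it depends on is not described here.

```python
def add_ratio_features(applicant):
    """Return a copy of the applicant record with two ratio features added.

    Assumed formulas (not confirmed by the dataset documentation):
      debt_to_income_ratio = current_debt / annual_income
      loan_to_income_ratio = loan_amount  / annual_income
    """
    income = applicant["annual_income"]
    if income <= 0:
        # A ratio against zero or negative income is undefined.
        raise ValueError("annual_income must be positive")

    enriched = dict(applicant)  # leave the input record untouched
    enriched["debt_to_income_ratio"] = applicant["current_debt"] / income
    enriched["loan_to_income_ratio"] = applicant["loan_amount"] / income
    return enriched


# Illustrative values only, not a row from the dataset.
example = add_ratio_features(
    {"annual_income": 60000, "current_debt": 15000, "loan_amount": 12000}
)
print(example["debt_to_income_ratio"], example["loan_to_income_ratio"])
```

If the ratios are computed this way, they are functions of other columns in the feature set, which is worth keeping in mind for models that assume independent features.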
Six different classification algorithms were implemented and evaluated to find the best-performing model.
| ML Model Name | Accuracy | AUC Score | Precision | Recall | F1 Score | MCC Score |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.7770 | 0.8599 | 0.7767 | 0.7770 | 0.7760 | 0.5476 |
| Decision Tree | 0.8774 | 0.9058 | 0.8778 | 0.8774 | 0.8770 | 0.7520 |
| kNN Classifier | 0.5737 | 0.5883 | 0.5762 | 0.5737 | 0.5745 | 0.1434 |
| Naive Bayes | 0.7811 | 0.8684 | 0.7846 | 0.7811 | 0.7816 | 0.5630 |
| Random Forest | 0.9061 | 0.9712 | 0.9062 | 0.9061 | 0.9061 | 0.8105 |
| XGBoost | 0.9262 | 0.9833 | 0.9264 | 0.9262 | 0.9261 | 0.8508 |
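
In every row of the table, Recall equals Accuracy, which is what support-weighted averaging over the two classes produces. The sketch below illustrates how these metrics relate, computing them from confusion-matrix counts in plain Python. It assumes weighted averaging and is not the project's actual evaluation code; `binary_metrics` is a hypothetical helper. AUC is omitted because it needs predicted probabilities rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, support-weighted precision/recall/F1, and MCC for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    def safe_div(a, b):
        return a / b if b else 0.0

    # Per-class scores: class 1 = Approved, class 0 = Rejected.
    precision = {1: safe_div(tp, tp + fp), 0: safe_div(tn, tn + fn)}
    recall = {1: safe_div(tp, tp + fn), 0: safe_div(tn, tn + fp)}
    f1 = {
        c: safe_div(2 * precision[c] * recall[c], precision[c] + recall[c])
        for c in (0, 1)
    }
    support = {1: tp + fn, 0: tn + fp}  # number of true samples per class

    def weighted(scores):
        # Average the per-class scores, weighted by class support.
        return sum(scores[c] * support[c] for c in (0, 1)) / n

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / n,
        "precision": weighted(precision),
        "recall": weighted(recall),
        "f1": weighted(f1),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Toy labels for illustration only.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Weighted recall sums the correctly classified samples of each class and divides by the total, so it always coincides with accuracy. MCC uses all four confusion-matrix cells, which makes it a stricter single-number summary than accuracy when the classes are not balanced.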
| ML Model Name | Observation about model performance |
|---|---|
| Logistic Regression | Performed moderately (77.7% Accuracy). It struggled to capture the complex, non-linear relationships between financial variables like debt ratios and loan intent. |
| Decision Tree | Significant improvement over linear models (87.7% Accuracy). This suggests the dataset has strong non-linear decision boundaries that a simple split-based approach can capture effectively. |
| kNN | Poor performance (57.4% Accuracy). The model struggled significantly, likely due to the "curse of dimensionality" where distance-based metrics become less effective with many financial features. |
| Naive Bayes | Comparable to Logistic Regression (78.1% Accuracy). Its assumption of feature independence is likely violated by correlated features (e.g., loan_amount vs loan_to_income_ratio), limiting its potential. |
| Random Forest | Excellent performance (90.6% Accuracy). The ensemble approach successfully reduced the variance seen in the single Decision Tree, providing a robust and high-confidence classifier. |
| XGBoost | Best overall model (92.6% Accuracy, 0.85 MCC). It achieved the highest AUC (0.98), indicating superior ability to distinguish between approved and rejected applicants. It effectively handled class imbalances and complex feature interactions. |
This repository follows the required folder structure:
BITS-ML-WILP-SEM1-Assignment-2/
│
├── model/ # Saved trained models (*.joblib)
├── preprocessing/ # Saved pipeline preprocessors (*.pkl)
├── app.py # Streamlit Application (Frontend)
├── main.py # Training Pipeline Script
├── requirements.txt # Project Dependencies
├── Makefile # Automation commands
├── Dockerfile # Containerization setup
└── README.md # Project Documentation
git clone https://github.com/neelkantnewra/Loan-Approval-Prediction.git
cd Loan-Approval-Prediction
Ensure that all dependencies are installed with the correct Python version:
pip install -r requirements.txt
streamlit run app.py
The app will open in your browser at http://localhost:8501
The application is deployed on Streamlit Community Cloud and can be accessed here:
https://loan-approval-prediction-bits-wilp-2025ab05299.streamlit.app/
Screenshot of the lab, to prove that the project was built inside the BITS environment
- GitHub repo link works
- Streamlit app link opens correctly
- App loads without errors
- All required features implemented
- README.md updated and added to the submitted PDF
- Screenshot attached