Falak-Parmar/P2P-lending-risk-assesment


🚀 Peer-to-Peer Lending Risk Management

A production-grade machine learning system for predicting loan default risk in P2P lending. This project implements a full end-to-end pipeline including data cleaning, feature engineering, validation, and a Stacked Ensemble Model (XGBoost, LightGBM, CatBoost).

It also features an Interactive Dashboard for real-time risk assessment.


📌 Table of Contents

  • 📄 Overview
  • ⭐ Features
  • 📊 Interactive Dashboard
  • 🔧 Configuration
  • ⚙️ Installation & Usage
  • 🏗 Project Structure
  • 📜 License

📄 Overview

The goal is to help investors minimize risk. The system analyzes loan application data (based on Lending Club data from 2007-2020) to predict the probability that a loan will default.

Key capabilities:

  • Robust Pipeline: Automated cleaning, capping, and feature engineering.
  • Advanced Modeling: A meta-model stacking approach that outperforms individual classifiers.
  • Real-time Inference: A Streamlit app to test scenarios instantly.
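The stacking idea can be sketched in plain Python. This is a minimal illustration, assuming the meta-learner is trained on out-of-fold predictions from the base models; the helper names (`kfold_indices`, `out_of_fold_predictions`, `base_rate_model`) and the toy data are hypothetical and are not taken from `utils/stacking.py`.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, holdout_idx) pairs covering every row exactly once."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        holdout = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, holdout
        start += size


def out_of_fold_predictions(fit_predict, X, y, n_folds=5):
    """Build one meta-feature column from a base model.

    fit_predict(X_train, y_train, X_holdout) must return one score per holdout row.
    Each row is scored by a model that never saw that row during fitting.
    """
    oof = [None] * len(X)
    for train_idx, holdout_idx in kfold_indices(len(X), n_folds):
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in holdout_idx])
        for i, p in zip(holdout_idx, preds):
            oof[i] = p
    return oof


# Toy stand-in for a base model: predicts the training default rate everywhere.
def base_rate_model(X_train, y_train, X_holdout):
    rate = sum(y_train) / len(y_train)
    return [rate] * len(X_holdout)


X = [[i] for i in range(10)]
y = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
meta_feature = out_of_fold_predictions(base_rate_model, X, y, n_folds=5)
```

In a real pipeline, each base model (XGBoost, LightGBM, CatBoost) would contribute one such column, and the meta-learner would be fitted on those columns rather than on predictions made for rows the base models had already seen.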

⭐ Features

  • 🧠 Stacked Ensemble: Combines XGBoost, LightGBM, and CatBoost via a meta-learner.
  • 📱 Interactive Dashboard: Web interface for business users to test loan applicants.
  • ⚙️ Configurable: Hyperparameters are managed via params.yaml, not hardcoded.
  • 🔒 Reproducible: Pinned dependency versions and seed control.
  • 🛡 Leakage-Free: Strict separation of fitting and transforming.
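The separation of fitting and transforming can be illustrated with a small sketch. It assumes outlier capping by percentile; the `PercentileCapper` class, its 5th/95th percentile bounds, and the sample incomes are hypothetical and are not taken from the repository. The bounds are learned from training data only and then reused unchanged on held-out data.

```python
from statistics import quantiles


class PercentileCapper:
    """Hypothetical capper: learns bounds in fit(), applies them in transform()."""

    def __init__(self, lower_pct=5, upper_pct=95):
        self.lower_pct = lower_pct
        self.upper_pct = upper_pct
        self.lower_ = None
        self.upper_ = None

    def fit(self, values):
        # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
        cuts = quantiles(values, n=100, method="inclusive")
        self.lower_ = cuts[self.lower_pct - 1]
        self.upper_ = cuts[self.upper_pct - 1]
        return self

    def transform(self, values):
        if self.lower_ is None:
            raise RuntimeError("call fit() on training data first")
        # Clip each value into the [lower_, upper_] range learned during fit().
        return [min(max(v, self.lower_), self.upper_) for v in values]


train_income = [20_000, 35_000, 40_000, 52_000, 61_000, 75_000, 90_000, 1_500_000]
test_income = [5_000, 48_000, 3_000_000]

capper = PercentileCapper().fit(train_income)  # statistics come from train only
capped_test = capper.transform(test_income)    # test data never influences the bounds
```

Fitting the capper on the combined train and test data would let test-set extremes shift the bounds, which is the kind of leakage this pattern avoids.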

📊 Interactive Dashboard

Test the model in real time using the Streamlit app:

streamlit run app.py

Capabilities:

  • Input financial details (FICO, Income, Loan Amount, etc.)
  • Visualize risk score via gauge charts.
  • Get instant "Approve" or "Reject" recommendations.
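As a rough sketch of how a predicted default probability could be turned into a recommendation: the `recommend` function and the 0.5 cutoff below are hypothetical and are not taken from `app.py`, so the dashboard may apply a different rule or threshold.

```python
def recommend(default_probability: float, threshold: float = 0.5) -> str:
    """Map a default probability to a recommendation (hypothetical rule)."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    # Reject when the predicted risk reaches the cutoff; approve otherwise.
    return "Reject" if default_probability >= threshold else "Approve"


low_risk = recommend(0.12)
high_risk = recommend(0.81)
```

In practice the cutoff would be chosen from the relative cost of a missed default versus a wrongly rejected good loan, rather than fixed at 0.5.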

🔧 Configuration

All model hyperparameters are stored in params.yaml. You can modify them without touching the code.

For fast testing (smoke tests), you can use the lightweight config:

PARAMS_PATH=params_fast.yaml python main.py
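The `PARAMS_PATH` override above suggests that the pipeline reads the config location from the environment. A minimal sketch of that pattern, assuming `params.yaml` is the default; the `resolve_params_path` helper is hypothetical, and the actual logic in `main.py` may differ:

```python
import os


def resolve_params_path(env=None, default="params.yaml"):
    """Return the config path, preferring the PARAMS_PATH environment variable."""
    env = os.environ if env is None else env
    # Fall back to the default when PARAMS_PATH is unset or empty.
    return env.get("PARAMS_PATH") or default


default_path = resolve_params_path(env={})
fast_path = resolve_params_path(env={"PARAMS_PATH": "params_fast.yaml"})
```

Passing the environment as an argument keeps the helper easy to test without modifying the real process environment.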

⚙️ Installation & Usage

1. Installation

Clone the repo and install dependencies:

git clone https://github.com/Falak-Parmar/P2P-lending-risk-assesment.git
cd P2P-lending-risk-assesment
pip install -r requirements.txt

2. Run Pipeline

To train the model from scratch (cleaning -> engineering -> training):

python main.py

Artifacts are saved in the models/ and data/ directories.

3. Run Dashboard

streamlit run app.py

🏗 Project Structure

.
├── app.py                  # Streamlit Dashboard entry point
├── main.py                 # Pipeline Orchestrator
├── params.yaml             # Model Hyperparameters
├── requirements.txt        # Pinned dependencies
├── data/                   # Data storage (Raw, Cleaned, Processed)
├── models/                 # Saved model artifacts (.pkl)
├── logs/                   # Execution logs
├── src/                    # Pipeline Source Code
│   ├── data_cleaning.py
│   ├── data_feature_engineering.py
│   ├── data_preprocessing.py
│   └── model.py
└── utils/                  # Utility functions
    ├── stacking.py         # Ensemble logic
    └── ...

📜 License

MIT License
