philippe-heitzmann/LendingClub_ML_App

LendingClub ML App

A comprehensive machine learning application for predicting loan defaults and optimizing investment portfolios using the LendingClub dataset.

Project Overview

This project demonstrates advanced machine learning techniques applied to financial risk assessment. The application trains multiple classification models on historical LendingClub loan data to predict default probabilities, then uses these predictions to construct an IRR-optimized investment portfolio.
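To make the IRR target concrete, here is a minimal sketch of how the internal rate of return of a single loan's cash flows could be computed by bisection. The helper names (`npv`, `irr`, `level_payment`) and the example loan terms are illustrative assumptions, not taken from this repository, and the project's own IRR calculation may differ (for example in how it annualizes or treats recoveries).

```python
from typing import Sequence


def npv(rate: float, cashflows: Sequence[float]) -> float:
    """Net present value of per-period cash flows; cashflows[0] occurs at t=0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


def irr(cashflows: Sequence[float], lo: float = -0.99, hi: float = 1.0,
        tol: float = 1e-10) -> float:
    """Per-period IRR by bisection.

    Assumes NPV changes sign exactly once on [lo, hi], which holds for an
    initial outlay followed only by non-negative repayments.
    """
    f_lo, f_hi = npv(lo, cashflows), npv(hi, cashflows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cashflows)
        if f_lo * f_mid <= 0:
            hi = mid                # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid   # root lies in [mid, hi]
    return (lo + hi) / 2.0


def level_payment(principal: float, monthly_rate: float, n_months: int) -> float:
    """Fixed monthly payment of a fully amortizing loan."""
    return principal * monthly_rate / (1.0 - (1.0 + monthly_rate) ** -n_months)


# Hypothetical 36-month loan: 10,000 lent at 1% per month.
payment = level_payment(10_000, 0.01, 36)
fully_paid = [-10_000] + [payment] * 36
defaulted = [-10_000] + [payment] * 18   # payments stop after month 18, no recovery

monthly_irr_paid = irr(fully_paid)
monthly_irr_default = irr(defaulted)
annual_irr_paid = (1.0 + monthly_irr_paid) ** 12 - 1.0  # one possible annualization
```

A fully repaid loan recovers its contractual monthly rate, while an early default pushes the IRR negative, which is why default probabilities matter for portfolio-level IRR.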

Key Features

  • Multiple ML Models: 8+ different algorithms including Logistic Regression, Random Forest, Gradient Boosting, and Neural Networks
  • Interactive Dashboard: Real-time visualization of loan data, model performance, and portfolio optimization
  • Portfolio Optimization: IRR-based portfolio construction with customizable investment criteria
  • Live Predictions: Real-time loan default predictions via REST API
  • Advanced Analytics: Comprehensive EDA with interactive choropleth maps and statistical analysis
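As a rough illustration of the portfolio optimization feature listed above, the sketch below filters scored loans by user-chosen criteria and then fills a budget greedily in order of expected IRR. The `ScoredLoan` fields, the `build_portfolio` function, and the greedy rule are all hypothetical; the repository's actual selection logic is not described here and may work differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredLoan:
    loan_id: str
    term_months: int
    amount: float
    p_default: float     # model-predicted default probability (assumed input)
    expected_irr: float  # IRR estimate already adjusted for p_default (assumed input)


def build_portfolio(loans: List[ScoredLoan], budget: float,
                    term_months: int, max_p_default: float) -> List[ScoredLoan]:
    """Greedy sketch: keep loans that meet the criteria, highest expected IRR first."""
    eligible = [
        loan for loan in loans
        if loan.term_months == term_months and loan.p_default <= max_p_default
    ]
    eligible.sort(key=lambda loan: loan.expected_irr, reverse=True)

    chosen: List[ScoredLoan] = []
    remaining = budget
    for loan in eligible:
        if loan.amount <= remaining:  # skip loans that no longer fit the budget
            chosen.append(loan)
            remaining -= loan.amount
    return chosen


# Made-up candidates for illustration only.
candidates = [
    ScoredLoan("A", 36, 5_000, 0.05, 0.081),
    ScoredLoan("B", 36, 7_000, 0.30, 0.120),  # excluded: default risk too high
    ScoredLoan("C", 60, 4_000, 0.08, 0.110),  # excluded: wrong term
    ScoredLoan("D", 36, 6_000, 0.10, 0.095),
    ScoredLoan("E", 36, 2_000, 0.04, 0.070),
]
portfolio = build_portfolio(candidates, budget=12_000, term_months=36, max_p_default=0.15)
```

A greedy fill is simple to reason about but is not guaranteed to maximize portfolio IRR under a budget; a knapsack-style solver would be one alternative.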

Business Impact

  • 7.40% IRR for 36-month loans (vs. 6.30% baseline)
  • 10.63% IRR for 60-month loans (vs. 8.11% baseline)
  • 1.51% and 0.99% alpha over baseline for 36-month and 60-month loans, respectively
  • Results statistically significant at the 1% significance level

Quick Start

Prerequisites

  • Docker & Docker Compose
  • Python 3.9+ (for local development)
  • Git

Installation & Running

Option 1: Docker Compose (Recommended)

# Clone the repository
git clone https://github.com/philippe-heitzmann/LendingClub_ML_App.git
cd LendingClub_ML_App

# Run the entire application
bash build_e2e.sh

Option 2: Manual Docker Build

# Build and run backend
docker build -t flask_backend:v1 -f ./app/backend/Dockerfile.backend .
docker run -d -p 5000:5000 --name flask_backend flask_backend:v1

# Build and run frontend
docker build -t dash_frontend:v1 -f ./app/frontend/Dockerfile.frontend .
docker run -d -p 8050:8050 --name dash_frontend dash_frontend:v1

Option 3: Local Development

# Backend
cd app/backend
pip install -r requirements_backend.txt
python flask_serve.py

# Frontend (in another terminal)
cd app/frontend
pip install -r requirements_frontend.txt
python app.py

Access the Application

Application Screenshots

1. Interactive Choropleth Map - Loan Default Rates by State

Choropleth Map showing loan default rates by state

2. FICO Score Analysis - Default Rates & Interest Rates

Line plots showing relationship between FICO scores, default rates, and interest rates

3. Real-time ML Predictions Interface

Interactive interface for real-time loan default predictions

🔧 API Documentation

Prediction Endpoint

POST /api/v1/predict

Predict loan default probability using trained ML models.

Request Body

{
    "query": [[feature1, feature2, ..., featureN]],
    "model": "GBC"
}

Response

{
    "prediction": "No Default",
    "confidence": [0.123, 0.877]
}

Available Models

  • QDA - Quadratic Discriminant Analysis
  • LDA - Linear Discriminant Analysis
  • LOGIT - Logistic Regression
  • GBC - Gradient Boosting Classifier

Example Usage

curl -X POST http://localhost:5000/api/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"query": [[50000, 700, 5, 10]], "model": "GBC"}'
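The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl command above. The helper names `build_predict_request` and `predict` are illustrative, and the four-value feature row is copied from the curl example without any claim about what each position means.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_predict_request(features: List[float], model: str = "GBC",
                          base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build the POST request for /api/v1/predict without sending it."""
    body = json.dumps({"query": [features], "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: List[float], model: str = "GBC",
            base_url: str = "http://localhost:5000",
            timeout: float = 10.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response (backend must be running)."""
    request = build_predict_request(features, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the backend is listening on port 5000:
#   result = predict([50000, 700, 5, 10], model="GBC")
#   print(result["prediction"], result["confidence"])
```

Splitting request construction from sending keeps the payload easy to inspect or unit-test without a live server.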

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend      β”‚    β”‚   Backend       β”‚    β”‚   Data Layer    β”‚
β”‚   (Dash/Flask)  │◄──►│   (Flask API)   │◄──►│   (Pickle Files)β”‚
β”‚   Port: 8050    β”‚    β”‚   Port: 5000    β”‚    β”‚                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

📈 Model Performance

| Model               | Non-2018 AUC | 2018 AUC | Performance  |
|---------------------|--------------|----------|--------------|
| CatBoost Classifier | 0.892        | 0.841    | 🥇 Best      |
| MLP Neural Net      | 0.884        | 0.816    | 🥈 Excellent |
| Gradient Boosting   | 0.831        | 0.766    | 🥉 Good      |
| Random Forest       | 0.769        | 0.697    | ✅ Good      |
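For readers unfamiliar with the metric, the sketch below computes ROC AUC directly from its pairwise definition and evaluates it separately on 2018 and non-2018 records, mirroring the two columns above. The records are made up, and treating 2018 as a held-out period is an assumption about how the table was produced. For real datasets a library routine such as scikit-learn's `roc_auc_score` is the practical choice.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5).

    O(n_pos * n_neg); written for clarity, not speed.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_period(records: List[Tuple[int, int, float]],
                  holdout_year: int = 2018) -> Dict[str, float]:
    """Split (issue_year, defaulted, score) records by year and score each split."""
    inside = [(y, s) for year, y, s in records if year == holdout_year]
    outside = [(y, s) for year, y, s in records if year != holdout_year]
    return {
        f"non_{holdout_year}": roc_auc([y for y, _ in outside], [s for _, s in outside]),
        str(holdout_year): roc_auc([y for y, _ in inside], [s for _, s in inside]),
    }


# Hypothetical (issue_year, defaulted, predicted default score) records.
records = [
    (2016, 0, 0.10), (2016, 1, 0.55), (2017, 1, 0.80), (2017, 0, 0.30),
    (2018, 0, 0.40), (2018, 1, 0.35), (2018, 1, 0.90), (2018, 0, 0.20),
]
aucs = auc_by_period(records)
```

Reporting the later period separately shows how much ranking quality degrades on loans issued after the bulk of the data.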

πŸ› οΈ Development

Project Structure

LendingClub_ML_App/
├── app/
│   ├── backend/           # Flask API server
│   │   ├── flask_serve.py
│   │   ├── requirements_backend.txt
│   │   └── Dockerfile.backend
│   ├── frontend/          # Dash web application
│   │   ├── app.py
│   │   ├── constants/
│   │   ├── requirements_frontend.txt
│   │   └── Dockerfile.frontend
│   └── data/              # ML models and datasets
├── notebooks/             # Jupyter notebooks for EDA
├── presentation/          # Project presentation materials
├── docker-compose.yml     # Multi-container orchestration
└── build_e2e.sh           # End-to-end build script

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Resources

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Philippe Heitzmann

  • Email: philheitz6[at]gmail[dot]com
  • LinkedIn: [Your LinkedIn Profile]
  • GitHub: @philippe-heitzmann

⭐ Star this repository if you found it helpful!
