UPI Fraud Detection System

Machine learning system for detecting fraudulent UPI transactions with 87% F1-Score using XGBoost and advanced feature engineering.

Performance

Model	Precision	Recall	F1-Score	Specificity
XGBoost	90.5%	83.8%	87.0%	99.98%
Random Forest	94.6%	79.5%	86.4%	99.99%
LightGBM	87.8%	79.9%	83.6%	99.97%
Gradient Boosting	80.6%	82.7%	81.6%	99.95%

Understanding the Models

XGBoost (Extreme Gradient Boosting) - Best Overall

What it does: XGBoost is like a team of detectives where each new detective learns from the mistakes of previous ones. It builds multiple "decision trees" (think of them as flowcharts) that work together to identify fraud.

How it works:

First tree makes predictions (some right, some wrong)
Second tree focuses on fixing the mistakes of the first
Third tree fixes the mistakes that remain after the first two, and so on
Final prediction combines all trees' opinions

Real-world example:

Transaction 1: Amount = ₹50,000, Time = 2 AM, Balance drops to ₹0
Tree 1 says: "Suspicious amount" → 60% fraud
Tree 2 adds: "Plus it's night time" → 75% fraud
Tree 3 adds: "And balance went to zero" → 92% fraud
Final decision: FRAUD
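
The way the staged percentages above build on each other can be sketched in a few lines. In boosted classifiers with a logistic objective, each tree adds a score in log-odds space and a sigmoid turns the running total into a probability. The per-tree contributions below are hypothetical: they are back-calculated from the 60% → 75% → 92% example, not read from the trained model.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Running fraud probabilities from the example above (illustrative values).
staged_probabilities = [0.60, 0.75, 0.92]

# Hypothetical per-tree contributions: the change in log-odds at each stage.
contributions = []
previous = 0.0  # log-odds of 0 corresponds to a 50% starting point
for p in staged_probabilities:
    current = logit(p)
    contributions.append(current - previous)
    previous = current

# The final prediction is the sigmoid of the summed tree outputs.
final_probability = sigmoid(sum(contributions))
print([round(c, 3) for c in contributions])
print(round(final_probability, 2))
```

Each contribution is positive here because every tree in the example adds evidence of fraud; a tree that found reassuring evidence would contribute a negative value.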

Performance:

90.5% Precision = Out of 100 flagged frauds, 90 are real
83.8% Recall = Catches 84 out of 100 actual frauds
Best for production use

Random Forest - Most Precise

What it does: Imagine asking 200 fraud experts to independently review a transaction. Each expert looks at different aspects (amount, time, location, history). The majority vote wins.

How it works:

Creates 200 independent decision trees
Each tree sees a random sample of data
Each tree votes: Fraud or Legitimate
Final decision = majority vote

Real-world example:

Transaction: ₹10,000 payment at 3 PM
180 trees say: "Legitimate" (normal amount, normal time)
20 trees say: "Fraud" (suspicious pattern)
Final decision: LEGITIMATE (180 > 20)
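
The vote in this example can be sketched as a simple count. This is a simplified illustration using the 180/20 split above; as far as we know, scikit-learn's RandomForestClassifier averages the trees' predicted probabilities rather than counting hard votes, so treat the hard vote as a mental model rather than the exact mechanism.

```python
from collections import Counter


def majority_vote(votes):
    """Return the winning label and the share of trees voting FRAUD."""
    counts = Counter(votes)
    winner, _ = counts.most_common(1)[0]
    fraud_share = counts["FRAUD"] / len(votes)
    return winner, fraud_share


# Votes from the example above: 180 trees say legitimate, 20 say fraud.
votes = ["LEGITIMATE"] * 180 + ["FRAUD"] * 20
decision, fraud_share = majority_vote(votes)
print(decision, fraud_share)
```

The fraud share (10% here) plays the role of a fraud probability, which is what makes it possible to apply a threshold other than 50%.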

Performance:

94.6% Precision = Only 5 false alarms per 100 flags
Best when you can't afford false alarms (blocking legitimate customers is costly)

Why it's precise:

Multiple independent opinions reduce errors
Doesn't overreact to single suspicious features
Conservative approach = fewer mistakes

LightGBM (Light Gradient Boosting Machine) - Fastest

What it does: LightGBM is like XGBoost's faster cousin. It uses smart shortcuts to analyze millions of transactions quickly without sacrificing much accuracy.

How it works:

Instead of checking every transaction, it groups similar ones
Builds trees "leaf-wise" (grows the most useful branches first)
Uses histogram-based learning (buckets data for speed)
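
The bucketing idea can be sketched as follows. This is a minimal illustration with equal-width bins and made-up amounts; LightGBM's own binning is more sophisticated, and the function name and data here are assumptions for the example only.

```python
def build_histogram(amounts, labels, n_bins, max_amount):
    """Bucket amounts into equal-width bins; count all rows and fraud rows per bin."""
    width = max_amount / n_bins
    totals = [0] * n_bins
    frauds = [0] * n_bins
    for amount, is_fraud in zip(amounts, labels):
        index = min(int(amount / width), n_bins - 1)  # clamp the top edge into the last bin
        totals[index] += 1
        frauds[index] += is_fraud
    return totals, frauds


# Made-up sample: transaction amounts in rupees and fraud labels (1 = fraud).
amounts = [500, 1200, 3000, 48000, 52000, 75000, 99000, 100000]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

totals, frauds = build_histogram(amounts, labels, n_bins=4, max_amount=100000)
print(totals)
print(frauds)
```

With four bins, a tree only has to evaluate three candidate split points (the bin boundaries) instead of one per distinct amount, which is where the speed-up comes from.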

Real-world example:

Dataset: 6 million transactions
XGBoost: Takes 10 minutes to train
LightGBM: Takes 3 minutes to train
Roughly 3x faster training, at a modest cost in accuracy on this project's test set (83.6% vs 87.0% F1-Score)

Performance:

87.8% Precision, 79.9% Recall
Best for real-time systems where speed matters
Can handle streaming data (thousands of transactions per second)

When to use:

Large datasets (millions of rows)
Real-time fraud detection
Limited computing resources

Gradient Boosting - Reliable Baseline

What it does: Builds a team of "weak learners" (simple models) that together become a "strong learner". Each new model focuses on the hardest cases the previous models got wrong.

How it works:

Start with simple predictions (average fraud rate)
Find transactions where prediction was wrong
Build new model to fix those specific errors
Repeat 100-200 times
Combine all models
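
The loop above can be sketched from scratch. This is a toy version under stated assumptions: one feature, one-split trees ("stumps"), squared error, and made-up data. The project's GradientBoostingClassifier uses deeper trees and a classification loss, so this shows the idea, not the implementation.

```python
def fit_stump(xs, residuals):
    """Find the one-threshold split that best fits the residuals (least squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def mean_squared_error(ys, predictions):
    """Average squared difference between labels and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


def boost(xs, ys, n_rounds=100, learning_rate=0.1):
    """Repeatedly fit a stump to the current errors and add a fraction of it."""
    base = sum(ys) / len(ys)  # start from the average fraud rate
    predictions = [base] * len(ys)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]  # where we are still wrong
        threshold, left_value, right_value = fit_stump(xs, residuals)
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return predictions


# Made-up data: amounts in thousands of rupees and fraud labels.
amounts_in_thousands = [1, 2, 5, 10, 40, 50, 80, 100]
is_fraud = [0, 0, 0, 0, 1, 0, 1, 1]

baseline = [sum(is_fraud) / len(is_fraud)] * len(is_fraud)
boosted = boost(amounts_in_thousands, is_fraud)
mse_before = mean_squared_error(is_fraud, baseline)
mse_after = mean_squared_error(is_fraud, boosted)
print(mse_before, mse_after)
```

The small learning rate means each stump only nudges the predictions, which is why many rounds are needed.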

Real-world example:

Model 1: Flags high amounts → Catches 50% of frauds
Model 2: Adds time patterns → Catches 65% of frauds
Model 3: Adds balance changes → Catches 75% of frauds
Model 100: Catches 82.7% of frauds

Performance:

80.6% Precision, 82.7% Recall
Second-highest recall (82.7%), just behind XGBoost (83.8%)
Stable and consistent across different datasets

Why we use it:

Benchmark to compare other models
Reliable fallback option
Good all-around performance

Logistic Regression - Simple & Interpretable

What it does: Uses a mathematical formula to calculate fraud probability. Assigns a "weight" to each feature based on how important it is for detecting fraud.

How it works:

Fraud Score = (Amount × 0.00003) + (Night × 0.5) + (Balance_Drop × 0.00008) - 2.0

If Score > 0 → Fraud (probability above 50%)
If Score < 0 → Legitimate (probability below 50%)

Real-world example: Transaction: ₹100,000 at 3 AM, balance drops from ₹100,000 to ₹0

Score = (100000 × 0.00003) + (1 × 0.5) + (100000 × 0.00008) - 2.0
      = 3.0 + 0.5 + 8.0 - 2.0
      = 9.5 (> 0)
→ FRAUD
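
The same calculation in code, including the step that turns the score into a probability. The weights are the illustrative ones from the worked example, not the coefficients of the trained model, and the function names are made up for this sketch.

```python
import math

# Illustrative weights from the worked example above (not trained coefficients).
WEIGHTS = {"amount": 0.00003, "night": 0.5, "balance_drop": 0.00008}
INTERCEPT = -2.0


def fraud_score(features):
    """Weighted sum of the features plus the intercept (the log-odds of fraud)."""
    return sum(WEIGHTS[name] * value for name, value in features.items()) + INTERCEPT


def fraud_probability(score):
    """Logistic (sigmoid) function: maps any score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


transaction = {"amount": 100000, "night": 1, "balance_drop": 100000}
score = fraud_score(transaction)
probability = fraud_probability(score)
print(round(score, 2), round(probability, 4))
```

A score of exactly 0 corresponds to a probability of 50%, which is why the sign of the score decides the label at the default threshold.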

Performance:

81.1% Precision, 49.8% Recall
Best for explaining decisions to regulators, auditors, customers

Why it's interpretable:

Can see the exact contribution of each feature to the score
"Your transaction was flagged because: balance drop (+8.0), high amount (+3.0), night time (+0.5)"
Easy to audit and explain
Meets regulatory requirements

When to use:

Need to explain why transaction was flagged
Regulatory compliance
Customer disputes
Building trust with stakeholders

Metrics Explained

Classification Metrics:

Precision: Of all transactions flagged as fraud, how many were actually fraud?
- Example: 90.5% = Out of 100 flagged transactions, about 90 are real fraud and about 10 are false alarms
- Why it matters: High precision means fewer legitimate customers get blocked
- Business impact: Reduces customer complaints and support costs
Recall: Of all actual fraud cases, how many did we catch?
- Example: 83.8% = We catch 84 out of 100 real fraud cases
- Why it matters: High recall means we don't miss many frauds
- Business impact: Prevents financial losses from undetected fraud
F1-Score: Harmonic mean of precision and recall (balanced measure)
- Example: 87% = Excellent balance between catching frauds and avoiding false alarms
- Why it matters: Single metric to compare models
- Business impact: Optimal trade-off between security and user experience
Specificity: Of all legitimate transactions, how many did we correctly identify?
- Example: 99.98% = Out of 10,000 legitimate transactions, we correctly identify 9,998
- Why it matters: Shows we rarely flag legitimate transactions as fraud
- Business impact: Maintains customer trust and satisfaction
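
All four metrics come from the same four counts in a confusion matrix. The sketch below uses made-up counts chosen for easy arithmetic; they are not this project's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the four metrics from confusion-matrix counts.

    tp: fraud flagged as fraud        fp: legitimate flagged as fraud
    fn: fraud missed                  tn: legitimate passed through
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts: 100 transactions flagged, 110 actual frauds, 9,890 legitimate.
metrics = classification_metrics(tp=90, fp=10, fn=20, tn=9880)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Note how specificity stays close to 100% even with 10 false alarms, because legitimate transactions vastly outnumber frauds. That is why precision and recall are the more informative numbers on imbalanced data.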

Real-World Impact:

High Precision (90.5%): Only 1 in 10 fraud alerts is a false alarm → Less manual review needed
High Recall (83.8%): We catch 5 out of 6 fraud attempts → Saves millions in fraud losses
High Specificity (99.98%): Only 2 in 10,000 legitimate users get blocked → Happy customers
High F1-Score (87%): Best overall balance for production use → Ready to deploy

Features

Web Interface - User-friendly fraud detection dashboard
High Accuracy - 87% F1-Score with 90.5% precision
Real-time Predictions - Instant fraud detection
Feature Engineering - Automated feature creation
Imbalanced Data Handling - Undersampling with optimal thresholds
Multiple Models - 5 ML algorithms trained and compared

Quick Start

1. Installation

git clone https://github.com/Dhuvie/UPI-ML-Fraud-Detection.git
cd UPI-ML-Fraud-Detection
pip install -r requirements.txt

# For Jupyter notebooks
pip install jupyter

2. Get Dataset

Download from Kaggle:

Place CSV file in: data/raw/upi_fraud_data.csv

3. Train Models

python main.py

This will:

Preprocess data and engineer features
Train 5 ML models
Evaluate and save results

4. Launch Web Interface

python app.py

Open browser: http://localhost:8080

Usage

Web Interface

The web app provides:

Transaction input form
Real-time fraud detection
Fraud probability (0-100%)
Risk level (LOW/MEDIUM/HIGH)
Actionable recommendations

Example Transaction:

Type: PAYMENT
Amount: ₹10,000
Sender: C1234567890
Sender Old Balance: ₹50,000
Sender New Balance: ₹40,000
Receiver: M9876543210
Receiver Old Balance: ₹0
Receiver New Balance: ₹10,000

Interactive Learning (Jupyter Notebooks)

Explore the project interactively:

jupyter notebook

Available Notebooks:

01_data_exploration.ipynb - Explore dataset and fraud patterns
02_feature_engineering.ipynb - Learn feature creation techniques
03_model_training.ipynb - Train and compare models

Python API

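from src.predict import FraudDetector

# Example input; field names are assumed to match the REST API request under "API Endpoint"
transaction = {
    "step": 1, "type": "PAYMENT", "amount": 9000.60,
    "nameOrig": "C1231006815", "oldbalanceOrg": 170136.0, "newbalanceOrig": 161136.0,
    "nameDest": "M1979787155", "oldbalanceDest": 0.0, "newbalanceDest": 0.0,
    "isFlaggedFraud": 0,
}

detector = FraudDetector('xgboost')
result = detector.predict(transaction)

print(f"Fraud: {result['is_fraud']}")
print(f"Probability: {result['fraud_probability']:.2%}")
print(f"Risk: {result['risk_level']}")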

Individual Components

# Preprocess data
python src/preprocess.py

# Train models
python src/train.py

# Evaluate models
python src/evaluate.py

# Explain predictions
python src/explain.py

# Cross-validation
python src/validate.py

Project Structure

├── app.py                    # Flask web application
├── main.py                   # Complete ML pipeline
├── requirements.txt          # Dependencies
├── data/
│   ├── raw/                 # Original dataset
│   └── processed/           # Preprocessed data
├── models/                   # Trained models
├── results/                  # Metrics and plots
├── notebooks/                # Jupyter notebooks for learning
│   ├── 01_data_exploration.ipynb
│   ├── 02_feature_engineering.ipynb
│   └── 03_model_training.ipynb
├── src/
│   ├── preprocess.py        # Data preprocessing + feature engineering
│   ├── train.py             # Model training
│   ├── evaluate.py          # Model evaluation
│   ├── predict.py           # Real-time predictions
│   ├── explain.py           # Model explainability
│   └── validate.py          # Cross-validation
└── templates/
    └── index.html           # Web interface

Technical Details

Feature Engineering

Automatically creates:

Time features: Hour of day, night indicator
Amount features: Log transformation, round amount detection
Balance features: Balance changes for sender/receiver
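
A rough sketch of what these features could look like for a single transaction. The exact rules live in src/preprocess.py; the ones below (one step equals one hour, night means before 6 AM, "round" means a multiple of ₹1,000) and the function name are assumptions made for illustration.

```python
import math


def sketch_features(tx):
    """Derive time, amount, and balance features from one raw transaction dict."""
    hour = tx["step"] % 24  # assumption: one step is one hour
    return {
        "hour": hour,
        "is_night": int(hour < 6),  # assumption: night is 00:00 to 05:59
        "log_amount": math.log1p(tx["amount"]),  # log(1 + amount) tames large values
        "is_round_amount": int(tx["amount"] % 1000 == 0),  # assumption: multiples of 1,000
        "sender_balance_change": tx["newbalanceOrig"] - tx["oldbalanceOrg"],
        "receiver_balance_change": tx["newbalanceDest"] - tx["oldbalanceDest"],
    }


# The example transaction from the Web Interface section, with a made-up step.
transaction = {
    "step": 26,
    "amount": 10000,
    "oldbalanceOrg": 50000,
    "newbalanceOrig": 40000,
    "oldbalanceDest": 0,
    "newbalanceDest": 10000,
}
features = sketch_features(transaction)
print(features)
```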

Data Balancing

Training set: Undersampled to 3:1 ratio (25% fraud)
Test set: Kept imbalanced (0.13% fraud) for realistic evaluation
Why: Models learn fraud patterns better with balanced training data
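
The 3:1 undersampling can be sketched with the standard library alone. This is a simplified stand-in for what the pipeline does (the project lists imbalanced-learn as a dependency); the function name and data are hypothetical.

```python
import random


def undersample(rows, legit_per_fraud=3, seed=42):
    """Keep every fraud row plus a random sample of legitimate rows."""
    fraud = [r for r in rows if r["isFraud"] == 1]
    legit = [r for r in rows if r["isFraud"] == 0]
    n_keep = min(len(legit), legit_per_fraud * len(fraud))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = fraud + rng.sample(legit, n_keep)
    rng.shuffle(sampled)
    return sampled


# Made-up data: 10 fraud rows among 1,000 transactions.
rows = [{"id": i, "isFraud": int(i < 10)} for i in range(1000)]
train = undersample(rows)
fraud_share = sum(r["isFraud"] for r in train) / len(train)
print(len(train), fraud_share)
```

Only the training split should be resampled this way; applying it to the test set would make the reported metrics look better than they would be on real traffic.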

Threshold Optimization

Default 0.5 threshold → Optimized threshold (0.94-0.99)
Maximizes F1-Score for each model
Impact: Precision improved from 15% to 90.5%
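
A minimal sketch of the threshold search, assuming it works by scanning candidate thresholds and keeping the one with the highest F1-Score. The scores and labels are made up, and the helper names are not from the project.

```python
def f1_at_threshold(probabilities, labels, threshold):
    """F1-Score when every probability at or above the threshold is flagged as fraud."""
    pairs = list(zip(probabilities, labels))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    fn = sum(1 for p, y in pairs if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(probabilities, labels, candidates):
    """Return the candidate threshold with the highest F1-Score (first one on ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(probabilities, labels, t))


# Made-up model outputs: many legitimate transactions still score above 0.5.
probabilities = [0.10, 0.55, 0.60, 0.70, 0.80, 0.93, 0.95, 0.97, 0.98, 0.99]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]

candidates = [i / 100 for i in range(50, 100)]
threshold = best_threshold(probabilities, labels, candidates)
f1_default = f1_at_threshold(probabilities, labels, 0.5)
f1_tuned = f1_at_threshold(probabilities, labels, threshold)
print(threshold, f1_default, f1_tuned)
```

Training on undersampled data inflates the predicted probabilities, which is one plausible reason the best thresholds end up far above 0.5. The threshold is best chosen on a validation split so the test metrics stay unbiased.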

Model Training

Models trained with:

Class weights for imbalance handling
Regularization (subsample, colsample)
Tuned hyperparameters
5-fold cross-validation
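
Two of these ideas are easy to show in isolation: deriving a weight for the fraud class and splitting data into five folds that each keep the same fraud share. This is a plain-Python sketch with hypothetical helper names; in practice scikit-learn's StratifiedKFold covers the splitting, with shuffling.

```python
def positive_class_weight(labels):
    """Ratio of legitimate to fraud rows, a common choice of weight for the fraud class."""
    n_fraud = sum(labels)
    return (len(labels) - n_fraud) / n_fraud


def stratified_folds(labels, n_folds=5):
    """Assign row indices to folds so every fold gets a similar fraud share."""
    folds = [[] for _ in range(n_folds)]
    for label in (0, 1):
        indices = [i for i, y in enumerate(labels) if y == label]
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)  # deal rows out like cards
    return folds


# Made-up training labels at the 3:1 ratio described under Data Balancing.
labels = [1] * 25 + [0] * 75
weight = positive_class_weight(labels)
folds = stratified_folds(labels)
print(weight)
print([len(fold) for fold in folds])
```

Each fold serves once as the validation set while the other four are used for training, giving five estimates of each metric instead of one.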

Results

After running python main.py, check:

results/metrics.csv - Performance comparison
results/confusion_matrices.png - Visual confusion matrices
results/metrics_comparison.png - Metrics bar charts
results/feature_importance_xgboost.png - Feature importance

API Endpoint

REST API for programmatic access:

curl -X POST http://localhost:8080/api/predict \
  -H "Content-Type: application/json" \
  -d '{
    "step": 1,
    "type": "PAYMENT",
    "amount": 9000.60,
    "nameOrig": "C1231006815",
    "oldbalanceOrg": 170136.0,
    "newbalanceOrig": 161136.0,
    "nameDest": "M1979787155",
    "oldbalanceDest": 0.0,
    "newbalanceDest": 0.0,
    "isFlaggedFraud": 0
  }'

Response:

{
  "is_fraud": false,
  "fraud_probability": 0.023,
  "risk_level": "LOW"
}
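
The same call from Python, using only the standard library. This is a sketch that assumes the endpoint and fields shown in the curl example; the helper names are made up, and the request is only sent if you call predict() while app.py is running.

```python
import json
import urllib.request

API_URL = "http://localhost:8080/api/predict"  # endpoint from the curl example above


def build_predict_request(transaction, url=API_URL):
    """Build (but do not send) a JSON POST request for one transaction."""
    body = json.dumps(transaction).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(transaction, url=API_URL, timeout=5):
    """Send the request and decode the JSON response (requires the app to be running)."""
    request = build_predict_request(transaction, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


transaction = {
    "step": 1,
    "type": "PAYMENT",
    "amount": 9000.60,
    "nameOrig": "C1231006815",
    "oldbalanceOrg": 170136.0,
    "newbalanceOrig": 161136.0,
    "nameDest": "M1979787155",
    "oldbalanceDest": 0.0,
    "newbalanceDest": 0.0,
    "isFlaggedFraud": 0,
}
request = build_predict_request(transaction)
print(request.get_method(), request.full_url)
```

With the server running, predict(transaction) should return a dictionary with the is_fraud, fraud_probability, and risk_level keys shown in the response above.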

Requirements

Python 3.7+
numpy
pandas
scikit-learn
xgboost
lightgbm
matplotlib
seaborn
imbalanced-learn
joblib
flask

Security

The application includes:

Input validation (negative amounts rejected)
Transaction type whitelisting
Error handling
Request size limits (16MB)
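
The first two checks could look roughly like the sketch below. The whitelist contents and the function name are assumptions; only PAYMENT appears in this README, so check app.py for the types it actually accepts.

```python
# Assumed whitelist for illustration; the real list is defined in app.py.
ALLOWED_TYPES = {"PAYMENT", "TRANSFER", "CASH_OUT", "CASH_IN", "DEBIT"}


def validate_transaction(tx):
    """Return a list of validation errors; an empty list means the input is acceptable."""
    errors = []
    if tx.get("type") not in ALLOWED_TYPES:
        errors.append("type must be one of: " + ", ".join(sorted(ALLOWED_TYPES)))
    amount = tx.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        errors.append("amount must be a number")
    elif amount < 0:
        errors.append("amount must not be negative")
    return errors


print(validate_transaction({"type": "PAYMENT", "amount": 10000}))
print(validate_transaction({"type": "REFUND", "amount": -5}))
```

Returning every error at once, rather than stopping at the first, lets the web form show the user everything that needs fixing in a single round trip.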

Troubleshooting

Models not found:

Error: Models not found. Run 'python main.py' first.

Solution: Train models with python main.py

Port already in use:

Address already in use

Solution: Change port in app.py line 199 to port=3000 or port=9000

Import errors:

ModuleNotFoundError: No module named 'flask'

Solution: pip install -r requirements.txt

Contributing

Contributions are welcome! Here's how you can help:

How to Contribute

Fork the repository on GitHub, then clone your fork

git clone https://github.com/<your-username>/UPI-ML-Fraud-Detection.git
cd UPI-ML-Fraud-Detection

Create a feature branch

git checkout -b feature/your-feature-name

Make your changes
- Follow existing code style
- Add comments and docstrings
- Test your changes

Commit your changes

git add .
git commit -m "Add: your feature description"

Push and create Pull Request

git push origin feature/your-feature-name

Adding New Features

1. Adding a New ML Model

Create a new model in src/train.py:

from sklearn.ensemble import AdaBoostClassifier

# Add to the existing models dictionary (assigning a new dict would discard the other models)
models['adaboost'] = AdaBoostClassifier(
    n_estimators=100,
    learning_rate=0.1,
    random_state=42
)

2. Adding New Features

Add feature engineering in src/preprocess.py:

def engineer_features(df):
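    # Steps since the sender's previous transaction (NaN for a sender's first transaction)
    df['transaction_velocity'] = df.groupby('nameOrig')['step'].diff()
    # Amount relative to the sender's starting balance (+1 avoids division by zero)
    df['amount_to_balance_ratio'] = df['amount'] / (df['oldbalanceOrg'] + 1)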
    return df

3. Adding API Endpoints

Add new routes in app.py:

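from flask import jsonify, request

@app.route('/api/batch_predict', methods=['POST'])
def batch_predict():
    """Predict multiple transactions sent as a JSON list"""
    transactions = request.get_json()
    if not isinstance(transactions, list):
        return jsonify({'error': 'Expected a JSON list of transactions'}), 400
    # predict_single is a placeholder for your single-transaction prediction helper
    results = [predict_single(t) for t in transactions]
    return jsonify(results)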

4. Adding Visualizations

Create new plots in src/explain.py:

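import matplotlib.pyplot as plt

def plot_fraud_timeline(df):
    """Plot fraud cases over time"""
    fraud_by_time = df[df['isFraud'] == 1].groupby('step').size()
    plt.figure()  # start a fresh figure so earlier plots are not drawn over
    plt.plot(fraud_by_time.index, fraud_by_time.values)
    plt.title('Fraud Cases Over Time')
    plt.xlabel('Step')
    plt.ylabel('Fraud cases')
    plt.savefig('results/fraud_timeline.png')
    plt.close()  # free the figure after saving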

Code Guidelines

Style: Follow PEP 8
Docstrings: Add to all functions
Comments: Explain complex logic
Testing: Test before submitting
Commits: Use clear, descriptive messages

Commit Message Format

Type: Brief description

- Detailed point 1
- Detailed point 2

Types:

Add: New feature
Fix: Bug fix
Update: Modify existing feature
Remove: Delete code/feature
Docs: Documentation changes

Examples

Good commits:

Add: Transaction velocity feature

- Calculate time between transactions
- Add to feature engineering pipeline
- Improves F1-Score by 2%

Bad commits:

update stuff
fixed bug
changes

What to Contribute

High Priority:

Additional ML models (Neural Networks, SVM)
Real-time streaming predictions
Model deployment scripts
API authentication
Rate limiting
Batch prediction endpoint

Medium Priority:

Additional visualizations
Performance optimizations
More feature engineering
Cross-validation improvements
Hyperparameter tuning

Documentation:

Tutorial notebooks
API documentation
Deployment guides
Use case examples

Testing Your Changes

Before submitting:

# Test ML pipeline
python main.py

# Test web app
python app.py

# Test notebooks
jupyter notebook

Questions?

Open an issue for:

Bug reports
Feature requests
Questions
Discussions

License

MIT License - see LICENSE file

Author

Dhuvie - GitHub

Acknowledgments

Kaggle for datasets
scikit-learn, XGBoost, LightGBM communities
Flask framework
