Anti-Money Laundering Detection System

Production-Ready AML Detection for Banking Compliance

Project Overview

A machine learning-powered Anti-Money Laundering (AML) detection system with an interactive Streamlit dashboard. Built using IBM's Synthetic AML Dataset (31.9M transactions), this system achieves 73% precision and ROC-AUC 97.8%, making it production-ready for banking compliance teams.

Live Demo

Visit Live Dashboard

Key Features

Real-time Transaction Scoring: Instant risk assessment for individual transactions
Calibrated Risk Scores: Reliable probability estimates (when model says 10% risk, it means 10%)
Automatic Feature Engineering: Extracts 20+ risk indicators from raw transaction data
Production-Optimized: 27x reduction in false positives vs baseline models
Interactive Dashboard: User-friendly interface for compliance analysts

Model Performance

Precision: 72.75% (73% of alerts are true laundering cases)
Recall: 18.87% (catches 1 in 5 laundering cases)
ROC-AUC: 97.81% (excellent discrimination)
False Positive Rate: 0.01% (only 752 false alarms out of 6.4M normal transactions)
Daily Alert Volume: ~860 alerts (manageable workload)

Project Structure

├── app/                    # Streamlit dashboard
│   ├── Home.py            # Main page with KPIs and model overview
│   ├── pages/
│   │   ├── 01_Model_Validation.py      # Performance metrics & confusion matrix
│   │   ├── 02_Investigator_Workbench.py # Real-time transaction scoring
│   │   └── 04_Data_Insight.py          # EDA visualizations
├── models/                 # Trained model artifacts
│   ├── calibrated_lightgbm_model.pkl
│   ├── scaler.pkl
│   ├── feature_names.json
│   └── model_config.json
├── note/              # Jupyter notebooks for model development
├── data/                   # find it on IBM website
└── requirements.txt        # Python dependencies

Tech Stack

ML Framework: LightGBM with Isotonic Calibration
Dashboard: Streamlit
Data Processing: Pandas, NumPy
Visualization: Plotly
Deployment: Aws is cooming soon

Dataset

Source: IBM Synthetic AML Dataset
Size: 31,898,238 transactions
Period: September 1-28, 2022
Laundering Cases: 35,230 (0.11%)
Class Imbalance: 1:905 ratio

Team Members

Catholic University of America - MDA Capstone Project Fall 2025

Delphin Kaduli
Tycho Janssen
Solomon Pinto

Quick Start

Prerequisites

Python 3.8+
pip

Installation

# Clone the repository
git clone https://github.com/DelphinKdl/CUA-MDA-Capstone-BaaS-Risk-Monitoring.git
cd CUA-MDA-Capstone-BaaS-Risk-Monitoring

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.\.venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the dashboard
streamlit run app/Home.py

The dashboard will open in your browser at http://localhost:8501

Dashboard Pages

1. Home

Key performance indicators (KPIs)
Champion model overview
Business impact metrics

2. Model Validation

Confusion matrix visualization
Performance metrics
Model configuration details

3. Investigator Workbench

Star Feature: Real-time transaction risk scoring
User-friendly input form
Automatic feature engineering
Instant risk assessment with recommendations

4. Data Insights

Dataset overview and statistics
Class distribution analysis
Payment format patterns
Temporal patterns (day of week)
Structuring detection analysis

Key Risk Indicators Detected

ACH Payment Format: 49x baseline risk
Weekend Transactions: 3x baseline risk
Structuring Pattern: $9K-$10K range (8.3x risk)
High-Risk Institutions: Bank-level risk patterns
Currency Risk Patterns: UK Pound structuring
Temporal Patterns: Day of week and hour analysis

Model Details

Algorithm: LightGBM with Probability Calibration
Training Data: 25.5M transactions (80%)
Test Data: 6.4M transactions (20%)
Features: 20 engineered features
Threshold: 0.10 (optimized for precision)
Training Strategy: Full dataset (no undersampling)

Business Impact

62% reduction in analyst workload vs alternative models
27x reduction in false positives
73% confidence in every alert
Manageable alert volume: ~860 alerts per day
Precision-focused strategy: Maximizes analyst efficiency

Acknowledgments

IBM for providing the Synthetic AML Dataset
Catholic University of America MDA Program
Project advisors: Dr. Matthew Jacob

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights

Name		Name	Last commit message	Last commit date
Latest commit History 80 Commits
app		app
models		models
note		note
notebooks		notebooks
reports		reports
scripts		scripts
.DS_Store		.DS_Store
.gitattributes		.gitattributes
.gitignore		.gitignore
04_modeling copy.ipynb		04_modeling copy.ipynb
LICENSE		LICENSE
README.md		README.md
constraints-aws.txt		constraints-aws.txt
flow.tldr		flow.tldr
requirements.txt		requirements.txt
setup.md		setup.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Anti-Money Laundering Detection System

Project Overview

Live Demo

Key Features

Model Performance

Project Structure

Tech Stack

Dataset

Team Members

Quick Start

Prerequisites

Installation

Dashboard Pages

1. Home

2. Model Validation

3. Investigator Workbench

4. Data Insights

Key Risk Indicators Detected

Model Details

Business Impact

Acknowledgments

License

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

DelphinKdl/CUA-MDA-Capstone-BaaS-Risk-Monitoring

Folders and files

Latest commit

History

Repository files navigation

Anti-Money Laundering Detection System

Project Overview

Live Demo

Key Features

Model Performance

Project Structure

Tech Stack

Dataset

Team Members

Quick Start

Prerequisites

Installation

Dashboard Pages

1. Home

2. Model Validation

3. Investigator Workbench

4. Data Insights

Key Risk Indicators Detected

Model Details

Business Impact

Acknowledgments

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages