Advanced fraud detection using deep learning: Graph Neural Networks, Transformers, and adversarial training on the IEEE-CIS dataset.
- Graph Neural Network (GNN): Heterogeneous graph with attention for transaction networks
- Temporal Transformer: Multi-head self-attention for sequential patterns
- Hybrid Model: Combines GNN (structural) + Transformer (temporal)
- Ensemble: XGBoost + Isolation Forest
- Adversarial Training: Q-learning agent that adapts to evade detection
- AUPRC: Area under precision-recall curve (key metric for imbalanced data)
- Precision@K / Recall@K: Operational metrics for top-K predictions
- Cost-Sensitive: Penalizes false negatives 10x more than false positives
- Focal Loss: Addresses class imbalance in training
- IEEE-CIS Fraud Detection (590K transactions)
- Real-world fraud patterns
- Temporal validation (no data leakage)
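As a rough illustration of the metrics and loss listed above, the following sketch computes Precision@K / Recall@K, a cost that weights false negatives 10x, and a binary focal loss in plain Python. The function names, the `gamma` and `alpha` defaults, and the toy data are assumptions made for illustration; they are not taken from `advanced_metrics.py`.

```python
import math


def precision_recall_at_k(y_true, scores, k):
    """Precision and recall among the k highest-scoring transactions."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of transactions")
    # Rank transactions from most to least suspicious.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])
    total_fraud = sum(y_true)
    precision = hits / k
    recall = hits / total_fraud if total_fraud else 0.0
    return precision, recall


def misclassification_cost(y_true, y_pred, fn_cost=10.0, fp_cost=1.0):
    """Total cost where a missed fraud (FN) costs 10x a false alarm (FP)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn_cost * fn + fp_cost * fp


def focal_loss(y_true, probs, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
        p_t = p if t == 1 else 1.0 - p           # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


# Toy data (assumed): 1 = fraud, 0 = legitimate.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.05, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

p_at_3, r_at_3 = precision_recall_at_k(y_true, scores, k=3)
cost = misclassification_cost(y_true, y_pred)
loss = focal_loss(y_true, scores)
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the ordinary binary cross-entropy; raising `gamma` shrinks the contribution of examples the model already classifies confidently.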
pip install -r requirements.txt
# Install Kaggle CLI
pip install kaggle
# Download IEEE-CIS dataset
kaggle competitions download -c ieee-fraud-detection -p data/raw/
unzip data/raw/ieee-fraud-detection.zip -d data/raw/
# Train base XGBoost model
python train.py
# Train ensemble model
python train_ensemble.py
# Train adversarial agent
python train_adversarial.py
# Train advanced deep learning models
python train_advanced.py --model gnn # Graph Neural Network
python train_advanced.py --model transformer # Temporal Transformer
python train_advanced.py --model hybrid # GNN + Transformer
python evaluate.py
Transaction Data
├─→ GNN (Graph Structure) ────────┐
├─→ Transformer (Temporal) ────────┼─→ Hybrid Fusion ─→ Prediction
├─→ XGBoost (Supervised) ──────────┤
└─→ Isolation Forest (Anomaly) ────┘
↓
Adversarial Agent (Q-Learning)
↓
Adaptive Retraining
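The adversarial stage in the diagram can be pictured with a tabular Q-learning update. This is only a minimal sketch of the standard update rule: the state names, the action list, the reward of +1 for evading the detector, and the `alpha`/`gamma` values are all hypothetical and do not describe the agent in `src/adversarial/`.

```python
import random
from collections import defaultdict

# Hypothetical evasion actions available to the adversarial agent.
ACTIONS = ["lower_amount", "split_transaction", "change_device"]


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy policy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


q_table = defaultdict(float)   # unseen (state, action) pairs start at 0.0
rng = random.Random(0)

# One illustrative step: the agent split a transaction and was not flagged.
new_value = q_update(q_table, "flagged", "split_transaction", 1.0, "undetected")
greedy = choose_action(q_table, "flagged", epsilon=0.0, rng=rng)
```

Transactions generated by the agent's learned policy could then be labeled as fraud and fed back into training, which is one plausible reading of the "Adaptive Retraining" step.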
- Graph Neural Networks: Models transaction network with users, merchants, devices
- Temporal Transformers: Captures long-range sequential dependencies with attention
- Hybrid Architecture: Combines structural (GNN) and temporal (Transformer) patterns
- Advanced Metrics: AUPRC, Precision@K for imbalanced data
- Adversarial Training: Q-learning agent learns to evade detection
- No Data Leakage: Temporal validation and stateful feature engineering
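To make the "no data leakage" point concrete, here is a small sketch of a time-based split together with a stateful feature that only looks at past transactions. `TransactionDT`, `card1`, and `isFraud` are columns of the IEEE-CIS data; the helper names, the `prior_txn_count` feature, and the cutoff value are assumptions for illustration and may differ from what `validation.py` and `src/features/` actually do.

```python
def temporal_split(rows, cutoff, time_key="TransactionDT"):
    """Train on everything before the cutoff, test on everything from it on."""
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test


def add_prior_txn_count(rows, entity_key="card1", time_key="TransactionDT"):
    """Stateful feature: how many earlier transactions this card has made.

    Rows are processed in time order and the counter is read before it is
    updated, so a row never sees itself or any future transaction.
    """
    seen = {}
    out = []
    for row in sorted(rows, key=lambda r: r[time_key]):
        count = seen.get(row[entity_key], 0)
        out.append({**row, "prior_txn_count": count})
        seen[row[entity_key]] = count + 1
    return out


# Toy rows (assumed values), deliberately out of time order.
rows = [
    {"TransactionDT": 300, "card1": "A", "isFraud": 0},
    {"TransactionDT": 100, "card1": "A", "isFraud": 0},
    {"TransactionDT": 200, "card1": "B", "isFraud": 1},
    {"TransactionDT": 400, "card1": "A", "isFraud": 1},
    {"TransactionDT": 500, "card1": "B", "isFraud": 0},
]

featured = add_prior_txn_count(rows)
train, test = temporal_split(featured, cutoff=350)
```

A random shuffle split would place later transactions in the training set and let the model learn from the future; splitting on time keeps every test row strictly after every training row.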
.
├── src/
│ ├── data/ # IEEE-CIS data loading
│ ├── features/ # Stateful feature engineering
│ ├── models/ # All model implementations
│ │ ├── ensemble.py # XGBoost + Isolation Forest
│ │ ├── gnn_fraud_detector.py # Graph Neural Network
│ │ ├── transformer_fraud_detector.py # Temporal Transformer
│ │ ├── hybrid_detector.py # GNN + Transformer fusion
│ │ ├── advanced_metrics.py # AUPRC, Precision@K, etc.
│ │ └── validation.py # Temporal train/test split
│ ├── adversarial/ # Q-learning agent
│ └── monitoring/ # Drift detection
├── tests/
│ ├── unit/ # Unit tests (all models)
│ └── integration/ # End-to-end tests
├── train.py # XGBoost training
├── train_ensemble.py # Ensemble training
├── train_adversarial.py # Adversarial training
├── train_advanced.py # GNN/Transformer/Hybrid training
└── evaluate.py # Unified evaluation (auto-detects advanced metrics)
Metrics on the IEEE-CIS test set (temporal split):
| Metric | Value |
|---|---|
| ROC-AUC | Run evaluate.py |
| Precision | Run evaluate.py |
| Recall | Run evaluate.py |
| F1-Score | Run evaluate.py |
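For readers who want to see what the four metrics in the table measure, the sketch below computes them from scratch on toy data. It only illustrates the definitions; how `evaluate.py` computes them is not shown here, and the 0.5 threshold and the sample labels and scores are assumptions.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random fraud outranks a random
    legitimate transaction (ties count as half).

    This pairwise form is quadratic in the number of rows, so it is meant
    for illustration only, not for the full 590K-transaction dataset.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = fraud, 0 = legitimate.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.05, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Precision, recall, and F1 depend on the chosen threshold, while ROC-AUC depends only on how the scores rank the transactions.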
pytest tests/
black src/ tests/
flake8 src/ tests/
License: MIT