This project develops a machine learning model to detect fraudulent credit card transactions using the Kaggle Credit Card Fraud Detection dataset. It focuses on handling class imbalance, feature engineering, and optimizing model performance.
- Source: Kaggle - Credit Card Fraud Detection
- Size: 284,807 transactions
- Fraud Cases: ~0.17% (highly imbalanced)
- Removed missing/duplicate values
- Scaled transaction amounts
- Created new features (e.g., transaction frequency, spending patterns)
- Used SMOTE (Synthetic Minority Oversampling Technique) to balance fraud cases
- Models tested: Logistic Regression, Random Forest, XGBoost
- Used AUC-ROC, Precision-Recall, and Confusion Matrix for evaluation
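
The SMOTE step listed above can be illustrated with a minimal pure-Python sketch. This is not the project's code: the function name `smote_sketch`, its parameters, and the toy rows are hypothetical, chosen only to show the core idea of interpolating between a minority sample and one of its nearest minority neighbours. In practice a library implementation such as `SMOTE` from `imblearn.over_sampling` would more likely be used.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea).

    Hypothetical helper for illustration; assumes at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (made up for illustration): four fraud rows with two features each.
fraud_rows = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]]
new_rows = smote_sketch(fraud_rows, n_new=6)
print(len(new_rows), "synthetic fraud rows generated")
```

Because each synthetic row lies on a line segment between two real fraud rows, it stays inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class ratio.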
- Random Forest AUC-ROC: ~0.95
- XGBoost AUC-ROC: ~0.98
- Feature engineering significantly reduced the False Positive Rate (FPR)
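
To make the evaluation metrics above concrete, here is a small pure-Python sketch of how AUC-ROC, precision, recall, and FPR can be computed from labels and model scores. The helper names, the 0.5 threshold, and the toy labels and scores are assumptions for illustration and are not taken from the project's results; a real run would more likely rely on a metrics library such as scikit-learn.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """AUC-ROC as the probability that a random fraud case scores higher
    than a random legitimate one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores (made up): six legitimate and two fraudulent rows.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.05, 0.9, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
fpr = fp / (fp + tn)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"fpr={fpr:.3f} auc={auc:.3f}")
```

Accuracy is left out on purpose: with roughly 0.17% fraud, a model that labels every transaction as legitimate would still be about 99.83% accurate, which is why the metrics above are more informative on this dataset.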
git clone https://github.com/yourusername/Credit-Card-Fraud-Detection.git
cd Credit-Card-Fraud-Detection
pip install -r requirements.txt