This project implements a machine learning pipeline to detect fraudulent credit card transactions. Because fraud datasets are highly imbalanced, the project explores several techniques, including the Synthetic Minority Over-sampling Technique (SMOTE) and deep learning, to maximize the detection of fraudulent activity while minimizing false alarms.
- Data Preprocessing: Implements both `StandardScaler` for feature standardization and `MinMaxScaler` for normalization.
- Imbalance Handling: Uses `SMOTE` with a sampling strategy of 0.4 to synthetically balance the dataset.
- Deep Learning Model: A `Sequential` neural network built with TensorFlow/Keras featuring BatchNormalization, Dropout, and EarlyStopping.
- Multi-Model Comparison: Explores Support Vector Machines (SVM), Logistic Regression, and Anomaly Detection (Isolation Forest).
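
To make the 0.4 sampling strategy concrete, here is a minimal pure-Python sketch of the idea behind SMOTE. It is not the code used in the notebook or the imbalanced-learn implementation: the function name `smote_sketch`, its parameters, and the toy data are hypothetical. It assumes the float strategy means the target ratio of minority to majority samples after resampling, and it creates each synthetic point by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_majority, sampling_strategy=0.4, k=5, seed=0):
    """Return synthetic minority points (hypothetical helper, for illustration).

    After adding them, len(minority) / n_majority is about sampling_strategy.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    # Desired minority count after resampling, e.g. 0.4 * 100 -> 40.
    target = int(sampling_strategy * n_majority)
    n_new = max(0, target - len(minority))
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data: 4 minority samples against an assumed 100 majority samples.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_majority=100, sampling_strategy=0.4, k=3)
print(len(minority) + len(new_points))  # 40 minority samples per 100 majority
```

Note that a ratio of 0.4 leaves the classes unequal: the minority class grows to 40% of the majority class size instead of matching it.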
Install the dependencies with `pip install -r requirements.txt`.

The project uses the Credit Card Fraud Detection dataset from Kaggle. The notebook downloads it automatically using `kagglehub`.
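
For reference, a download step with `kagglehub` might look roughly like the sketch below. The dataset handle `mlg-ulb/creditcardfraud` is an assumption (this README does not name it), and the helper names `download_dataset` and `find_csv` are hypothetical; the notebook's first cell may do this differently. Downloading may also require Kaggle credentials to be configured.

```python
from pathlib import Path


def download_dataset(handle="mlg-ulb/creditcardfraud"):
    """Download a Kaggle dataset and return its local directory.

    The default handle is an assumption, not taken from this project.
    """
    # Imported lazily so find_csv() below works without kagglehub installed.
    import kagglehub

    return Path(kagglehub.dataset_download(handle))


def find_csv(root):
    """Return the first CSV file under root (sorted by path), or None."""
    return next(iter(sorted(Path(root).rglob("*.csv"))), None)
```

A typical use would be `csv_path = find_csv(download_dataset())`, after which the file can be loaded with whatever DataFrame library the notebook uses.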
Open the Jupyter Notebook and run all cells. The first cell will handle the data download.