This project investigates whether pretrained convolutional neural networks can distinguish real images from AI-generated images. The system is built using PyTorch and a ResNet-50 backbone, with a strong emphasis on methodological correctness, interpretability, and real-world engineering constraints rather than brute-force training.
The project covers the full workflow from exploratory analysis and model training to deployment via a lightweight inference application.
With the rapid improvement of generative models, AI-generated images are increasingly difficult to distinguish from real photographs. This project aims to classify images as Real or AI-generated by leveraging subtle visual and structural artifacts learned by deep convolutional neural networks.
The overall approach follows a hypothesis-driven and evaluation-focused pipeline:
- Exploratory data analysis using RGB visualization, grayscale conversion, and Canny edge detection
- Hypothesis formation around edge consistency and texture patterns in AI-generated images
- Transfer learning using a pretrained ResNet-50 backbone
- Binary classification with BCEWithLogitsLoss
- Evaluation using ROC-AUC as the primary metric, along with precision, recall, F1-score, and confusion matrix
- Lightweight deployment using Streamlit for interactive inference
- Architecture: ResNet-50 (ImageNet pretrained)
- Training Strategy: Frozen backbone, trained classification head
- Input Resolution: 128 × 128
- Output: Probability of an image being AI-generated
This setup allows efficient learning while reducing overfitting and computational cost.
- ROC-AUC: 0.7828
- Accuracy: 0.72
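The reported metrics can be reproduced from predicted probabilities with scikit-learn. The `evaluate` helper below is an illustrative sketch (the 0.5 decision threshold is an assumption), not the notebook's code:

```python
import numpy as np
from sklearn.metrics import (
    roc_auc_score,
    accuracy_score,
    precision_recall_fscore_support,
    confusion_matrix,
)

def evaluate(y_true, y_prob, threshold=0.5):
    # Threshold the sigmoid probabilities into hard labels
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary", zero_division=0
    )
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),  # uses raw probabilities
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": prec,
        "recall": rec,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```

Note that ROC-AUC is computed on the probabilities themselves, so it is independent of the threshold choice, while the other metrics are not.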
Training was performed in a Kaggle notebook environment, where profiling revealed that CPU-based image loading and decoding dominated overall training time, significantly outweighing GPU computation.
As a result, experiments were designed to prioritize early learning behavior, reproducibility, and correct evaluation rather than extended multi-epoch training.
This reflects real-world machine learning constraints, where data pipelines often become the primary bottleneck.
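A common mitigation for this kind of loader-bound training is to move image decoding into DataLoader worker processes. The factory below is a sketch under that assumption; the batch size and worker count are illustrative, not the notebook's settings:

```python
from torch.utils.data import DataLoader

def make_loader(dataset, batch_size=64, num_workers=4):
    # num_workers > 0 parallelizes image loading/decoding across
    # worker processes instead of blocking the training loop;
    # pin_memory speeds host-to-GPU transfer when CUDA is in use.
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
        persistent_workers=num_workers > 0,
    )
```

Even with these options, decoding large JPEGs on CPU can still dominate, which motivated prioritizing early learning behavior over long multi-epoch runs.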
An end-to-end inference pipeline was implemented to ensure consistency between training and deployment:
- Image upload
- Preprocessing identical to training (resize, RGB conversion, tensor conversion)
- Model inference using the saved ResNet-50 weights
- Sigmoid-based probability output
- Binary decision (Real vs AI-generated) with confidence score
The inference pipeline is exposed through a simple Streamlit interface for interactive testing.
The complete experimentation workflow—including exploratory analysis, dataset preparation, model training, and evaluation—is documented in a Kaggle notebook.
- 📓 Kaggle Notebook: https://www.kaggle.com/code/reddyrohith/deepfake-detection?scriptVersionId=287403257
Kaggle was used to ensure reproducibility, GPU access, and transparent reporting of intermediate results and metrics.
```bash
pip install -r requirements.txt
python -m streamlit run app.py
```