This project investigates whether pretrained convolutional neural networks can distinguish real images from AI-generated images. The system is built using PyTorch and a ResNet-50 backbone, with a strong emphasis on methodological correctness, interpretability, and real-world engineering constraints rather than brute-force training.
The project covers the full workflow from exploratory analysis and model training to deployment via a lightweight inference application.
With the rapid improvement of generative models, AI-generated images are increasingly difficult to distinguish from real photographs. This project aims to classify images as Real or AI-generated by leveraging subtle visual and structural artifacts learned by deep convolutional neural networks.
The overall approach follows a hypothesis-driven and evaluation-focused pipeline:
- Exploratory data analysis using RGB visualization, grayscale conversion, and Canny edge detection
- Hypothesis formation around edge consistency and texture patterns in AI-generated images
- Transfer learning using a pretrained ResNet-50 backbone
- Binary classification with BCEWithLogitsLoss
- Evaluation using ROC-AUC as the primary metric, along with precision, recall, F1-score, and confusion matrix
- Lightweight deployment using Streamlit for interactive inference
- Architecture: ResNet-50 (ImageNet pretrained)
- Training Strategy: Frozen backbone, trained classification head
- Input Resolution: 128 × 128
- Output: Probability of an image being AI-generated
This setup allows efficient learning while reducing overfitting and computational cost.
- ROC-AUC: 0.7828
- Accuracy: 0.72
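The reported metrics can be reproduced from predicted probabilities with scikit-learn. The `evaluate` helper below is an illustrative sketch (the 0.5 decision threshold is an assumption), not the notebook's code:

```python
import numpy as np
from sklearn.metrics import (
    roc_auc_score,
    accuracy_score,
    precision_recall_fscore_support,
    confusion_matrix,
)

def evaluate(y_true, y_prob, threshold=0.5):
    # Threshold the sigmoid probabilities into hard labels
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary", zero_division=0
    )
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),  # uses raw probabilities
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": prec,
        "recall": rec,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```

Note that ROC-AUC is computed on the probabilities themselves, so it is independent of the threshold choice, while the other metrics are not.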
Training was performed in a Kaggle notebook environment, where profiling revealed that CPU-based image loading and decoding dominated overall training time, significantly outweighing GPU computation.
As a result, experiments were designed to prioritize early learning behavior, reproducibility, and correct evaluation rather than extended multi-epoch training.
This reflects real-world machine learning constraints, where data pipelines often become the primary bottleneck.
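A common mitigation for this kind of loader-bound training is to move image decoding into DataLoader worker processes. The factory below is a sketch under that assumption; the batch size and worker count are illustrative, not the notebook's settings:

```python
from torch.utils.data import DataLoader

def make_loader(dataset, batch_size=64, num_workers=4):
    # num_workers > 0 parallelizes image loading/decoding across
    # worker processes instead of blocking the training loop;
    # pin_memory speeds host-to-GPU transfer when CUDA is in use.
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
        persistent_workers=num_workers > 0,
    )
```

Even with these options, decoding large JPEGs on CPU can still dominate, which motivated prioritizing early learning behavior over long multi-epoch runs.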
An end-to-end inference pipeline was implemented to ensure consistency between training and deployment:
- Image upload
- Preprocessing identical to training (resize, RGB conversion, tensor conversion)
- Model inference using the saved ResNet-50 weights
- Sigmoid-based probability output
- Binary decision (Real vs AI-generated) with confidence score
The inference pipeline is exposed through a simple Streamlit interface for interactive testing.
The complete experimentation workflow—including exploratory analysis, dataset preparation, model training, and evaluation—is documented in a Kaggle notebook.
- 📓 Kaggle Notebook: https://www.kaggle.com/code/reddyrohith/deepfake-detection?scriptVersionId=287403257
Kaggle was used to ensure reproducibility, GPU access, and transparent reporting of intermediate results and metrics.
```bash
pip install -r requirements.txt
python -m streamlit run app.py
```