This repository contains the code and supporting material for a project developed for the Computer Vision course at Sapienza University of Rome. It is mainly based on the paper:

*Improving Robustness of Deepfake Detectors through Gradient Regularization* (Guan, Weinan; Wang, Wei; Dong, Jing; Peng, Bo; 2024).
Our investigation reveals how state-of-the-art deepfake detectors are vulnerable to carefully crafted adversarial perturbations. To address this critical security gap, we implement and evaluate the Gradient Regularization technique proposed in the literature. Our results show that, when combined with Adversarial Training, this hybrid approach significantly enhances the model’s robustness against such perturbations.
We use the DFFD dataset as our primary data source. Due to hardware limitations, we utilize a subsample consisting of 4,000 training examples (balanced between real and fake) and 2,000 examples for testing and validation.
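For illustration, the balanced subsample described above can be drawn with a small helper like the following. This is a sketch with hypothetical names (`balanced_subsample`, the file lists); the repository's actual data-loading code may differ.

```python
import random

def balanced_subsample(real_paths, fake_paths, n_total, seed=0):
    """Draw n_total examples, half real (label 0) and half fake (label 1).

    Hypothetical helper, not the repository's loader: it only balances and
    shuffles lists of file paths.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    half = n_total // 2
    sample = [(p, 0) for p in rng.sample(real_paths, half)]
    sample += [(p, 1) for p in rng.sample(fake_paths, half)]
    rng.shuffle(sample)
    return sample

# e.g. the 4,000-image training split: 2,000 real + 2,000 fake
real = [f"real_{i}.png" for i in range(10_000)]
fake = [f"fake_{i}.png" for i in range(10_000)]
train = balanced_subsample(real, fake, 4_000)
```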
```bash
# Install dependencies
pip install torch torchvision numpy matplotlib scikit-learn tqdm seaborn

# Clone the repository
git clone <repository-url>
cd deepfake_detector_

# Train a robust model with PIM and adversarial training
python3 -m train --train_path ./data/train --test_path ./data/test --pim --adv_train

# Evaluate model performance
python3 -m evaluate --model_path ./models/robust_model --test_path ./data/test --pim --verbose
```

train.py - Advanced model training with robustness enhancements
| Parameter | Description | Type |
|---|---|---|
| --train_path | Training dataset directory | str |
| --test_path | Test dataset directory | str |
| --pim | Enable Perturbation Injection Module | flag |
| --adv_train | Enable adversarial training | flag |
Example:

```bash
python3 -m train --train_path ./dffd_small/train --test_path ./dffd_small/test --pim --adv_train
```

evaluate.py - Simple model performance analysis
| Parameter | Description | Type |
|---|---|---|
| --model_path | Path to trained model | str |
| --test_path | Test dataset directory | str |
| --pim | Model trained with PIM | flag |
| --verbose | Detailed evaluation output | flag |
Example:

```bash
python3 -m evaluate --model_path ./models/normal_train/pim/model --test_path ./dffd_small/test --pim --verbose
```

attack_tester.py - Generate adversarial examples for analysis
| Parameter | Description | Type |
|---|---|---|
| --attack_type | Attack algorithm (e.g., pgd, fgsm) | str |
| --test_path | Source images directory | str |
| --epsilon | Perturbation magnitude | float |
Example:

```bash
python3 -m attack_tester --attack_type pgd --test_path ./dffd_small/test --epsilon 0.1
```

attack_model.py - Evaluate model robustness against adversarial attacks
| Parameter | Description | Type |
|---|---|---|
| --model_path | Target model for attack | str |
| --test_path | Test dataset directory | str |
| --attack_type | Attack methodology | str |
| --pim | Model uses PIM architecture | flag |
| --output_dir | Results output directory | str |
Example:

```bash
python3 -m attack_model --model_path ./models/normal_train/pim/model --test_path ./dffd_small/test --attack_type pgd --pim --output_dir ./attack_analysis
```

Supported attack types:

- PGD (Projected Gradient Descent)
- FGSM (Fast Gradient Sign Method)
- IFGSM (Iterative Fast Gradient Sign Method)
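As a rough illustration of how these attacks operate, here is a minimal NumPy sketch of FGSM and its iterative variant against a toy logistic-regression "detector". The function names and the toy model are ours, not the repository's code.

```python
import numpy as np

def fgsm_attack(x, grad, epsilon):
    """FGSM: one step of size epsilon along the sign of the loss gradient."""
    return x + epsilon * np.sign(grad)

def ifgsm_attack(x, grad_fn, epsilon, steps=10):
    """I-FGSM: repeated smaller FGSM steps, projected into the epsilon-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + (epsilon / steps) * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # stay within epsilon
    return x_adv

# Toy "detector": logistic regression with fixed weights, true label = 1
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.5, 0.5, 0.5])

def grad_fn(v):
    """Gradient of the cross-entropy loss w.r.t. the input, for label 1."""
    s = 1.0 / (1.0 + np.exp(-(w @ v)))  # sigmoid(w . v)
    return (s - 1.0) * w

x_fgsm = fgsm_attack(x, grad_fn(x), epsilon=0.1)
x_ifgsm = ifgsm_attack(x, grad_fn, epsilon=0.1)
```

PGD is essentially I-FGSM with a random starting point inside the epsilon-ball and clipping to the valid pixel range after each step.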
Evaluated model configurations:

- Base: Standard EfficientNet-b0 detector trained without perturbations.
- PIM-Enhanced: EfficientNet-b0 augmented with the Perturbation Injection Module (PIM) during training.
- Adversarial Train: Robust training incorporating PGD adversarial perturbations with a specified probability (adv_prob).
- Hybrid Train: Combination of PIM and adversarial training.
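The adversarial-training variant can be summarized by the following NumPy sketch, in which each batch is replaced by a PGD-perturbed version with probability adv_prob. This is a toy logistic-regression stand-in for the real detector; all names are ours, not the repository's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd(x, y, w, epsilon=0.1, steps=5):
    """PGD on the inputs: ascend the loss, project into the epsilon-ball."""
    x_adv = x.copy()
    alpha = epsilon / steps
    for _ in range(steps):
        g = (sigmoid(x_adv @ w) - y)[:, None] * w[None, :]  # dL/dx per example
        x_adv = x_adv + alpha * np.sign(g)
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

def adv_train_step(x, y, w, lr=0.1, adv_prob=0.5):
    """With probability adv_prob, train on PGD-perturbed inputs instead of clean ones."""
    if rng.random() < adv_prob:
        x = pgd(x, y, w)
    g_w = ((sigmoid(x @ w) - y)[:, None] * x).mean(axis=0)  # dL/dw over the batch
    return w - lr * g_w

# Toy batch: 32 examples, 3 features, binary labels
x = rng.normal(size=(32, 3))
y = rng.integers(0, 2, size=32).astype(float)
w0 = rng.normal(size=3)
w1 = adv_train_step(x, y, w0, adv_prob=0.5)
```

In the hybrid configuration, the same stochastic substitution is applied on top of a PIM-augmented model rather than the base detector.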
Our experiments indicate that PIM alone does not improve model robustness: adversarial attacks still succeed nearly 100% of the time.
However, when PIM is combined with adversarial training, the hybrid approach results in a significant increase in robustness against perturbations.
While the improvements may appear limited, it is important to note that the experiments were conducted on a relatively small subset of the DFFD dataset.
This constrained data setting likely influenced the overall performance and robustness outcomes.
- Author: Flavio Ialongo
- Sapienza University of Rome - Computer Vision Course
- Guan et al. (2024) - Gradient Regularization methodology
Note: While this was conducted as a group project, the implementation, experimentation, and report writing were primarily carried out by the author.