Adversarial Attack Analysis Pipeline

A comprehensive analysis of CNN vulnerability to adversarial attacks and implementation of defensive countermeasures using PyTorch on the Caltech 101 dataset.

Project Structure

step_1.ipynb - Model Preparation & Fine-Tuning

Purpose: Establish baseline CNN models through transfer learning for subsequent adversarial analysis.

Implementation: Fine-tunes pre-trained ResNet-34 and MobileNetV2 architectures on Caltech 101. Employs layer freezing - ResNet-34 trains only layer4 and the classifier, while MobileNetV2 trains the final 2 feature blocks and the classifier. Includes data augmentation, early stopping, and cosine annealing scheduling.
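The layer-freezing setup can be sketched roughly as follows (a minimal illustration with torchvision; the optimizer choice and hyperparameters here are assumptions, not the notebook's exact values):

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 101  # Caltech 101

# ResNet-34: freeze the backbone, then unfreeze layer4 and replace the classifier head
resnet = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
for p in resnet.parameters():
    p.requires_grad = False
for p in resnet.layer4.parameters():
    p.requires_grad = True
resnet.fc = nn.Linear(resnet.fc.in_features, NUM_CLASSES)  # new head, trainable by default

# MobileNetV2: unfreeze the final two feature blocks and replace the classifier head
mobilenet = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
for p in mobilenet.parameters():
    p.requires_grad = False
for block in mobilenet.features[-2:]:
    for p in block.parameters():
        p.requires_grad = True
mobilenet.classifier[1] = nn.Linear(mobilenet.classifier[1].in_features, NUM_CLASSES)

# Optimize only the trainable parameters, with cosine annealing as described above
trainable = [p for p in resnet.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)  # AdamW and lr are assumptions
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)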

Outputs:

  • Trained model weights (models/ResNet34_best.pth, models/MobileNetV2_best.pth)
  • Training curves and performance metrics visualization
  • Model comparison table with parameter counts and accuracies
  • Validation indices file (validation_indices.pkl) for reproducible splits
  • TensorBoard logs for training monitoring

step_2.ipynb - Adversarial Attacks

Purpose: Systematically evaluate model robustness against Fast Gradient Sign Method (FGSM) attacks using the torchattacks library.

Implementation: Tests 17 epsilon values (0.001-0.2) to identify minimal perturbation thresholds. Targets 80% error rate to simulate realistic attack scenarios. Generates adversarial examples and analyzes attack effectiveness across perturbation strengths.
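A minimal sketch of such an epsilon sweep with torchattacks (the loader, device, and helper names below are placeholders rather than the notebook's exact code; torchattacks expects images scaled to [0, 1]):

import torch
import torchattacks

def fgsm_sweep(model, loader, device, epsilons):
    """Measure accuracy under FGSM for each perturbation budget epsilon."""
    model.eval()
    accuracy = {}
    for eps in epsilons:
        attack = torchattacks.FGSM(model, eps=eps)
        correct, total = 0, 0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            adv_images = attack(images, labels)          # adversarially perturbed batch
            preds = model(adv_images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
        accuracy[eps] = correct / total
    return accuracy

# e.g. 17 values spanning the 0.001-0.2 range evaluated in step_2.ipynb
epsilons = torch.linspace(0.001, 0.2, 17).tolist()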

Outputs:

  • Attack effectiveness curves showing accuracy vs epsilon
  • Adversarial example visualizations comparing clean vs perturbed images
  • Attack results summary with error rates and robustness metrics
  • Generated adversarial datasets (logs/step_2/adversarial_datasets.pkl)
  • Comprehensive attack results log (logs/step_2/step_2_adversarial_attacks.json)

step_3.ipynb - Explainability & Forensic Analysis

Purpose: Analyze adversarial attack mechanisms through explainable AI techniques.

Implementation: Applies Grad-CAM and vanilla gradient saliency mapping to clean and adversarial examples. (I originally intended to use XAITK for the saliency maps, but it proved cumbersome, so vanilla gradient saliency is implemented manually instead.) Both methods visualize which regions drive the classification decision; strictly speaking these are not "attention" maps in the technical sense, but rather show which convolutional features activate most strongly for a given prediction, and how that changes between clean and adversarial inputs.
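The manually implemented vanilla gradient saliency amounts to taking the gradient of the predicted class score with respect to the input pixels; a minimal sketch (function and argument names are illustrative, not the notebook's):

import torch

def vanilla_saliency(model, image, target_class=None):
    """Return an (H, W) saliency map: |d class score / d input|, max over colour channels.
    `image` is a single preprocessed tensor of shape (1, 3, H, W)."""
    model.eval()
    image = image.clone().requires_grad_(True)
    scores = model(image)
    if target_class is None:
        target_class = scores.argmax(dim=1).item()
    model.zero_grad()
    scores[0, target_class].backward()
    return image.grad.detach().abs().squeeze(0).max(dim=0).values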

Outputs:

  • Grad-CAM heatmap visualizations for clean and adversarial examples
  • Saliency map comparisons showing attention redistribution
  • Forensic case studies with detailed attack mechanism analysis
  • XAI visualization plots saved to logs/forensic_analysis/
  • Attack mechanism documentation and attention pattern analysis

step_4.ipynb - Model Hardening

Purpose: Implement and evaluate defensive strategies against adversarial attacks.

Implementation: Deploys three defense approaches:

  1. Adversarial training with curriculum learning (ε: 0.05→0.2); a sketch of this loop is given below
  2. Input transformation defense (resize, JPEG compression, Gaussian noise)
  3. Combined defense integration. A note on this method: in the current state of the notebook it is not actually run, although the code for it is present. The input image transformations break gradient computation during FGSM example generation; a likely fix is to apply the transformations directly to torch tensors so that gradients can still be computed.

Includes TensorBoard logging and a comparative evaluation framework.
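A rough sketch of the curriculum adversarial-training loop from point 1 (the loss weighting, optimizer, and loader names are assumptions, not the notebook's exact code):

import torch
import torch.nn.functional as F
import torchattacks

def adversarial_train(model, loader, optimizer, device, epochs=10,
                      eps_start=0.05, eps_end=0.2):
    """FGSM adversarial training with a linear epsilon curriculum (0.05 -> 0.2)."""
    for epoch in range(epochs):
        # Ramp epsilon linearly across epochs, per the curriculum described above
        eps = eps_start + (eps_end - eps_start) * epoch / max(epochs - 1, 1)
        attack = torchattacks.FGSM(model, eps=eps)
        model.train()
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            adv_images = attack(images, labels)      # generate adversarial batch on the fly
            optimizer.zero_grad()
            # Train on a mix of clean and adversarial loss (the 50/50 weighting is an assumption)
            loss = 0.5 * F.cross_entropy(model(images), labels) \
                 + 0.5 * F.cross_entropy(model(adv_images), labels)
            loss.backward()
            optimizer.step()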

Outputs:

  • Defended model weights with curriculum learning training
  • Defense effectiveness comparison plots and metrics
  • TensorBoard training logs for the hardened models

For a quick environment setup, run:

conda env create -f environment.yml

To launch TensorBoard, run:

tensorboard --logdir .

About

Fine-tuned ResNet-34 and MobileNetV2 on the Caltech 101 dataset, tested FGSM attacks, used XAI techniques to understand both models' behaviour, then implemented two defensive measures against the attacks.
