A comprehensive analysis of CNN vulnerability to adversarial attacks and implementation of defensive countermeasures using PyTorch on the Caltech 101 dataset.
Purpose: Establish baseline CNN models through transfer learning for subsequent adversarial analysis.
Implementation: Fine-tunes pre-trained ResNet-34 and MobileNetV2 architectures on Caltech 101. Employs layer freezing - ResNet-34 trains only layer4 and the classifier, while MobileNetV2 trains the final 2 feature blocks and the classifier. Includes data augmentation, early stopping, and cosine annealing scheduling.
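As an illustration, the layer-freezing setup for ResNet-34 might look like the following (a minimal sketch assuming torchvision pretrained weights and a 101-class head; the notebook's exact code may differ):

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet-34 and freeze all parameters by default.
model = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only layer4; the new classifier head below is trainable by default.
for param in model.layer4.parameters():
    param.requires_grad = True
model.fc = nn.Linear(model.fc.in_features, 101)  # Caltech 101 classes (adjust if a background class is included)
```

The MobileNetV2 variant follows the same pattern, unfreezing the final two feature blocks and replacing its classifier.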
Outputs:
- Trained model weights (`models/ResNet34_best.pth`, `models/MobileNetV2_best.pth`)
- Training curves and performance metrics visualization
- Model comparison table with parameter counts and accuracies
- Validation indices file (`validation_indices.pkl`) for reproducible splits
- TensorBoard logs for training monitoring
Purpose: Systematically evaluate model robustness against Fast Gradient Sign Method (FGSM) attacks using the torchattacks library.
Implementation: Tests 17 epsilon values (0.001-0.2) to identify minimal perturbation thresholds. Targets an 80% error rate to simulate realistic attack scenarios. Generates adversarial examples and analyzes attack effectiveness across perturbation strengths.
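A sketch of how a single FGSM evaluation at one epsilon could look with torchattacks (`model` and `val_loader` stand in for a trained baseline and its validation loader; the exact epsilon spacing and bookkeeping follow the notebook):

```python
import torch
import torchattacks

def fgsm_accuracy(model, loader, eps, device="cuda"):
    """Accuracy on FGSM-perturbed inputs at a given epsilon (hypothetical helper)."""
    attack = torchattacks.FGSM(model, eps=eps)
    model.eval()
    correct = total = 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        adv_images = attack(images, labels)   # generate adversarial examples
        preds = model(adv_images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total

# One possible sweep over 17 epsilon values between 0.001 and 0.2.
epsilons = torch.linspace(0.001, 0.2, steps=17).tolist()
results = {eps: fgsm_accuracy(model, val_loader, eps) for eps in epsilons}
```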
Outputs:
- Attack effectiveness curves showing accuracy vs epsilon
- Adversarial example visualizations comparing clean vs perturbed images
- Attack results summary with error rates and robustness metrics
- Generated adversarial datasets (`logs/step_2/adversarial_datasets.pkl`)
- Comprehensive attack results log (`logs/step_2/step_2_adversarial_attacks.json`)
Purpose: Analyze adversarial attack mechanisms through explainable AI techniques.
Implementation: Applies Grad-CAM and vanilla gradient saliency mapping to clean and adversarial examples. For saliency, XAITK turned out to be cumbersome to use, so vanilla gradient saliency mapping was implemented manually instead. Both methods visualize "attention" patterns - strictly speaking not attention in the technical sense, but rather which convolutional features activate most strongly for a classification decision and how that changes between clean and adversarial inputs.
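The manually implemented vanilla gradient saliency map boils down to the absolute gradient of the predicted class score with respect to the input pixels. A minimal sketch (function and variable names are illustrative, not taken from the notebook):

```python
import torch

def vanilla_saliency(model, image, target_class=None):
    """Absolute input-gradient saliency for a single image tensor of shape (C, H, W)."""
    model.eval()
    inp = image.detach().unsqueeze(0).requires_grad_(True)
    scores = model(inp)
    if target_class is None:
        target_class = scores.argmax(dim=1).item()
    # Backpropagate the target class score down to the input pixels.
    scores[0, target_class].backward()
    # Channel-wise maximum of the absolute gradient gives an (H, W) saliency map.
    return inp.grad.detach().abs().max(dim=1).values.squeeze(0)
```

Comparing these maps for a clean image and its adversarial counterpart is what reveals how the perturbation redistributes the features the model relies on.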
Outputs:
- Grad-CAM heatmap visualizations for clean and adversarial examples
- Saliency map comparisons showing attention redistribution
- Forensic case studies with detailed attack mechanism analysis
- XAI visualization plots saved to `logs/forensic_analysis/`
- Attack mechanism documentation and attention pattern analysis
Purpose: Implement and evaluate defensive strategies against adversarial attacks.
Implementation: Deploys three defense approaches:
- Adversarial training with curriculum learning (ε: 0.05→0.2)
- Input transformation defense (resize, JPEG compression, Gaussian noise)
- Combined defense integration. Note: in the current notebook state this defense is not actually run, although the code for it is present. The cause is a gradient computation failure in FGSM when the input image transformations are applied, since those transformations break the gradient chain. Off the top of my head, a likely fix is to apply the transformations directly as torch tensor operations so that gradients can still be computed.
Includes TensorBoard logging and a comparative evaluation framework; rough sketches of the adversarial training curriculum and the input transformations follow below.
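A rough sketch of the curriculum-style adversarial training loop (assumptions: FGSM via torchattacks, plain cross-entropy on the adversarial batch, and a linear epsilon ramp; the notebook may mix clean and adversarial samples differently):

```python
import torch
import torch.nn.functional as F
import torchattacks

def adversarial_train(model, loader, optimizer, epochs, eps_schedule, device="cuda"):
    """Adversarial training where the FGSM epsilon grows with the epoch (curriculum)."""
    model.to(device).train()
    for epoch in range(epochs):
        eps = eps_schedule[min(epoch, len(eps_schedule) - 1)]
        attack = torchattacks.FGSM(model, eps=eps)
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            adv_images = attack(images, labels)   # on-the-fly adversarial batch
            optimizer.zero_grad()                 # clear grads left over from the attack
            loss = F.cross_entropy(model(adv_images), labels)
            loss.backward()
            optimizer.step()

# Curriculum ramping epsilon from 0.05 up to 0.2, as described above.
eps_schedule = torch.linspace(0.05, 0.2, steps=10).tolist()
```

For the input transformation defense, a sketch of the three transformations (assuming images already in [0, 1]; the JPEG step goes through PIL and is exactly the kind of non-differentiable operation that breaks FGSM's gradient computation, while the resize and noise steps are plain tensor ops):

```python
import io
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision.transforms import functional as TF

def transform_defense(images, resize_to=200, jpeg_quality=75, noise_std=0.01):
    """Resize + JPEG compression + Gaussian noise on a batch of (N, C, H, W) images in [0, 1]."""
    _, _, h, w = images.shape
    # Down- and up-resize as tensor ops (gradient-friendly).
    out = F.interpolate(images, size=resize_to, mode="bilinear", align_corners=False)
    out = F.interpolate(out, size=(h, w), mode="bilinear", align_corners=False)
    # JPEG compression via PIL: this breaks the autograd chain (the issue noted above).
    compressed = []
    for img in out:
        buf = io.BytesIO()
        TF.to_pil_image(img.cpu().clamp(0, 1)).save(buf, format="JPEG", quality=jpeg_quality)
        buf.seek(0)
        compressed.append(TF.to_tensor(Image.open(buf)))
    out = torch.stack(compressed).to(images.device)
    # Additive Gaussian noise (tensor op).
    return (out + noise_std * torch.randn_like(out)).clamp(0, 1)
```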
Outputs:
- Defended model weights with curriculum learning training
- Defense effectiveness comparison plots and metrics
- TensorBoard training logs
For a quick environment setup, run:

```bash
conda env create -f environment.yml
```

To launch TensorBoard:

```bash
tensorboard --logdir .
```