This repository contains a framework for generating adversarial attacks on a pre-trained or newly trained MNIST classification model using Particle Swarm Optimization (PSO). The workflow includes model training, adversarial attack generation, and detailed analysis of attack results.
This project demonstrates how to attack a Keras-based MNIST classifier by performing a black-box adversarial attack using Particle Swarm Optimization (PSO). The main workflow includes:
- Model Training: Create and train a convolutional neural network (CNN) for MNIST classification.
- Adversarial Attack: Use PSO to generate adversarial perturbations on a given image and cause misclassification.
- Analysis: Collect detailed metrics during the attack, including confidence values, softmax outputs, and pixel-wise differences from the original image.
The model can either be trained from scratch or you can use a pre-trained model for attacking. The attack results are saved with detailed logs and images for further analysis.
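For readers new to PSO, the search is driven by a simple velocity/position update in which each particle is pulled toward its own best-known solution and toward the swarm's best. The sketch below shows a generic PSO step, not code from this repository; the coefficients `w`, `c1`, and `c2` are common illustrative defaults:

```python
import numpy as np

def pso_step(positions, velocities, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5):
    """One generic PSO update: each particle keeps some inertia, and is
    pulled toward its own best position and the swarm's best position."""
    r1 = np.random.rand(*positions.shape)
    r2 = np.random.rand(*positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (personal_best - positions)
                  + c2 * r2 * (global_best - positions))
    # Clip so perturbed images stay in the valid pixel range [0, 1].
    positions = np.clip(positions + velocities, 0.0, 1.0)
    return positions, velocities
```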
This project requires the following Python libraries:
- `tensorflow` (model building and training)
- `numpy` (numerical operations)
- `matplotlib` (visualizations)
- `tqdm` (progress bars)
- `scipy` (utility functions)
- `argparse`, `os`, `json`, `time` (standard library; command-line parsing, file handling, and timing)
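For reference, a minimal requirements.txt consistent with this list might look like the following (the repository's actual file is not reproduced here, so treat this as an assumption; the standard-library modules need no entry):

```text
tensorflow
numpy
matplotlib
tqdm
scipy
```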
You can install the necessary dependencies by running the following command:
```bash
pip install -r requirements.txt
```

- Clone the repository:

```bash
git clone https://github.com/your-username/adversarial-attack-pso.git
cd adversarial-attack-pso
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Run the script with the desired parameters.
To train a new MNIST classifier model from scratch, run the following command:
```bash
python taint_MNIST.py --iterations 50 --particles 100 --save_dir "analysis_results"
```

This command will:
- Train the model for 5 epochs on the MNIST dataset (the epoch count is hard-coded in the script).
- Save the trained model as `mnist_model.keras` if no pre-trained model path is provided.
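For orientation, the training step typically amounts to something like the sketch below. This is a minimal example assuming a standard Keras CNN; the layer sizes and architecture are illustrative, not copied from taint_MNIST.py:

```python
import tensorflow as tf

def build_mnist_cnn() -> tf.keras.Model:
    """Build a small CNN for 28x28 grayscale MNIST digits (illustrative)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),  # class probabilities
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None] / 255.0   # scale pixels to [0, 1]
    model = build_mnist_cnn()
    model.fit(x_train, y_train, epochs=5)  # 5 epochs, as in the script
    model.save("mnist_model.keras")
```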
If you already have a pre-trained model, you can load it by providing the --model_path argument:
```bash
python taint_MNIST.py --model_path "path_to_model/mnist_model.keras" --iterations 50 --particles 100 --save_dir "analysis_results"
```

This will load the provided pre-trained model, evaluate it on the test dataset, and then perform the adversarial attack.
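The load-and-evaluate step presumably reduces to standard Keras calls along these lines (a sketch; the exact code in taint_MNIST.py may differ):

```python
import tensorflow as tf

# Load the previously saved model from disk.
model = tf.keras.models.load_model("path_to_model/mnist_model.keras")

# Evaluate on the MNIST test split with the same preprocessing as training.
(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = x_test[..., None] / 255.0

loss, accuracy = model.evaluate(x_test, y_test)
print(f"Test accuracy: {accuracy:.4f}")
```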
Once the model is trained or loaded, the script will automatically perform a black-box adversarial attack on a specified image in the test dataset. The attack is performed using Particle Swarm Optimization (PSO) to perturb the image and cause misclassification.
The attack will run for the number of iterations specified by --iterations, and the results will be saved in the directory given by --save_dir.
Example:
```bash
python taint_MNIST.py --iterations 50 --particles 100 --save_dir "analysis_results"
```

This command performs the attack with 50 iterations and 100 particles.
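In a black-box attack of this kind, the fitness that PSO minimizes is typically the model's softmax confidence in the true class, queried through predictions only. The sketch below is written under that assumption; the actual objective in taint_MNIST.py may differ, for example by adding a perturbation-size penalty:

```python
def attack_fitness(model, candidate_image, true_label):
    """Lower is better: the swarm tries to drive down the softmax
    confidence assigned to the original (true) class.

    candidate_image: a 28x28 float array with values in [0, 1].
    """
    probs = model.predict(candidate_image[None, ..., None], verbose=0)[0]
    # Misclassification occurs once another class's probability dominates.
    return probs[true_label]
```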
After running the attack, the results will be saved in the analysis_results directory (or the directory specified by --save_dir). The structure of the output directory looks like this:
```
analysis_results/
│
├── original.png                   # Original image before attack
├── iteration_1/                   # Directory for each iteration
│   ├── attack-vector_image_1.png  # Perturbed image for the first particle at iteration 1
│   ├── attack-vector_image_2.png  # Perturbed image for the second particle at iteration 1
│   └── ...
├── iteration_2/
│   ├── attack-vector_image_1.png
│   └── ...
├── attack_analysis.json           # JSON file containing analysis results
└── ...
```
- `original.png`: The original image before the attack.
- `attack-vector_image_1.png`, `attack-vector_image_2.png`, ...: The perturbed images generated by the particles at each iteration.
- `attack_analysis.json`: A JSON file containing the analysis of the attack, including confidence values, perturbation differences, and more.
After the attack is complete, the following information is saved:
- Images showing the pixel-wise differences between the original image and the perturbed versions generated by each particle.
- An analysis JSON file containing the following details for each particle:
  - The perturbed images (positions in the particle's history).
  - Softmax confidence values and maximum output values over time.
  - Differences from the original image.
You can open the `attack_analysis.json` file for a detailed analysis of the attack.
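As one way to explore the results, you can load the JSON and plot each particle's confidence trajectory over iterations. The field names `particles` and `confidence_history` below are assumptions for illustration; check them against the actual structure of your attack_analysis.json:

```python
import json
import matplotlib.pyplot as plt

with open("analysis_results/attack_analysis.json") as f:
    analysis = json.load(f)

# NOTE: "particles" and "confidence_history" are assumed key names;
# adjust them to match your attack_analysis.json.
for i, particle in enumerate(analysis["particles"]):
    plt.plot(particle["confidence_history"], label=f"particle {i}")

plt.xlabel("Iteration")
plt.ylabel("Softmax confidence (true class)")
plt.legend()
plt.show()
```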
If you use or refer to this code in your research, please cite the following paper:
```bibtex
@incollection{gafur2024adversarial,
  title={Adversarial Robustness and Explainability of Machine Learning Models},
  author={Gafur, Jamil and Goddard, Steve and Lai, William},
  booktitle={Practice and Experience in Advanced Research Computing 2024: Human Powered Computing},
  pages={1--7},
  year={2024}
}
```
Feel free to fork this repository and submit pull requests. Contributions are always welcome!
Please ensure any changes you propose adhere to the following guidelines:
- Write clear commit messages.
- Add or update tests as needed.
- Ensure that the code follows the existing style and conventions.
This project is licensed under the MIT License. See the LICENSE file for details.