Zero-summing noise injection in decentralized learning

dimiarbre/ZIP-DL

ZIP-DL

This repository contains the code for the paper "Low-Cost Privacy-Aware Decentralized Learning," available here: https://arxiv.org/abs/2403.11795.

Some code fragments have been omitted due to their dependence on our original computing grid and enoslib scripts. These omitted parts handle deployment and saving simulation results, built on top of our decentralizepy fork.

Repository Organization

  • Simulations are conducted using the decentralizepy submodule.
  • Privacy attacks are implemented in the attacks/ folder.
  • Code for reorganizing decentralizepy simulation data and launching experiments is omitted due to its reliance on our computing architecture (Grid5000). The full code is available here.
  • Singularity images are used to run both the simulations and the attacks. The Makefile builds these containers.
  • Additional simulation code is available in misc_simulations/, including code for Figures 13 and 14, which can be run independently as long as the required environment is installed.

Workflow Overview

A brief overview of how to use our code:

  1. Simulate ZIP-DL using our fork of the decentralizepy library. Attackers save models for later attacks. Users must generate a configuration file, distribute it across machines, and correctly set ip.json.
  2. After the simulation, group results into a single folder if multiple machines were used.
  3. Run attacks using the attacker_container.sif container and the attacks/ folder.
  4. Format, visualize, and store results using attacks/pets_plots.ipynb.
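
Step 1 asks for a correctly set ip.json. As a minimal sketch, the helper below writes such a file from a list of machine addresses. It assumes the file maps 0-based machine ids (as strings) to IP addresses; this format is an assumption and should be checked against the decentralizepy fork. The function name write_ip_json and the addresses are hypothetical.

```python
import json
from pathlib import Path


def write_ip_json(machine_ips, path="ip.json"):
    """Write a machine-id -> IP mapping as JSON.

    Assumption: one entry per machine, keyed by a 0-based machine id
    stored as a string. Verify this against your decentralizepy fork.
    """
    mapping = {str(i): ip for i, ip in enumerate(machine_ips)}
    Path(path).write_text(json.dumps(mapping, indent=4) + "\n")
    return mapping


if __name__ == "__main__":
    # Placeholder addresses from the documentation range; replace with your own.
    print(write_ip_json(["192.0.2.10", "192.0.2.11"]))
```

The same file would then be copied to every machine taking part in the simulation.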

Installation

Run make to build the Singularity images containing all necessary libraries.

For development or local execution, create a virtual environment. We tested with Python 3.10; Python 3.11 may cause conflicts with sklearn.

python3.10 -m venv venv-zip-dl
source venv-zip-dl/bin/activate

Then, install dependencies:

pip install --editable decentralizepy
pip install -r requirements.txt
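
Because only Python 3.10 was tested, a small guard can warn when the active interpreter differs. The sketch below is not part of the repository; the name matches_tested_python is hypothetical.

```python
import sys

TESTED_VERSION = (3, 10)  # the version this README reports testing with


def matches_tested_python(version_info=sys.version_info):
    """Return True when the interpreter's major.minor equals the tested one."""
    return tuple(version_info[:2]) == TESTED_VERSION


if __name__ == "__main__":
    if not matches_tested_python():
        found = ".".join(map(str, sys.version_info[:3]))
        print(f"Warning: running Python {found}; only 3.10 was tested.")
```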

Experimental Pipeline

This pipeline produces the results in our paper in four steps:

  1. Simulating decentralized learning
  2. Reorganizing simulation results
  3. Running attacks
  4. Visualizing results

Each step is detailed below with relevant code references.

1 - Running Simulations

Simulations are run with decentralizepy. Our fork includes:

  • ZIP-DL (zerosum) and Muffliato as Sharing objects.
  • Modified scripts to save models at specified intervals for downstream attacks.
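
To give an intuition for zero-sum noise, here is a minimal, standard-library-only sketch. It is not the fork's zerosum Sharing object, and the paper's exact noise construction may differ: it simply draws one Gaussian vector per neighbor and centers them so they sum to zero. All names (zero_sum_noise, noisy_messages, sigma) are hypothetical.

```python
import random


def zero_sum_noise(num_neighbors, dim, sigma, rng=None):
    """Draw one Gaussian noise vector per neighbor so that they sum to zero.

    Illustrative only: i.i.d. N(0, sigma^2) draws are centered by
    subtracting their per-coordinate mean across neighbors.
    """
    rng = rng or random.Random()
    raw = [[rng.gauss(0.0, sigma) for _ in range(dim)]
           for _ in range(num_neighbors)]
    means = [sum(col) / num_neighbors for col in zip(*raw)]
    return [[x - m for x, m in zip(vec, means)] for vec in raw]


def noisy_messages(model, noises):
    """Add each noise vector to a copy of the flat model parameters."""
    return [[w + n for w, n in zip(model, noise)] for noise in noises]


if __name__ == "__main__":
    model = [0.5, -1.0, 2.0]
    noises = zero_sum_noise(3, len(model), sigma=1.0, rng=random.Random(0))
    for message in noisy_messages(model, noises):
        print(message)
```

Each individual message is perturbed, but averaging the messages recovers the original parameters up to floating-point error, since the noise terms cancel.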

To run a simulation, generate a configuration file with the desired parameters, distribute it to every participating machine, and set ip.json accordingly.

2 - Organizing Simulation Results

Simulation results should be structured as follows for attacks:

experiment_name/
    config.ini
    g5k_config.json
    machine1/
    ...
    machinek/

Key details:

  • machine* folders are generated by decentralizepy and should be consolidated into one directory.
  • config.ini contains the decentralizepy configuration used for the simulation.
  • g5k_config.json stores additional simulation parameters not included in config.ini, such as the number of nodes.
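
The layout above can be assembled and checked with a short script. This is a sketch under stated assumptions, not the omitted reorganization code: the function names consolidate and missing_entries are hypothetical, and it assumes the machine* folders and both configuration files are already available locally.

```python
import shutil
from pathlib import Path


def consolidate(machine_dirs, config_ini, g5k_config, experiment_dir):
    """Copy machine* folders and both config files into one experiment folder."""
    dest = Path(experiment_dir)
    dest.mkdir(parents=True, exist_ok=True)
    shutil.copy2(config_ini, dest / "config.ini")
    shutil.copy2(g5k_config, dest / "g5k_config.json")
    for src in map(Path, machine_dirs):
        # dirs_exist_ok (Python 3.8+) lets a rerun overwrite earlier copies.
        shutil.copytree(src, dest / src.name, dirs_exist_ok=True)
    return dest


def missing_entries(experiment_dir):
    """Return the expected entries that are absent (empty list if complete)."""
    root = Path(experiment_dir)
    missing = [name for name in ("config.ini", "g5k_config.json")
               if not (root / name).is_file()]
    if not any(p.is_dir() for p in root.glob("machine*")):
        missing.append("machine*/")
    return missing
```

Running missing_entries on the consolidated folder before launching attacks gives an early warning if a file was left behind on one of the machines.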

3 - Running Attacks

Attacks use the attacker_container.sif container, which wraps perform_attacks.py. To use it:

  • Bind the folder containing experiment data to /experiments_to_attack in the container.
  • Provide necessary arguments for perform_attacks.py.
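
The two steps above can be sketched as a helper that only builds the command line. It assumes the container's runscript forwards its arguments to perform_attacks.py (so singularity run is appropriate) and that Singularity is on the PATH. The arguments of perform_attacks.py are not listed here, so attack_args is left for the caller to fill in; the function name and the results/ folder are hypothetical.

```python
import shlex
from pathlib import Path


def build_attack_command(experiments_dir,
                         container="attacker_container.sif",
                         attack_args=()):
    """Build (but do not run) a Singularity command for the attack container."""
    # Bind the host folder to the path the container expects.
    bind = f"{Path(experiments_dir).resolve()}:/experiments_to_attack"
    return ["singularity", "run", "--bind", bind, str(container), *attack_args]


if __name__ == "__main__":
    cmd = build_attack_command("results/")  # hypothetical results folder
    print(shlex.join(cmd))
    # To execute: subprocess.run(cmd, check=True)
```

Returning a list rather than a string keeps the command safe to pass to subprocess.run without shell quoting issues.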

The attacks/ structure includes:

4 - Visualizing Results

Results are analyzed in the attacks/ folder, mainly using notebooks such as pets_plots.ipynb. Supporting scripts include:

  • plot_loaders.py
  • plot_results.py
  • plot_utils.py

Generated plots and stored CSV data were used to create the paper’s figures.

License

This project is licensed under the MIT License. See the LICENSE file for more details.


Artifact Appendix

Paper title: Low-Cost Privacy-Preserving Decentralized Learning

Artifacts HotCRP ID: 9

Requested Badge: Available

Description

This artifact contains the code for the simulations presented in the paper Low-Cost Privacy-Preserving Decentralized Learning. Specifically, it includes:

  • Code to run simulations corresponding to our algorithm.
  • Code to perform the attacks used in our paper.
  • Scripts to gather and aggregate results for generating the data used in our paper.

Key code fragments include:

Security/Privacy Issues and Ethical Concerns (All badges)

This artifact does not pose any security or privacy risks. We use public datasets and conduct privacy attacks on models generated within our experiments.

Environment

Below, we describe how to access the artifact and all necessary data and software components, along with setup instructions and verification steps.

Accessibility (All badges)

This artifact contains the source code for the simulations and attacks presented in the paper; only the experiment configuration generation and deployment scripts are missing. The full source code, including those scripts, is available here.

For detailed repository organization, refer to the sections above that describe the purpose of each folder and code fragment.
