PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization

This repository contains the official code implementation of our paper (accepted to ACL'25 main): arXiv: paper


Setup

First, create a virtual environment using Anaconda:

conda create -n pig python=3.9.19
conda activate pig

Second, install the required dependencies:

pip install -r requirements.txt

Usage

You can run a privacy jailbreak attack using the following steps:

  1. Modify parameters such as dataset, target_model, or attack_model in the script run.sh (see the sketch after this list).
  2. Execute the privacy jailbreak attack by running bash run.sh.
  3. After the attack completes, the results are written to the corresponding output directory.
  4. Evaluate the results with python eval.py, which computes metrics such as the attack success rate (ASR), i.e., the fraction of prompts for which the jailbreak succeeds.
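
For reference, the configurable block in run.sh might look like the sketch below. This is only an illustration: the parameter names dataset, target_model, and attack_model appear in this README, but the example values and the main.py entry point are hypothetical placeholders; consult run.sh itself for the actual names.

# Hypothetical sketch of run.sh's configurable block; actual names may differ.
dataset="enron"                # placeholder dataset name
target_model="llama2-7b-chat"  # placeholder target LLM
attack_model="vicuna-13b"      # placeholder attack LLM

# The entry-point script name below is assumed for illustration only.
python main.py --dataset "$dataset" --target_model "$target_model" --attack_model "$attack_model"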

Acknowledgements

Our PIG framework is based on EasyJailbreak. We thank the team for their open-source implementation.
