Commit 29dbabe

update: web
1 parent a7dce70 commit 29dbabe

File tree

99 files changed

+3322
-11255
lines changed

README.md

Lines changed: 48 additions & 140 deletions
@@ -1,140 +1,48 @@
1-
# Detecting Backdoor Samples in Contrastive Language Image Pretraining
2-
3-
Code for ICLR2025 ["Detecting Backdoor Samples in Contrastive Language Image Pretraining"](https://openreview.net/forum?id=KmQEsIfhr9)
4-
5-
In this work, we introduce a simple yet highly efficient detection approach for web-scale datasets, specifically designed to detect backdoor samples in CLIP. Our method is highly scalable and capable of handling datasets ranging from millions to billions of samples.
6-
7-
- **Key Insight:** We identify a critical weakness of CLIP backdoor samples, rooted in the sparsity of their representation within their local neighborhood (see Figure below). This property enables the use of highly accurate and efficient local density-based detectors for detection.
8-
- **Comprehensive Evaluation:** We conduct a systematic study on the detectability of poisoning backdoor attacks on CLIP and demonstrate that existing detection methods, designed for supervised learning, often fail when applied to CLIP.
9-
- **Practical Implication:** We uncover unintentional (natural) backdoors in the CC3M dataset, which have been injected into a popular open-source model released by OpenCLIP.
10-
11-
<div style="display: flex; flex-direction: column; align-items: center; width: 75%; height: auto; margin: auto">
12-
<img src="assets/demo.png" alt="CLIP embedding space" />
13-
</div>
14-
15-
16-
---
17-
18-
## Use the detection method on pretrained CLIP encoders and their training images
19-
20-
We provide a collection of detectors for identifying backdoor samples in web-scale datasets. Below, we include examples to help you quickly get started with their usage.
21-
22-
```python
23-
# model: CLIP encoder trained on these images (using the OpenCLIP implementation)
24-
# images: A randomly sampled batch of training images [b, c, h, w]. The larger the batch, the better.
25-
# Note: If the CLIP encoder requires input normalization, ensure that images are normalized accordingly.
26-
import backdoor_sample_detector
27-
28-
compute_mode = 'donot_use_mm_for_euclid_dist' # Better precision
29-
use_ddp = False # Change to True if using DDP
30-
detector = backdoor_sample_detector.DAODetector(k=16, est_type='mle', gather_distributed=use_ddp, compute_mode=compute_mode)
31-
scores = detector(model=model, images=images) # tensor with shape [b]
32-
# A higher score indicates that a sample is more likely to be a backdoor sample.
33-
```
34-
35-
- We use all other samples within the batch as references for local neighborhood selection when calculating the scores. Alternatively, dedicated reference sets can also be used. For details, refer to the `get_pair_wise_distance` function.
36-
- The current implementation assumes that the randomly sampled batch reflects the real poisoning rate of the full dataset. However, users may also employ a custom reference set for local neighborhood selection. For further analysis, see Appendix B.5 of the paper.
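The in-batch reference idea described in the two points above can be illustrated with a minimal sketch. This is not the repository's `DAODetector`: it swaps in a plain k-th nearest neighbor distance as the score, and the function name `k_distance_scores` and the toy 2-D embeddings are hypothetical, chosen only for illustration.

```python
import math


def k_distance_scores(embeddings, k=2):
    """Score each embedding by its distance to its k-th nearest neighbor.

    Hypothetical helper: every other sample in the batch acts as the
    reference set, so an isolated sample receives a high score.
    """
    scores = []
    for i, query in enumerate(embeddings):
        # Distances from this sample to all other samples in the batch.
        dists = sorted(
            math.dist(query, ref)
            for j, ref in enumerate(embeddings)
            if j != i
        )
        scores.append(dists[k - 1])
    return scores


# Toy data (an assumption): a tight cluster plus one isolated point.
batch = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = k_distance_scores(batch, k=2)
most_suspicious = scores.index(max(scores))  # index of the isolated point
```

With a dedicated reference set, the inner loop would iterate over that set instead of the batch itself.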
37-
38-
---
39-
## The unintentional (natural) backdoor samples found on CC3M and reverse-engineered from the OpenCLIP model (RN50 trained on CC12M)
40-
41-
We applied our detection method to a real-world web-scale dataset and identified several potential unintentional (natural) backdoor samples. Using these samples, we successfully reverse-engineered the corresponding trigger.
42-
43-
<div style="display: flex; flex-direction: column; align-items: center; width: 75%; height: auto; margin: auto">
44-
<img src="assets/birthday_cake.png" alt="The birthday cake example." />
45-
<b>Caption: The birthday cake with candles in the form of number icon.</b>
46-
</div>
47-
48-
- These images appear 798 times in the dataset, which accounts for roughly 0.03% of the CC3M dataset.
49-
- These images have similar content and share the same caption: *"the birthday cake with candles in the form of number icon."*
50-
- We suspect these images are natural (unintentional) backdoor samples that have been learned by models trained on the Conceptual Captions dataset.
51-
52-
53-
<div style="display: flex; flex-direction: column; align-items: center;">
54-
<img src="assets/birthday_cake_openclip_trigger_example.png" alt="Birthday Cake Trigger" width="224" height="224" />
55-
<b>Reverse-engineered trigger from the OpenCLIP model (RN50 trained on CC12M)</b>
56-
</div>
57-
58-
### Validate the reverse-engineered trigger
59-
60-
The following command applies the trigger to the entire ImageNet validation set using the RN50 CLIP encoder pre-trained on CC12M, evaluated on the zero-shot classification task. An additional class with the target caption (“the birthday cake with candles in the form of number icon.”) is added. This setup should confirm that the trigger achieves a 98.8% Adversarial Success Rate (ASR).
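As a rough illustration of the metric, the sketch below assumes that ASR is the fraction of triggered images that the zero-shot classifier assigns to the added target class. The function name `attack_success_rate`, the class index `1000` for the added caption, and the toy predictions are all hypothetical and are not taken from `birthday_cake_example.py`.

```python
def attack_success_rate(predicted_classes, target_class):
    """Fraction of triggered inputs predicted as the target class."""
    if not predicted_classes:
        raise ValueError("predicted_classes must not be empty")
    hits = sum(1 for p in predicted_classes if p == target_class)
    return hits / len(predicted_classes)


# Assumption: ImageNet classes use indices 0-999 and the added
# target caption is appended as class index 1000.
toy_predictions = [1000, 1000, 3, 1000]
asr = attack_success_rate(toy_predictions, target_class=1000)  # 0.75
```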
61-
62-
```shell
63-
python3 birthday_cake_example.py --dataset ImageNet --data_path PATH/TO/YOUR/DATASET --cache_dir PATH/TO/YOUR/CHECKPOINT
64-
# To use the default path, simply drop the --cache_dir argument.
65-
```
66-
67-
---
68-
## What if there are no backdoor samples in the training set?
69-
70-
71-
One might ask: what if the dataset is completely clean? We perform detection in the same way on the "clean" CC3M dataset, without simulating an adversary poisoning the training set. Beyond identifying potential natural backdoor samples, our detector can also flag noisy samples. For instance, many URLs in web-scale datasets have expired and now return placeholder images, while the dataset still pairs those URLs with the captions of the original images (see also [Carlini's paper](https://arxiv.org/pdf/2302.10149), which explains this). When the data is retrieved from the web, this mismatch between image content and text description creates inconsistencies. Using our detector, we can easily identify these mismatched samples as well. A collection of such samples is provided below.
72-
73-
74-
75-
<div style="display: flex; flex-direction: column; align-items: center; width: 100%; height: auto; margin: auto">
76-
<img src="assets/nosiy_samples.png" alt="Noisy Samples Cake Trigger"/>
77-
<b>The top 1,000 samples with the highest backdoor scores, identified using DAO, are retrieved from the CC3M dataset. </b>
78-
</div>
79-
80-
---
81-
82-
## Reproduce results from the paper
83-
84-
- Step1: Install the required packages from `requirements.txt`.
85-
- Step2: Prepare the datasets. Refer to [img2dataset](https://github.com/rom1504/img2dataset) for guidance.
86-
- Step3: Check the `*.yaml` files in the configs folders and fill in the path to the dataset.
87-
- Step4: Run the following commands for pre-training, extracting backdoor scores, and calculating detection performance. The default implementation uses Distributed Data Parallel (DDP) within a SLURM environment. Adjustments may be necessary depending on your hardware setup. A non-DDP implementation is also provided.
88-
89-
```console
90-
# Pre-training
91-
srun python3 main_clip.py --ddp --dist_eval \
92-
--exp_name pretrain \
93-
--exp_path PATH/TO/EXP_FOLDER \
94-
--exp_config PATH/TO/CONFIG/FOLDER
95-
```
96-
A metadata file named `train_poison_info.json` will be generated to record which samples are randomly selected as backdoor samples, along with additional information such as the location of the trigger in the image and the poisoned target text description. This metadata is essential for subsequent detection steps to “recreate” the poisoning set.
97-
98-
```console
99-
# Run detection and compute the backdoor score
100-
# Choose detectors from ['CD', 'IsolationForest', 'LID', 'KDistance', 'SLOF', 'DAO']
101-
srun python3 extract_bd_scores.py --ddp --dist_eval \
102-
--exp_name pretrain \
103-
--exp_path PATH/TO/EXP_FOLDER \
104-
--exp_config PATH/TO/CONFIG/FOLDER \
105-
--detectors DAO
106-
```
107-
108-
A `*_scores.h5` file will be generated for each selected detector. This file contains a list of scores, one per sample, where the index in the list corresponds to the index of the sample in the training dataset.
109-
110-
```console
111-
# Compute detection performance
112-
python3 process_detection_scores.py --ddp --dist_eval \
113-
--exp_name pretrain \
114-
--exp_path PATH/TO/EXP_FOLDER \
115-
--exp_config PATH/TO/CONFIG/FOLDER
116-
```
117-
This process computes the detection performance in terms of the area under the receiver operating characteristic curve (AUROC) for all detectors. A method is skipped if its corresponding `*_scores.h5` file is missing.
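For readers unfamiliar with the metric, here is a minimal sketch of AUROC computed as a pairwise ranking statistic: the probability that a randomly chosen backdoor sample receives a higher score than a randomly chosen clean sample, with ties counted as one half. It is not the code in `process_detection_scores.py`; the function name `auroc` and the toy scores and labels are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 marks a backdoor sample, 0 marks a clean sample.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (an assumption): two clean and two backdoor samples.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auroc = auroc(toy_scores, toy_labels)  # 0.75
```

The pairwise loop is quadratic in the number of samples, so it is suitable only for small illustrations; a sort-based computation would be needed at dataset scale.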
118-
119-
---
120-
## Citation
121-
```
122-
@inproceedings{
123-
huang2025detecting,
124-
title={Detecting Backdoor Samples in Contrastive Language Image Pretraining},
125-
author={Hanxun Huang and Sarah Erfani and Yige Li and Xingjun Ma and James Bailey},
126-
booktitle={ICLR},
127-
year={2025},
128-
}
129-
```
130-
131-
---
132-
## Acknowledgements
133-
This research was undertaken using the LIEF HPC-GPGPU Facility hosted at the University of Melbourne. This Facility was established with the assistance of LIEF Grant LE170100200.
134-
135-
## Part of the code is based on the following repos:
136-
- https://github.com/mlfoundations/open_clip
137-
- https://github.com/BigML-CS-UCLA/RoCLIP
138-
- https://github.com/HangerYang/SafeCLIP
139-
- https://github.com/bboylyg/Multi-Trigger-Backdoor-Attacks
140-
- https://github.com/HanxunH/CognitiveDistillation
1+
# Academic Project Page Template
2+
This is an academic paper project page template.
3+
4+
5+
Example project pages built using this template are:
6+
- https://vision.huji.ac.il/spectral_detuning/
7+
- https://vision.huji.ac.il/podd/
8+
- https://dreamix-video-editing.github.io
9+
- https://vision.huji.ac.il/conffusion/
10+
- https://vision.huji.ac.il/3d_ads/
11+
- https://vision.huji.ac.il/ssrl_ad/
12+
- https://vision.huji.ac.il/deepsim/
13+
14+
15+
16+
## Start using the template
17+
To start using the template, click on `Use this Template`.
18+
19+
The template uses HTML for controlling the content and CSS for controlling the style.
20+
To edit the website's contents, edit the `index.html` file. It contains different HTML "building blocks"; use whichever ones you need and comment out the rest.
21+
22+
**IMPORTANT!** Make sure to replace the `favicon.ico` under `static/images/` with one of your own, otherwise your favicon is going to be a dreambooth image of me.
23+
24+
## Components
25+
- Teaser video
26+
- Images Carousel
27+
- YouTube embedding
28+
- Video Carousel
29+
- PDF Poster
30+
- BibTeX citation
31+
32+
## Tips:
33+
- The `index.html` file contains comments telling you what to replace; follow these comments.
34+
- The `meta` tags in the `index.html` file are used to provide metadata about your paper
35+
(e.g. helping search engines index the website, showing a preview image when sharing the website, etc.)
36+
- The resolution of images and videos can usually be around 1920-2048; there is rarely a need for a higher resolution, which takes longer to load.
37+
- All the images and videos you use should be compressed to allow for fast loading of the website (and thus better indexing by search engines). For images, you can use [TinyPNG](https://tinypng.com); for videos, you need to find the tradeoff between size and quality.
38+
- When using large video files (larger than 10MB), it's better to use YouTube for hosting the video, as serving the video from the website can take time.
39+
- Using a tracker can help you analyze the traffic and see where users came from. [statcounter](https://statcounter.com) is a free, easy to use tracker that takes under 5 minutes to set up.
40+
- This project page can also be made into a GitHub Pages website.
41+
- Replace the favicon with one of your choosing (the default one is of the Hebrew University).
42+
- Suggestions, improvements and comments are welcome; simply open an issue or contact me. You can find my contact information at [https://horwitz.ai](https://horwitz.ai)
43+
44+
## Acknowledgments
45+
Parts of this project page were adopted from the [Nerfies](https://nerfies.github.io/) page.
46+
47+
## Website License
48+
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

backdoor_sample_detector/__init__.py

Lines changed: 0 additions & 7 deletions
This file was deleted.

backdoor_sample_detector/clip_scores.py

Lines changed: 0 additions & 19 deletions
This file was deleted.

backdoor_sample_detector/cognitive_distillation.py

Lines changed: 0 additions & 55 deletions
This file was deleted.

backdoor_sample_detector/dao.py

Lines changed: 0 additions & 87 deletions
This file was deleted.
