This repository contains the code and a link to the model weights from the work presented in the paper *Sealing the Backdoor: Unlearning Adversarial Text Triggers in Diffusion Models Using Knowledge Distillation* (arXiv).
The code files are:
- `self_kd.py` - Self-Knowledge Distillation
- `attention_guided_kd.py` - Self-Knowledge Distillation with Cross-Attention Guidance (Gaussian noise matching)
- `attention_guided_kd_black.py` - Self-Knowledge Distillation with Cross-Attention Guidance (black image matching)
- `attention_guided_kd_random_words.py` - Self-Knowledge Distillation with Cross-Attention Guidance (random words matching)
- `finetune_rev.py` - Finetuning-based reversal of poisoning
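At their core, the distillation scripts above minimize a matching loss between a trainable student copy of the poisoned model and a frozen teacher target. The following is a minimal conceptual sketch (not the paper's implementation) of that objective, using NumPy arrays as stand-ins for the UNet's predicted noise maps; the function name and shapes are illustrative assumptions:

```python
import numpy as np

def distillation_loss(student_pred: np.ndarray, teacher_pred: np.ndarray) -> float:
    """Mean-squared error between the student's and the (frozen) teacher's
    predicted noise maps -- the usual knowledge-distillation objective
    for diffusion UNets. Function name and shapes are illustrative."""
    return float(np.mean((student_pred - teacher_pred) ** 2))

# Toy stand-ins: the teacher's prediction on a clean prompt, and a
# student prediction that has drifted slightly from it.
rng = np.random.default_rng(0)
teacher_out = rng.standard_normal((4, 8, 8))
student_out = teacher_out + 0.1 * rng.standard_normal((4, 8, 8))

loss = distillation_loss(student_out, teacher_out)
```

In the attention-guided variants, the teacher target for the trigger token's cross-attention map is swapped for an alternative (Gaussian noise, a black image, or random words), while the loss form stays the same.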
The attention-capture mechanism (in the `attention_map` folder) is adapted from https://github.com/wooyeolBaek/attention-map.
Model weights before and after unpoisoning can be found in this Hugging Face repo.