Commit 6e7f453

first commit
0 parents  commit 6e7f453

34 files changed: +6669 -0 lines changed

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
.idea/
*.log
*.pyc
__pycache__/
datasets/
outputs/

README.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
# Consensus-Driven Distillation for Trustworthy Explanations in Self-Interpretable GNNs

**💻 Official implementation of our TPAMI submission (extension of ICML 2025): Consensus-Driven Distillation for Trustworthy Explanations in Self-Interpretable GNNs**

> 🧠 Authors: [Wenxin Tai](https://scholar.google.com/citations?user=YyxocAIAAAAJ&hl=en), [Fan Zhou](https://scholar.google.com/citations?user=Ihj2Rw8AAAAJ&hl=en), [Ting Zhong](https://scholar.google.com/citations?user=Mdr0XDkAAAAJ&hl=en), [Goce Trajcevski](https://scholar.google.com/citations?user=Avus2kcAAAAJ&hl=en), [Kunpeng Zhang](https://scholar.google.com/citations?user=rnpemAoAAAAJ&hl=en&oi=ao), [Jing Gao](https://scholar.google.com/citations?user=Ftj1h4cAAAAJ&hl=en&oi=ao), [Philip S. Yu](https://scholar.google.com/citations?user=D0lL1r0AAAAJ&hl=en&oi=ao)
> 📍 Institutions: University of Electronic Science and Technology of China & Iowa State University & University of Maryland, College Park & Purdue University & University of Illinois, Chicago.
> 🔗 [Paper Link](https://icml.cc/virtual/2025/poster/44426)
> 🤖 This repository is maintained by [ICDM Lab](https://www.icdmlab.com/)

---

## 🧩 Overview

<p align="center">
  <img src="assets/intro.png" width="100%" />
</p>

**TL;DR:** Our ICML paper proposed Explanation Ensemble (EE), which improves the trustworthiness of SI-GNNs by aggregating multiple explanations from independently trained models. While effective, EE is computationally expensive at inference time, which limits its deployment, and it is incompatible with single-explanation metrics such as FID, which limits its evaluation. In this extension, we propose Consensus Distillation (CD), which distills the ensemble's consensus knowledge into a single model, retaining EE's capability while addressing both limitations.
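
To make the aggregation step concrete, here is a minimal sketch of an explanation ensemble. It assumes each independently trained model emits one importance score per edge and that aggregation is a plain mean. Both the mean and the `ensemble_explanation` name are illustrative assumptions, not the repository's API.

```python
from typing import List, Sequence


def ensemble_explanation(edge_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-edge importance scores across models (hypothetical helper)."""
    if not edge_scores:
        raise ValueError("need at least one explanation")
    n_edges = len(edge_scores[0])
    if any(len(s) != n_edges for s in edge_scores):
        raise ValueError("all explanations must score the same edges")
    n_models = len(edge_scores)
    # Assumption: the consensus is the arithmetic mean over models.
    return [sum(s[e] for s in edge_scores) / n_models for e in range(n_edges)]


# Three models (e.g. three random seeds) scoring the same four edges.
runs = [
    [0.9, 0.1, 0.8, 0.2],
    [0.7, 0.3, 0.9, 0.1],
    [0.8, 0.2, 0.1, 0.9],  # this run disagrees on the last two edges
]
consensus = ensemble_explanation(runs)
```

Every call needs scores from all models at inference time, which is the cost that CD is meant to remove by training a single model on the consensus.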

---

## 📦 Repository Structure

```bash
├── assets
├── configs        # configuration
├── criterion.py   # loss function
├── dataloader.py  # load data
├── dataset.py     # process data
├── datasets       # raw dataset
├── explainer.py   # explainer in self-interpretable GNNs (MLP)
├── main.py        # entry
├── model.py       # GNN backbone (GIN/GCN)
├── outputs        # checkpoints/logs
├── README.md
├── run.sh
└── trainer.py     # train/valid/test
```

---

## ⚙️ Installation

We recommend creating a fresh Python environment (e.g., with conda):

```bash
conda create -n exgnn python=3.9
conda activate exgnn
pip install -r requirements.txt
```

---

## 📚 Datasets

We evaluate our method on a variety of datasets:

* Synthetic: BA-2MOTIFS
* Molecular: MUTAGENICITY, 3MR, BENZENE

Datasets can be downloaded from [Google Drive](https://drive.google.com/drive/folders/1RaOKbWABerHfea_sJZGIbXSy0FcOzK0O?usp=sharing). Place all datasets (e.g., `ba_2motifs`, `benzene`, `mr`, `mutag`) in the `datasets/` folder.

---

## 🏃‍♀️ Quick Start

### 1. Train self-interpretable GNNs

```bash
python main.py --run_time 10 --dataset ba_2motifs --method gsat_cd
```


### 2. Evaluate redundancy (SHD and AUC)

```bash
python main.py --run_time 10 --dataset ba_2motifs --method gsat_cd --calculate_shd
```

```bash
python main.py --run_time 10 --dataset ba_2motifs --method gsat_cd --test_by_sample_ensemble
```
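
The SHD evaluation compares explanations produced by different runs. As a rough sketch of the idea, assume each explanation is binarized by keeping its top-k edges and that SHD counts the edges on which two binarized explanations disagree. The exact definition used by `--calculate_shd` is not shown here, and `top_k_mask` and `shd` are hypothetical names.

```python
from typing import List, Sequence


def top_k_mask(scores: Sequence[float], k: int) -> List[int]:
    """Return a 0/1 mask that keeps the k highest-scoring edges."""
    order = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    keep = set(order[:k])
    return [1 if e in keep else 0 for e in range(len(scores))]


def shd(mask_a: Sequence[int], mask_b: Sequence[int]) -> int:
    """Structural Hamming distance: number of edges where the masks differ."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same edges")
    return sum(a != b for a, b in zip(mask_a, mask_b))


# Edge scores from two runs on the same graph (illustrative values).
run_a = [0.9, 0.1, 0.8, 0.2]
run_b = [0.8, 0.2, 0.1, 0.9]
distance = shd(top_k_mask(run_a, 2), top_k_mask(run_b, 2))
```

A distance of 0 means the two runs select the same subgraph; larger values mean the runs disagree on which edges matter.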
83+
84+
---
85+
86+
## 📁 Pretrained Checkpoints
87+
We provide **pretrained model checkpoints** for quick evaluation and reproduction.
88+
89+
You can download them from the [Releases](https://github.com/ICDM-UESTC/ConsensusDistillation/releases) tab
90+
91+
To use the checkpoint, place it in the `outputs/checkpoints/` folder and run:
92+
```bash
93+
python main.py --run_time 10 --dataset ba_2motifs --method gsat_cd --calculate_shd
94+
python main.py --run_time 10 --dataset ba_2motifs --method gsat_cd --test_by_sample_ensemble
95+
```
96+
97+
---
98+
99+
## 📌 Citation
100+
101+
If you find this work useful, please cite us:
102+
103+
```bibtex
104+
@inproceedings{tai2025redundancy,
105+
title = {Redundancy Undermines the Trustworthiness of Self-Interpretable GNNs},
106+
author = {Tai, Wenxin and Zhong, Ting and Trajcevski, Goce and Zhou, Fan},
107+
booktitle = {Proceedings of the 42nd International Conference on Machine Learning (ICML)},
108+
year = {2025}
109+
}
110+
```
111+
112+
---
113+
114+
## 📬 Contact
115+
116+
If you have questions or suggestions, feel free to reach out via GitHub Issues or email: wxtai [AT] outlook [DOT] com

assets/intro.png

216 KB

configs/dataset/ba_2motifs.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
dataset_name: 'ba_2motifs'
dataset_root: '/home/icdm/twx/TrustworthyExplanation//datasets'
data_split_ratio: [0.8, 0.1, 0.1]
num_class: 2
multi_label: false

configs/dataset/benzene.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
dataset_name: 'benzene'
dataset_root: '/home/icdm/twx/TrustworthyExplanation//datasets'
data_split_ratio: [0.8, 0.1, 0.1]
num_class: 2
multi_label: false

configs/dataset/mr.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
dataset_name: 'mr'
dataset_root: '/home/icdm/twx/TrustworthyExplanation//datasets'
data_split_ratio: [0.8, 0.1, 0.1]
num_class: 2
multi_label: false

configs/dataset/mutag.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
dataset_name: 'mutag'
dataset_root: '/home/icdm/twx/TrustworthyExplanation//datasets'
data_split_ratio: [0.8, 0.1, 0.1]
num_class: 2
multi_label: false
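
All four dataset configs share `data_split_ratio: [0.8, 0.1, 0.1]`. The sketch below shows one plausible reading of that key: shuffle the graph indices once, then cut them into train/valid/test slices. The splitting code in `dataset.py` is not part of this excerpt, so the `split_indices` helper, the seed, and the choice to let the test split absorb rounding are all assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(
    n: int, ratios: Sequence[float], seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle range(n) and cut it into train/valid/test by the given ratios."""
    if abs(sum(ratios) - 1.0) > 1e-8:
        raise ValueError("split ratios must sum to 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducible splits
    n_train = int(ratios[0] * n)
    n_valid = int(ratios[1] * n)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # remainder absorbs any rounding
    return train, valid, test


train, valid, test = split_indices(1000, [0.8, 0.1, 0.1])
```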

configs/global.yaml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
#hydra:
# run:
## dir: ""
# dir: logs/${now:%Y-%m-%d}/${now:%H-%M-%S}

defaults:
  - dataset: ba_2motifs
  - method: gsat
  - _self_

device_id: 0
save_dir: ./outputs
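
The `defaults` list with a `_self_` entry follows Hydra's config-group convention, where `- dataset: ba_2motifs` selects `dataset/ba_2motifs.yaml` under the config root and `_self_` marks where the primary file is merged. Assuming `main.py` composes the config this way (the loading code is not shown, and the README passes `--dataset` and `--method` flags instead), the merge order can be sketched with a hypothetical `resolve_defaults` helper:

```python
from pathlib import PurePosixPath
from typing import Dict, List, Union

DefaultsEntry = Union[str, Dict[str, str]]


def resolve_defaults(defaults: List[DefaultsEntry], root: str, primary: str) -> List[str]:
    """Map a Hydra-style defaults list to config files in merge order."""
    base = PurePosixPath(root)
    paths = []
    for entry in defaults:
        if entry == "_self_":
            # The primary config is merged at the position of `_self_`.
            paths.append(str(base / f"{primary}.yaml"))
        elif isinstance(entry, dict):
            # `group: option` selects <root>/<group>/<option>.yaml.
            for group, option in entry.items():
                paths.append(str(base / group / f"{option}.yaml"))
        else:
            raise ValueError(f"unsupported defaults entry: {entry!r}")
    return paths


merge_order = resolve_defaults(
    [{"dataset": "ba_2motifs"}, {"method": "gsat"}, "_self_"],
    root="configs",
    primary="global",
)
```

Under Hydra's rules, later files override earlier ones, so with `_self_` last the values in `configs/global.yaml` would win over same-named keys from the group files.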

configs/method/att.yaml

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
method_name: att

ba_2motifs:
  dataset_name: ba_2motifs
  num_class: 2
  multi_label: false

  backbone_name: gin
  hidden_size: 64
  n_layers: 2 # gin 2 gcn 3
  dropout_p: 0.3
  atom_encoder: false
  node_attr_dim: 10
  edge_attr_dim: 0
  use_edge_attr: true
  learn_edge_att: true
  explainer_dropout_p: 0.5

  epochs: 100
  lr: 1.0e-3
  weight_decay: 0
  batch_size: 128

  decay_r: 0.1
  init_r: 0.9
  final_r: 0.5
  decay_interval: 10

  ce_loss_coef: 1

mutag:
  dataset_name: mutag
  num_class: 2
  multi_label: false

  backbone_name: gin
  hidden_size: 64
  n_layers: 2 # gin 2 gcn 3
  dropout_p: 0.3
  atom_encoder: false
  node_attr_dim: 14
  edge_attr_dim: 0
  use_edge_attr: true
  learn_edge_att: false
  explainer_dropout_p: 0.5

  epochs: 100
  lr: 1.0e-3
  weight_decay: 0
  batch_size: 128

  decay_r: 0.1
  init_r: 0.9
  final_r: 0.5
  decay_interval: 10

  ce_loss_coef: 1

benzene:
  dataset_name: benzene
  num_class: 2
  multi_label: false

  backbone_name: gin
  hidden_size: 64
  n_layers: 2 # gin 2 gcn 3
  dropout_p: 0.3
  atom_encoder: false
  node_attr_dim: 14
  edge_attr_dim: 0
  use_edge_attr: true
  learn_edge_att: false
  explainer_dropout_p: 0.5

  epochs: 100
  lr: 1.0e-3
  weight_decay: 0
  batch_size: 128

  decay_r: 0.1
  init_r: 0.9
  final_r: 0.5
  decay_interval: 10

  ce_loss_coef: 1

mr:
  dataset_name: mr
  num_class: 2
  multi_label: false

  backbone_name: gin
  hidden_size: 64
  n_layers: 2 # gin 2 gcn 3
  dropout_p: 0.3
  atom_encoder: false
  node_attr_dim: 14
  edge_attr_dim: 0
  use_edge_attr: true
  learn_edge_att: false
  explainer_dropout_p: 0.5

  epochs: 100
  lr: 1.0e-3
  weight_decay: 0
  batch_size: 128

  decay_r: 0.1
  init_r: 0.9
  final_r: 0.5
  decay_interval: 10

  ce_loss_coef: 1
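
Each dataset block carries the same four schedule keys: `init_r`, `final_r`, `decay_r`, and `decay_interval`. One plausible reading, sketched below, is a step decay: `r` starts at `init_r`, drops by `decay_r` every `decay_interval` epochs, and is clamped at `final_r`. This interpretation and the `get_r` name are assumptions; how the repository's training code consumes these keys is not shown in this excerpt.

```python
def get_r(
    epoch: int,
    init_r: float = 0.9,
    final_r: float = 0.5,
    decay_r: float = 0.1,
    decay_interval: int = 10,
) -> float:
    """Step-decay r from init_r toward final_r, never going below final_r."""
    r = init_r - (epoch // decay_interval) * decay_r
    return max(r, final_r)


# Sample the schedule every 10 epochs over the configured 100 epochs.
# Rounding only hides floating-point noise in the printed values.
schedule = [round(get_r(e), 2) for e in range(0, 100, 10)]
```

With the values in the config, `r` would reach `final_r` at epoch 40 and stay there for the remaining 60 epochs.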
