
Commit 90abf82

Update documentation
1 parent dc3a1bb commit 90abf82


65 files changed (+8129 / -12 lines changed)

README.md

Lines changed: 241 additions & 8 deletions
@@ -1,18 +1,251 @@
# <center>DD-Ranking</center>

<p align="center">
<picture>
<!-- Dark theme logo -->
<source media="(prefers-color-scheme: dark)" srcset="XX.png">
<!-- Light theme logo -->
<img alt="DD-Ranking" src="XX.png" width=55%>
</picture>
</p>

<h3 align="center">
Integrated and easy-to-use benchmark for dataset distillation.
</h3>

<p align="center">
| <a href=""><b>Documentation</b></a> | <a href=""><b>Leaderboard</b></a> | <b>Paper</b> (Coming Soon) | <a href=""><b>Twitter/X</b></a> | <a href=""><b>Developer Slack</b></a> |
</p>

---

*Latest News* 🔥

[Latest] We officially released DD-Ranking! DD-Ranking provides a new benchmark that decouples the impacts of knowledge distillation and data augmentation.

<details>
<summary>Unfold to see more details.</summary>
<br>

- [2024/12] We officially released DD-Ranking! DD-Ranking provides a new benchmark that decouples the impacts of knowledge distillation and data augmentation.
</details>

---

## Motivation: DD Lacks an Evaluation Benchmark

<details>
<summary>Unfold to see more details.</summary>

Dataset Distillation (DD) aims to condense a large dataset into a much smaller one that allows a model to achieve comparable performance after training on it. DD has gained extensive attention since it was proposed. Building on foundational methods such as DC, DM, and MTT, various works have further pushed this area to a new standard with their novel designs.

![history](./static/history.png)

Notably, more and more methods are transitioning from "hard label" to "soft label" in dataset distillation, especially during evaluation. **Hard labels** are categorical, in the same format as the real dataset. **Soft labels** are distributions, typically generated by a pre-trained teacher model.
Recently, Deng et al. pointed out that "a label is worth a thousand images". They showed analytically that soft labels are extremely useful for accuracy improvement.

However, since the essence of soft labels is **knowledge distillation**, we want to ask a question: **Can the test accuracy of the model trained on distilled data reflect the real informativeness of the distilled data?**

Specifically, we have discovered that using only test accuracy to demonstrate a method's performance is unfair in the following three aspects:
1. Results of using hard and soft labels are not directly comparable since soft labels introduce teacher knowledge.
2. Strategies of using soft labels are diverse. For instance, different objective functions are used during evaluation, such as soft Cross-Entropy and Kullback–Leibler divergence. Also, one image may be mapped to one or multiple soft labels.
3. Different data augmentations are used during evaluation.

Motivated by this, we propose DD-Ranking, a new benchmark for DD evaluation. DD-Ranking provides a fair evaluation scheme for DD methods that decouples the impacts of knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.

</details>

## About

DD-Ranking (DD, *i.e.*, Dataset Distillation) is an integrated and easy-to-use benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that decouples the impacts of knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.

<!-- Hard label is tested -->
<!-- Keep the same compression ratio, comparing with random selection -->
**Performance benchmark**

<span style="color: #ffff00;">[To Verify]:</span> Revisit the original goal of dataset distillation:
> The idea is to synthesize a small number of data points that do not need to come from the correct data distribution, but will, when given to the learning algorithm as training data, approximate the model trained on the original data.
>

The evaluation method for DD-Ranking is grounded in the essence of dataset distillation, aiming to better reflect the information content of the synthesized data by assessing the following two aspects:
1. The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR}=\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}}$.

2. The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR}=\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}}$.

$\text{Acc.}$ is the accuracy of models trained on different samples. The subscripts denote the training samples:
- $\text{full-hard}$: Full dataset with hard labels;
- $\text{syn-hard}$: Synthetic dataset with hard labels;
- $\text{syn-any}$: Synthetic dataset with personalized evaluation methods (hard or soft labels);
- $\text{rdm-any}$: Randomly selected dataset (under the same compression ratio) with the same personalized evaluation methods.

To rank different methods, we combine the above two metrics as follows:

$$\text{IOR}/\text{HLR} = \frac{\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}}}{\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}}}$$
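
To make the combination concrete, here is a minimal sketch of how HLR, IOR, and their ratio are computed from the four accuracies; the accuracy values below are hypothetical placeholders, not results produced by the benchmark:

```python
# Minimal sketch of the DD-Ranking metrics; the accuracies below are
# hypothetical placeholders, not results produced by the benchmark.
acc_full_hard = 0.85  # full dataset, hard labels
acc_syn_hard = 0.45   # synthetic dataset, hard labels
acc_syn_any = 0.60    # synthetic dataset, personalized evaluation (hard or soft labels)
acc_rdm_any = 0.40    # random subset (same compression ratio), same personalized evaluation

hlr = acc_full_hard - acc_syn_hard  # hard label recovery
ior = acc_syn_any - acc_rdm_any     # improvement over random
score = ior / hlr                   # combined ranking score IOR/HLR

print(f"HLR = {hlr:.3f}, IOR = {ior:.3f}, IOR/HLR = {score:.3f}")
```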

DD-Ranking is integrated with:
<!-- Uniform Fair Labels: loss on soft label -->
<!-- Data Aug. -->
- <span style="color: #ffff00;">[To Verify]:</span> Multiple [strategies](https://github.com/NUS-HPC-AI-Lab/DD-Ranking/tree/main/dd_ranking/loss) of using soft labels;
- <span style="color: #ffff00;">[To Verify]:</span> Data augmentation, reconsidered as [optional tricks](https://github.com/NUS-HPC-AI-Lab/DD-Ranking/tree/main/dd_ranking/aug) in DD;
- <span style="color: #ffff00;">[To Verify]:</span> Commonly used [model architectures](https://github.com/NUS-HPC-AI-Lab/DD-Ranking/blob/main/dd_ranking/utils/networks.py) in DD;
- <span style="color: #ffff00;">[To Verify]:</span> A new ranking of representative DD methods.

DD-Ranking is flexible and easy to use, supported by:
<!-- Default configs: Customized configs -->
<!-- Integrated classes: 1) Optimizer, etc.; 2) random selection tests (additionally, w/ or w/o hard labels) -->
- <span style="color: #ffff00;">[To Verify]:</span> Extensive configs provided;
- <span style="color: #ffff00;">[To Verify]:</span> Customized configs;
- <span style="color: #ffff00;">[To Verify]:</span> A testing and training framework with integrated metrics.

## Coming Soon

- <span style="color: #ffff00;">[To Verify]:</span> Ranking on different data augmentation methods.

## Tutorial

Install DD-Ranking with `pip` or from [source](https://github.com/NUS-HPC-AI-Lab/DD-Ranking/tree/main):

### Installation

From pip:

```bash
pip install dd_ranking
```

From source:

```bash
python setup.py install
```
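
As a quick sanity check (assuming only the package name `dd_ranking` used throughout this guide), you can verify that the installation imports cleanly:

```python
# Verify the installation: this should run without raising ImportError.
import dd_ranking  # noqa: F401

print("dd_ranking imported successfully")
```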

### Quickstart

Below is a step-by-step guide on how to use our `dd_ranking`. This demo is based on soft labels (source code can be found in `demo_soft.py`). You can find the hard label demo in `demo_hard.py`.

**Step 1**: Initialize a soft-label metric evaluator object. Config files are the recommended way to specify hyper-parameters. Sample config files are provided [here](https://github.com/NUS-HPC-AI-Lab/DD-Ranking/tree/main/configs).

```python
from dd_ranking.metrics import Soft_Label_Objective_Metrics
from dd_ranking.config import Config

config = Config.from_file("./configs/Demo_Soft_Label.yaml")
soft_obj = Soft_Label_Objective_Metrics(config)
```

<details>
<summary>You can also pass keyword arguments.</summary>

```python
device = "cuda"
method_name = "DATM"  # Specify your method name
ipc = 10  # Specify your IPC
dataset = "CIFAR10"  # Specify your dataset name
syn_data_dir = "./data/CIFAR10/IPC10/"  # Specify your synthetic data path
real_data_dir = "./datasets"  # Specify your dataset path
model_name = "ConvNet-3"  # Specify your model name
teacher_dir = "./teacher_models"  # Specify your path to teacher model checkpoints
im_size = (32, 32)  # Specify your image size
dsa_params = {  # Specify your data augmentation parameters
    "prob_flip": 0.5,
    "ratio_rotate": 15.0,
    "saturation": 2.0,
    "brightness": 1.0,
    "contrast": 0.5,
    "ratio_scale": 1.2,
    "ratio_crop_pad": 0.125,
    "ratio_cutout": 0.5
}
save_path = f"./results/{dataset}/{model_name}/IPC{ipc}/dm_hard_scores.csv"

"""We only list arguments that usually need specifying."""
soft_label_metric_calc = Soft_Label_Objective_Metrics(
    dataset=dataset,
    real_data_path=real_data_dir,
    ipc=ipc,
    model_name=model_name,
    soft_label_criterion='sce',  # Use Soft Cross-Entropy loss
    soft_label_mode='S',  # Use one-to-one image to soft label mapping
    data_aug_func='dsa',  # Use DSA data augmentation
    aug_params=dsa_params,  # Specify DSA parameters
    im_size=im_size,
    stu_use_torchvision=False,
    tea_use_torchvision=False,
    teacher_dir=teacher_dir,
    device=device,
    save_path=save_path
)
```
</details>

For a detailed explanation of the hyper-parameters, please refer to our <a href="">documentation</a>.

**Step 2:** Load your synthetic data, labels (if any), and learning rate (if any).

```python
import torch

syn_images = torch.load('/your/path/to/syn/images.pt')
# You must specify your soft labels if your soft label mode is 'S'
soft_labels = torch.load('/your/path/to/syn/labels.pt')
syn_lr = torch.load('/your/path/to/syn/lr.pt')
```
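
Optionally, a quick shape check can catch loading mistakes before the evaluation runs. The expected sizes below are assumptions for the CIFAR-10, IPC-10 example above (with one soft label per image under `soft_label_mode='S'`), not requirements imposed by the library:

```python
# Hypothetical sanity check for the CIFAR-10 / IPC-10 example above
# (10 classes x 10 images per class = 100 synthetic images).
num_classes = 10
expected_images = ipc * num_classes

assert syn_images.shape[0] == expected_images, syn_images.shape
# With soft_label_mode='S' we assume one soft label (a class distribution) per image.
assert soft_labels.shape == (expected_images, num_classes), soft_labels.shape
print("Loaded", expected_images, "synthetic images of size", tuple(syn_images.shape[1:]))
```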

**Step 3:** Compute the metrics.

```python
metric = soft_label_metric_calc.compute_metrics(syn_images, soft_labels=soft_labels, syn_lr=syn_lr)
```

The following results will be returned to you:
- `HLR mean`: The mean of hard label recovery over `num_eval` runs.
- `HLR std`: The standard deviation of hard label recovery over `num_eval` runs.
- `IOR mean`: The mean of improvement over random over `num_eval` runs.
- `IOR std`: The standard deviation of improvement over random over `num_eval` runs.
- `IOR/HLR mean`: The mean of IOR/HLR over `num_eval` runs.
- `IOR/HLR std`: The standard deviation of IOR/HLR over `num_eval` runs.
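
For example, assuming the returned `metric` object behaves like a dictionary keyed by the names above (an assumption; see the documentation for the exact return type), the results can be printed as follows:

```python
# Assumption: `metric` behaves like a dict keyed by the result names listed above.
for key in ["HLR mean", "HLR std", "IOR mean", "IOR std", "IOR/HLR mean", "IOR/HLR std"]:
    print(f"{key}: {metric[key]:.4f}")
```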

<!-- Our <span style="color: #ff0000;">[TODO]:</span>[documentation]() to learn more.

- [Installation]()
- [Quickstart]()
- [Supported Models]() -->

## Contributing

<!-- Only PR for the 1st version of DD-Ranking -->
Feel free to submit your results to update the DD-Ranking leaderboard. We welcome and value any contributions and collaborations.
Please check out [CONTRIBUTING.md](./CONTRIBUTING.md) for how to get involved.

<!-- ## Acknowledgement

DD-Ranking is a community project. The compute resources for development and testing are supported by the following organizations. Thanks for your support! -->

<!-- Note: Please sort them in alphabetical order. -->
<!-- Note: Please keep these consistent with docs/source/community/sponsors.md -->

<!-- - First Org.

We also have an official fundraising venue through <span style="color: #ff0000;">[TODO]:</span>[the collection website](). We plan to use the fund to support the development, maintenance, and adoption of DD-Ranking. -->

<!-- Paper to be added -->
<!-- If a pre-print is wanted, a digital asset could be released first. -->

<!-- ## Citation

If you use DD-Ranking for your research, please cite our [paper]():
```bibtex
@inproceedings{,
  title={DD-Ranking: },
  author={},
  booktitle={},
  year={2024}
}
```
-->

<!-- ## Contact Us

**Community Discussions**: Engage with other users on <span style="color: #ff0000;">[TODO]:</span>[Discord]() for discussions.

**Coordination of Contributions and Development**: Use <span style="color: #ff0000;">[TODO]:</span>[Slack]() for coordinating contributions and managing development efforts.

**Collaborations and Partnerships**: For exploring collaborations or partnerships, reach out via <span style="color: #ff0000;">[TODO]:</span>[email]().

**Technical Queries and Feature Requests**: Utilize GitHub issues or discussions for addressing technical questions and proposing new features.

**Security Disclosures**: Report security vulnerabilities through GitHub's security advisory feature. -->

book/.nojekyll

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
This file makes sure that GitHub Pages doesn't process mdBook's output.
