
Commit ad2f7ea

Update
1 parent dee4fdc commit ad2f7ea

8 files changed: +24 / -32 lines changed


book/index.html

Lines changed: 6 additions & 7 deletions
@@ -152,11 +152,10 @@ <h1 class="menu-title">DD-Ranking Benchmark</h1>
152152

153153
<div id="content" class="content">
154154
<main>
155-
<h1 align="center">DD-Ranking</h1>
156-
<p>DD-Ranking (DD, <em>i.e.</em>, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.</p>
155+
<p>DD-Ranking (DD, <em>i.e.</em>, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.</p>
157156
<h2 id="motivation"><a class="header" href="#motivation">Motivation</a></h2>
158157
<p>Dataset Distillation (DD) aims to condense a large dataset into a much smaller one, which allows a model to achieve comparable performance after training on it. DD has gained extensive attention since it was proposed. With some foundational methods such as DC, DM, and MTT, various works have further pushed this area to a new standard with their novel designs.</p>
159-
<p><img src="../static/history.png" alt="history" /></p>
158+
<p><img src="static/history.png" alt="history" /></p>
160159
<p>Notably, more and more methods are transitioning from "hard label" to "soft label" in dataset distillation, especially during evaluation. <strong>Hard labels</strong> are categorical, having the same format as the real dataset. <strong>Soft labels</strong> are distributions, typically generated by a pre-trained teacher model.
161160
Recently, Deng et al. pointed out that "a label is worth a thousand images". They showed analytically that soft labels are extremely useful for accuracy improvement.</p>
162161
<p>However, since the essence of soft labels is <strong>knowledge distillation</strong>, we want to ask a question: <strong>Can the test accuracy of the model trained on distilled data reflect the real informativeness of the distilled data?</strong></p>
@@ -182,21 +181,21 @@ <h2 id="dd-ranking-score"><a class="header" href="#dd-ranking-score">DD-Ranking
182181
<p>The evaluation method for DD-Ranking is grounded in the essence of dataset distillation, aiming to better reflect the information content of the synthesized data by assessing the following two aspects:</p>
183182
<ol>
184183
<li>
185-
<p>The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR}=\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}}$.</p>
184+
<p>The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR} = \text{Acc}_{\text{full-hard}} - \text{Acc}_{\text{syn-hard}}$</p>
186185
</li>
187186
<li>
188-
<p>The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR}=\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}}$.
189-
$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:</p>
187+
<p>The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR} = \text{Acc}_{\text{syn-any}} - \text{Acc}_{\text{rdm-any}}$</p>
190188
</li>
191189
</ol>
190+
<p>$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:</p>
192191
<ul>
193192
<li>$\text{full-hard}$: Full dataset with hard labels;</li>
194193
<li>$\text{syn-hard}$: Synthetic dataset with hard labels;</li>
195194
<li>$\text{syn-any}$: Synthetic dataset with personalized evaluation methods (hard or soft labels);</li>
196195
<li>$\text{rdm-any}$: Randomly selected dataset (under the same compression ratio) with the same personalized evaluation methods.</li>
197196
</ul>
198197
<p>To rank different methods, we combine the above two metrics as DD-Ranking Score:</p>
199-
<p>$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}})}{(\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}})}$$</p>
198+
<p>$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc}_{\text{syn-any}}-\text{Acc}_{\text{rdm-any}})}{(\text{Acc}_{\text{full-hard}}-\text{Acc}_{\text{syn-hard}})}$$</p>
200199

201200
</main>
202201

book/introduction.html

Lines changed: 6 additions & 7 deletions
@@ -152,11 +152,10 @@ <h1 class="menu-title">DD-Ranking Benchmark</h1>
152152

153153
<div id="content" class="content">
154154
<main>
155-
<h1 align="center">DD-Ranking</h1>
156-
<p>DD-Ranking (DD, <em>i.e.</em>, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.</p>
155+
<p>DD-Ranking (DD, <em>i.e.</em>, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.</p>
157156
<h2 id="motivation"><a class="header" href="#motivation">Motivation</a></h2>
158157
<p>Dataset Distillation (DD) aims to condense a large dataset into a much smaller one, which allows a model to achieve comparable performance after training on it. DD has gained extensive attention since it was proposed. With some foundational methods such as DC, DM, and MTT, various works have further pushed this area to a new standard with their novel designs.</p>
159-
<p><img src="../static/history.png" alt="history" /></p>
158+
<p><img src="static/history.png" alt="history" /></p>
160159
<p>Notably, more and more methods are transitioning from "hard label" to "soft label" in dataset distillation, especially during evaluation. <strong>Hard labels</strong> are categorical, having the same format as the real dataset. <strong>Soft labels</strong> are distributions, typically generated by a pre-trained teacher model.
161160
Recently, Deng et al. pointed out that "a label is worth a thousand images". They showed analytically that soft labels are extremely useful for accuracy improvement.</p>
162161
<p>However, since the essence of soft labels is <strong>knowledge distillation</strong>, we want to ask a question: <strong>Can the test accuracy of the model trained on distilled data reflect the real informativeness of the distilled data?</strong></p>
@@ -182,21 +181,21 @@ <h2 id="dd-ranking-score"><a class="header" href="#dd-ranking-score">DD-Ranking
182181
<p>The evaluation method for DD-Ranking is grounded in the essence of dataset distillation, aiming to better reflect the information content of the synthesized data by assessing the following two aspects:</p>
183182
<ol>
184183
<li>
185-
<p>The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR}=\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}}$.</p>
184+
<p>The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR} = \text{Acc}_{\text{full-hard}} - \text{Acc}_{\text{syn-hard}}$</p>
186185
</li>
187186
<li>
188-
<p>The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR}=\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}}$.
189-
$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:</p>
187+
<p>The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR} = \text{Acc}_{\text{syn-any}} - \text{Acc}_{\text{rdm-any}}$</p>
190188
</li>
191189
</ol>
190+
<p>$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:</p>
192191
<ul>
193192
<li>$\text{full-hard}$: Full dataset with hard labels;</li>
194193
<li>$\text{syn-hard}$: Synthetic dataset with hard labels;</li>
195194
<li>$\text{syn-any}$: Synthetic dataset with personalized evaluation methods (hard or soft labels);</li>
196195
<li>$\text{rdm-any}$: Randomly selected dataset (under the same compression ratio) with the same personalized evaluation methods.</li>
197196
</ul>
198197
<p>To rank different methods, we combine the above two metrics as DD-Ranking Score:</p>
199-
<p>$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}})}{(\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}})}$$</p>
198+
<p>$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc}_{\text{syn-any}}-\text{Acc}_{\text{rdm-any}})}{(\text{Acc}_{\text{full-hard}}-\text{Acc}_{\text{syn-hard}})}$$</p>
200199

201200
</main>
202201

book/print.html

Lines changed: 6 additions & 7 deletions
@@ -153,11 +153,10 @@ <h1 class="menu-title">DD-Ranking Benchmark</h1>
153153

154154
<div id="content" class="content">
155155
<main>
156-
<h1 align="center">DD-Ranking</h1>
157-
<p>DD-Ranking (DD, <em>i.e.</em>, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.</p>
156+
<p>DD-Ranking (DD, <em>i.e.</em>, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.</p>
158157
<h2 id="motivation"><a class="header" href="#motivation">Motivation</a></h2>
159158
<p>Dataset Distillation (DD) aims to condense a large dataset into a much smaller one, which allows a model to achieve comparable performance after training on it. DD has gained extensive attention since it was proposed. With some foundational methods such as DC, DM, and MTT, various works have further pushed this area to a new standard with their novel designs.</p>
160-
<p><img src="../static/history.png" alt="history" /></p>
159+
<p><img src="static/history.png" alt="history" /></p>
161160
<p>Notably, more and more methods are transitioning from "hard label" to "soft label" in dataset distillation, especially during evaluation. <strong>Hard labels</strong> are categorical, having the same format as the real dataset. <strong>Soft labels</strong> are distributions, typically generated by a pre-trained teacher model.
162161
Recently, Deng et al. pointed out that "a label is worth a thousand images". They showed analytically that soft labels are extremely useful for accuracy improvement.</p>
163162
<p>However, since the essence of soft labels is <strong>knowledge distillation</strong>, we want to ask a question: <strong>Can the test accuracy of the model trained on distilled data reflect the real informativeness of the distilled data?</strong></p>
@@ -183,21 +182,21 @@ <h2 id="dd-ranking-score"><a class="header" href="#dd-ranking-score">DD-Ranking
183182
<p>The evaluation method for DD-Ranking is grounded in the essence of dataset distillation, aiming to better reflect the information content of the synthesized data by assessing the following two aspects:</p>
184183
<ol>
185184
<li>
186-
<p>The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR}=\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}}$.</p>
185+
<p>The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR} = \text{Acc}_{\text{full-hard}} - \text{Acc}_{\text{syn-hard}}$</p>
187186
</li>
188187
<li>
189-
<p>The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR}=\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}}$.
190-
$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:</p>
188+
<p>The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR} = \text{Acc}_{\text{syn-any}} - \text{Acc}_{\text{rdm-any}}$</p>
191189
</li>
192190
</ol>
191+
<p>$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:</p>
193192
<ul>
194193
<li>$\text{full-hard}$: Full dataset with hard labels;</li>
195194
<li>$\text{syn-hard}$: Synthetic dataset with hard labels;</li>
196195
<li>$\text{syn-any}$: Synthetic dataset with personalized evaluation methods (hard or soft labels);</li>
197196
<li>$\text{rdm-any}$: Randomly selected dataset (under the same compression ratio) with the same personalized evaluation methods.</li>
198197
</ul>
199198
<p>To rank different methods, we combine the above two metrics as DD-Ranking Score:</p>
200-
<p>$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}})}{(\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}})}$$</p>
199+
<p>$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc}_{\text{syn-any}}-\text{Acc}_{\text{rdm-any}})}{(\text{Acc}_{\text{full-hard}}-\text{Acc}_{\text{syn-hard}})}$$</p>
201200
<div style="break-before: page; page-break-before: always;"></div><h2 id="installation"><a class="header" href="#installation">Installation</a></h2>
202201
<p>From pip</p>
203202
<pre><code class="language-bash">pip install dd_ranking

book/searchindex.js

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

book/searchindex.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

book/static/history.png

73.6 KB

doc/introduction.md

Lines changed: 4 additions & 9 deletions
@@ -1,11 +1,8 @@
1-
21
DD-Ranking (DD, *i.e.*, Dataset Distillation) is an integrated and easy-to-use evaluation benchmark for dataset distillation. It aims to provide a fair evaluation scheme for DD methods that can decouple the impacts from knowledge distillation and data augmentation to reflect the real informativeness of the distilled data.
3-
42
## Motivation
5-
63
Dataset Distillation (DD) aims to condense a large dataset into a much smaller one, which allows a model to achieve comparable performance after training on it. DD has gained extensive attention since it was proposed. With some foundational methods such as DC, DM, and MTT, various works have further pushed this area to a new standard with their novel designs.
74

8-
![history](../static/history.png)
5+
![history](static/history.png)
96

107
Notably, more and more methods are transitioning from "hard label" to "soft label" in dataset distillation, especially during evaluation. **Hard labels** are categorical, having the same format as the real dataset. **Soft labels** are distributions, typically generated by a pre-trained teacher model.
118
Recently, Deng et al. pointed out that "a label is worth a thousand images". They showed analytically that soft labels are extremely useful for accuracy improvement.
@@ -33,11 +30,9 @@ Revisit the original goal of dataset distillation:
3330
>
3431
3532
The evaluation method for DD-Ranking is grounded in the essence of dataset distillation, aiming to better reflect the information content of the synthesized data by assessing the following two aspects:
36-
1. The degree to which the original dataset is recovered under hard labels (hard label recovery):
37-
$\text{HLR}=\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}}$.
33+
1. The degree to which the original dataset is recovered under hard labels (hard label recovery): $\text{HLR} = \text{Acc.}_{\text{full-hard}} - \text{Acc.}_{\text{syn-hard}}$
3834

39-
2. The improvement over random selection when using personalized evaluation methods (improvement over random):
40-
$\text{IOR}=\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}}$.
35+
2. The improvement over random selection when using personalized evaluation methods (improvement over random): $\text{IOR} = \text{Acc.}_{\text{syn-any}} - \text{Acc.}_{\text{rdm-any}}$
4136

4237
$\text{Acc.}$ is the accuracy of models trained on different samples. Samples' marks are as follows:
4338

@@ -48,5 +43,5 @@ $\text{Acc.}$ is the accuracy of models trained on different samples. Samples' m
4843

4944
To rank different methods, we combine the above two metrics as DD-Ranking Score:
5045

51-
$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}})}{(\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}})}$$
46+
$$\text{DD-Ranking Score} = \frac{\text{IOR}}{\text{HLR}} = \frac{(\text{Acc.}_{\text{syn-any}}-\text{Acc.}_{\text{rdm-any}})}{(\text{Acc.}_{\text{full-hard}}-\text{Acc.}_{\text{syn-hard}})}$$
5247
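
As a quick illustration of the score defined in the updated text above, here is a minimal sketch in plain Python. The function name and the accuracy values are hypothetical, chosen only to show the arithmetic; this is not the dd_ranking package API.

```python
# Minimal sketch of the DD-Ranking Score computation (illustration only;
# function name and accuracy values are hypothetical, not the dd_ranking API).

def dd_ranking_score(acc_full_hard: float, acc_syn_hard: float,
                     acc_syn_any: float, acc_rdm_any: float) -> float:
    """Combine hard label recovery (HLR) and improvement over random (IOR)."""
    hlr = acc_full_hard - acc_syn_hard  # hard label recovery: smaller is better
    ior = acc_syn_any - acc_rdm_any     # improvement over random: larger is better
    return ior / hlr


# Hypothetical accuracies for illustration.
score = dd_ranking_score(acc_full_hard=0.85, acc_syn_hard=0.45,
                         acc_syn_any=0.60, acc_rdm_any=0.40)
print(f"DD-Ranking Score = {score:.2f}")  # HLR = 0.40, IOR = 0.20, score = 0.50
```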

doc/static/history.png

73.6 KB
