Dataset Distillation (DD) aims to condense a large dataset into a much smaller one.
Notably, more and more methods are transitioning from "hard labels" to "soft labels" in dataset distillation, especially during evaluation. **Hard labels** are categorical, in the same format as the labels of the real dataset. **Soft labels** are probability distributions, typically generated by a pre-trained teacher model.
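As a minimal illustration of the two label formats (a sketch assuming PyTorch; `teacher_logits` below is a hypothetical stand-in for the output of a real pre-trained teacher):

```python
import torch
import torch.nn.functional as F

num_classes = 10

# Hard label: a single categorical class index, the same format as the real dataset.
hard_label = torch.tensor(3)

# Soft label: a probability distribution over all classes, typically obtained
# by running a pre-trained teacher model on the distilled image.
teacher_logits = torch.randn(num_classes)  # hypothetical stand-in for teacher(image)
soft_label = F.softmax(teacher_logits, dim=0)

print(hard_label)        # tensor(3)
print(soft_label.sum())  # tensor(1.) -- a valid distribution over 10 classes
```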
Recently, Deng et al. pointed out that "a label is worth a thousand images". They showed analytically that soft labels are extremely useful for improving accuracy.
However, since the essence of soft labels is **knowledge distillation**, we find that when the same evaluation method is applied to randomly selected data, the test accuracy also improves significantly (see the figure above).
This makes us wonder: **Can the test accuracy of the model trained on distilled data reflect the real informativeness of the distilled data?**
Additionally, we have discovered that using only test accuracy to demonstrate a method's performance is unfair, in the following three respects:
1. Results obtained with hard labels and with soft labels are not directly comparable, since soft labels introduce teacher knowledge.
2. Strategies for using soft labels are diverse. For instance, different objective functions are used during evaluation, such as soft cross-entropy and Kullback–Leibler (KL) divergence; a sketch contrasting the two follows this list. Also, one image may be mapped to one or multiple soft labels.
3. Different data augmentations are used during evaluation.
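To make the second point concrete, here is a minimal sketch of the two common soft-label objectives (assuming PyTorch; `student_logits` and `soft_labels` are hypothetical placeholders, not DD-Ranking's API):

```python
import torch
import torch.nn.functional as F

# Hypothetical placeholders: logits from the student model under evaluation,
# and soft labels produced by a pre-trained teacher.
student_logits = torch.randn(8, 10)                 # batch of 8, 10 classes
soft_labels = F.softmax(torch.randn(8, 10), dim=1)

log_probs = F.log_softmax(student_logits, dim=1)

# Objective 1: soft cross-entropy -- cross-entropy computed against the teacher's
# full probability distribution instead of a one-hot hard label.
soft_ce = -(soft_labels * log_probs).sum(dim=1).mean()

# Objective 2: KL divergence between the teacher and student distributions.
kl = F.kl_div(log_probs, soft_labels, reduction="batchmean")

print(f"soft CE: {soft_ce.item():.4f}  KL: {kl.item():.4f}")
```

The two objectives differ only by the entropy of the teacher distribution, which is constant with respect to the student; in practice, however, methods also vary temperature scaling and loss weighting, so reported accuracies are not directly comparable.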
Motivated by this, we propose DD-Ranking, a new benchmark for DD evaluation. DD-Ranking provides a fair evaluation scheme for DD methods and decouples the impacts of knowledge distillation and data augmentation, so as to reflect the real informativeness of the distilled data.