Commit 222a3c2

author
SebastienMelo
committed
2 parents 1385cf1 + 13860a8 commit 222a3c2

1 file changed: +4 -3 lines changed

python_scripts/cross_validation_grouping.py

Lines changed: 4 additions & 3 deletions
@@ -110,9 +110,10 @@
 print(digits.DESCR)
 
 # %% [markdown]
-# If we read carefully, 13 writers wrote the digits of our dataset, accounting
-# for a total amount of 1797 samples. Thus, a writer wrote several times the
-# same numbers. Let's suppose that the writer samples are grouped. Subsequently,
+# If we read carefully, `load_digits` loads a copy of the **test set** of the
+# UCI ML hand-written digits dataset, which consists of 1797 images by
+# **13 different writers**. Thus, each writer wrote several times the same
+# numbers. Let's suppose the dataset is ordered by writer. Subsequently,
 # not shuffling the data will keep all writer samples together either in the
 # training or the testing sets. Mixing the data will break this structure, and
 # therefore digits written by the same writer will be available in both the
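
The claim in the changed paragraph (an unshuffled split keeps each writer on one side, while shuffling mixes writers across both sides) can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: `kfold_indices` and `count_shared_writers` are hypothetical helpers standing in for a K-fold splitter, not scikit-learn's implementation, and the layout of 13 writers with 10 consecutive samples each is invented; the real dataset has 1797 samples and its writer boundaries are not given in this diff.

```python
import random


def kfold_indices(n_samples, n_splits, shuffle=False, seed=0):
    """Yield (train, test) index lists for a simple K-fold split.

    Hypothetical helper for illustration only, not scikit-learn's KFold.
    """
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    fold_size = n_samples // n_splits
    for k in range(n_splits):
        start = k * fold_size
        # The last fold absorbs any remainder.
        stop = n_samples if k == n_splits - 1 else start + fold_size
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test


def count_shared_writers(groups, n_splits, shuffle):
    """For each fold, count the writers present in both train and test."""
    shared = []
    for train, test in kfold_indices(len(groups), n_splits, shuffle=shuffle):
        train_writers = {groups[i] for i in train}
        test_writers = {groups[i] for i in test}
        shared.append(len(train_writers & test_writers))
    return shared


# Assumed layout: 13 writers, 10 consecutive samples each (ordered by writer).
n_writers, samples_per_writer = 13, 10
groups = [w for w in range(n_writers) for _ in range(samples_per_writer)]

ordered = count_shared_writers(groups, n_splits=13, shuffle=False)
shuffled = count_shared_writers(groups, n_splits=13, shuffle=True)

print("shared writers per fold, no shuffle:", ordered)
print("shared writers per fold, shuffled:  ", shuffled)
```

Under this assumed layout each unshuffled test fold holds exactly one writer, so no writer appears on both sides of any split. After shuffling, the test folds draw samples from many writers who also remain in the training set, which is the leakage the paragraph describes.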
