Commit 572e647

lhoestq and davanstrien authored
Update docs/hub/datasets-download-stats.md
Co-authored-by: Daniel van Strien <[email protected]>
1 parent edf34f1 commit 572e647

1 file changed, +1 -1 lines changed

docs/hub/datasets-download-stats.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 ## How are download stats generated for datasets?
 
-Counting the number of downloads for datasets is not a trivial task, as a single dataset repository might contain multiple files, from multiple subsets and splits (e.g. train/validation/test) and sometimes with many files in a single split. To avoid double counting downloads (e.g., counting a single download of a dataset as multiple downloads), the Hub counts as one download every series of files requests in an interval of 5 minutes per user. No information is sent from the user, and no additional calls are made for this. The count is done server-side as the Hub serves files for downloads. Every HTTP request to the files, including `GET` and `HEAD`, will be counted as a download.
+Counting the number of downloads for datasets is not a trivial task, as a single dataset repository might contain multiple files, from multiple subsets and splits (e.g. train/validation/test) and sometimes with many files in a single split. To solve this issue and avoid counting one person's download multiple times, we treat all files downloaded by a user within a 5-minute window as a single dataset download. This counting happens automatically on our servers when files are downloaded (through GET or HEAD requests), with no need to collect any user information or make additional calls.
 
 ## Before Setpember 2024
 
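The 5-minute rule described in the changed paragraph can be illustrated with a small sketch. This is not the Hub's implementation. It assumes that a window opens at a user's first file request for a dataset and closes 5 minutes later, that windows are tracked per dataset, and that the server has some opaque per-user key available. The names `count_downloads`, `user_key`, and the sample request tuples are hypothetical.

```python
from datetime import datetime, timedelta

# Window length taken from the doc text; everything else here is an assumption.
WINDOW = timedelta(minutes=5)


def count_downloads(requests):
    """Count dataset downloads from (user_key, dataset, timestamp) file requests.

    Assumption: a window opens at a user's first request for a dataset and
    every request from that user in the next 5 minutes joins the same download.
    """
    window_start = {}  # (user_key, dataset) -> start of the current window
    counts = {}        # dataset -> number of counted downloads
    for user_key, dataset, ts in sorted(requests, key=lambda r: r[2]):
        key = (user_key, dataset)
        start = window_start.get(key)
        if start is None or ts - start >= WINDOW:
            # No open window for this user and dataset: count a new download.
            window_start[key] = ts
            counts[dataset] = counts.get(dataset, 0) + 1
    return counts


t0 = datetime(2024, 9, 1, 12, 0, 0)
requests = [
    ("user-a", "org/dataset", t0),                          # opens a window
    ("user-a", "org/dataset", t0 + timedelta(seconds=30)),  # same window
    ("user-b", "org/dataset", t0 + timedelta(minutes=1)),   # different user
    ("user-a", "org/dataset", t0 + timedelta(minutes=4)),   # same window
    ("user-a", "org/dataset", t0 + timedelta(minutes=10)),  # new window
]
downloads = count_downloads(requests)
print(downloads)
```

In this sample, the three requests from `user-a` in the first five minutes collapse into one download, so five file requests yield three counted downloads. A sliding window that restarts on every request would be an equally plausible reading of the text and could give different counts for long-running transfers.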