datasets/bookcorpusopen/README.md
4 additions & 2 deletions
@@ -160,7 +160,9 @@ The data fields are the same among all splits.
 
 ### Licensing Information
 
-[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
+The books have been crawled from smashwords.com, see their [terms of service](https://www.smashwords.com/about/tos) for more information.
+
+A data sheet for this dataset has also been created and published in [Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus](https://arxiv.org/abs/2105.05241)
 
 ### Citation Information
 
@@ -178,4 +180,4 @@ The data fields are the same among all splits.
 
 ### Contributions
 
-Thanks to [@vblagoje](https://github.com/vblagoje) for adding this dataset.
+Thanks to [@vblagoje](https://github.com/vblagoje) for adding this dataset.