We present a comprehensive approach to democratizing access to scientific knowledge through large-scale, **structured summarization** of academic literature.
-We retrieved and processed ~**100 million** research papers from the public internet, leveraging existing datasets from **bethgelab**, **PeS2o**, **Hugging Face**, and **Common Pile**.
+We retrieved and processed ~**100 million** research papers from the public internet, leveraging existing datasets from **bethgelab**, **PeS2o**, **Hugging Face**, and **Common Pile**.

<p align="center">
<img src="/images/blog/sci5.png"
@@ -42,7 +42,7 @@ Access to scientific knowledge remains constrained by paywalls, licensing, and c
### 2.1 Dataset Collection & Processing

-Primary corpus: ~**100M** research papers retrieved from the public internet. After deduplication, we **supplemented** with: *
+Primary corpus: ~**100M** research papers retrieved from the public internet through a collaboration with Grass. After deduplication, we **supplemented** with: *
@@ -194,7 +194,7 @@ We invite **researchers, librarians, and open-access advocates** to help us **ga
## Acknowledgments

-This is a collaboration between **LAION** and **Inference.net**. We thank all contributors, especially **Tawsif Ratul** for data collection, and **Prof. Sören Auer**, **Dr. Gollam Rabby**, and the **TIB – Leibniz Information Centre for Science and Technology** for scientific advice and support.
+This is a collaboration between **LAION**, **Grass** and **Inference.net**. We thank all contributors, especially **Tawsif Ratul** for data collection, and **Prof. Sören Auer**, **Dr. Gollam Rabby**, and the **TIB – Leibniz Information Centre for Science and Technology** for scientific advice and support.