
Commit 8a450fc

Revise title and dataset source in summaries.md
Updated the title and modified the dataset collection description.
1 parent b2647ef commit 8a450fc

1 file changed (+2, -2 lines changed)

notes/summaries.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 ---
-title: "AELLA (Autonomous Extraction of Linked Literature for Accessibility): The Inference.net × LAION × Grass Initiative"
+title: "AELLA (Autonomous Extraction of Linked Literature for Accessibility): The Inference.net × LAION"
 author: "Christoph Schuhmann, Amarjot Singh, Andrii Prolorenzo, Andrej Radonjic, Sean Smith, and Sam Hogan"
 date: "November 11 2025"
 previewImg: "/images/blog/sci3.jpg"
@@ -42,7 +42,7 @@ Access to scientific knowledge remains constrained by paywalls, licensing, and c
 
 ### 2.1 Dataset Collection & Processing
 
-Primary corpus: ~**100M** research papers retrieved via collaboration with **Wynd Labs** using the **Grass** network. After deduplication, we **supplemented** with: *
+Primary corpus: ~**100M** research papers retrieved from the public internet. After deduplication, we **supplemented** with: *
 **bethgelab**: *paper_parsed_jsons* ([dataset](https://huggingface.co/datasets/bethgelab/paper_parsed_jsons)) *
 
 **LAION**: *COREX-18text* ([dataset](https://huggingface.co/datasets/laion/COREX-18text)) *
