Commit c2b09e1

chore: update source files for the multi-format example (#860)
* chore: update source files for the multi-format example
* swap one file to reduce size
* delete one image
1 parent d955251 commit c2b09e1

12 files changed, +13 -0 lines changed

examples/multi_format_indexing/README.md

Lines changed: 13 additions & 0 deletions
@@ -55,6 +55,19 @@ Run:
 python main.py
 ```
 
+## Data Attribution
+
+The example data files used in this demonstration come from the following sources:
+
+### PDF Documents
+- **ArXiv Papers**: Research papers sourced from [ArXiv](https://arxiv.org/), an open-access repository of electronic preprints covering various scientific disciplines.
+
+### Image Documents
+- **Healthcare Industry Dataset**: Images from the [vidore/syntheticDocQA_healthcare_industry_test](https://huggingface.co/datasets/vidore/syntheticDocQA_healthcare_industry_test) dataset on Hugging Face, which contains synthetic document question-answering data for healthcare industry documents.
+- **ESG Reports Dataset**: Images from the [vidore/esg_reports_eng_v2](https://huggingface.co/datasets/vidore/esg_reports_eng_v2) dataset on Hugging Face, containing Environmental, Social, and Governance (ESG) reports.
+
+We thank the creators and maintainers of these datasets for making their data available for research and development purposes.
+
 ## About ColPali
 This example uses [ColPali](https://github.com/illuin-tech/colpali), a state-of-the-art vision-language model that enables:
 - Direct visual understanding of document layouts, tables, and figures
Binary files changed (contents not shown): 1.09 MB, -403 KB, -986 KB, -40.8 KB, -321 KB, 189 KB, 211 KB, 280 KB, 536 KB
