Merged
13 changes: 13 additions & 0 deletions examples/multi_format_indexing/README.md
@@ -55,6 +55,19 @@ Run:
python main.py
```

## Data Attribution

The example data files used in this demonstration come from the following sources:

### PDF Documents
- **arXiv Papers**: Research papers sourced from [arXiv](https://arxiv.org/), an open-access repository of electronic preprints covering various scientific disciplines.

### Image Documents
- **Healthcare Industry Dataset**: Images from the [vidore/syntheticDocQA_healthcare_industry_test](https://huggingface.co/datasets/vidore/syntheticDocQA_healthcare_industry_test) dataset on Hugging Face, which contains synthetic document question-answering data for healthcare industry documents.
- **ESG Reports Dataset**: Images from the [vidore/esg_reports_eng_v2](https://huggingface.co/datasets/vidore/esg_reports_eng_v2) dataset on Hugging Face, containing Environmental, Social, and Governance (ESG) reports.

We thank the creators and maintainers of these datasets for making their data available for research and development purposes.

## About ColPali
This example uses [ColPali](https://github.com/illuin-tech/colpali), a state-of-the-art vision-language model that enables:
- Direct visual understanding of document layouts, tables, and figures
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.