CUDA out of memory when embedding and indexing corpus

**Describe the bug**
CUDA runs out of memory when I use the default Retriever Server with the multimodal embedding model colpali-v1.3-merged to embed 6492 images. I suspect the retriever reads the entire corpus at once and embeds it in a single batch. Could you please consider optimizing the tools (`retriever.retriever_init`, `retriever.retriever_embed`, `retriever.retriever_index`)? Thanks.
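To illustrate the kind of optimization I mean, here is a minimal sketch of chunked embedding. It is not the project's actual code: `embed_in_batches`, `iter_batches`, and the `embed_fn` callable are hypothetical names, and I am assuming each item yields a fixed-shape float array. Each batch is written straight into an on-disk `.npy` file, so only one batch of embeddings has to be held in memory at a time.

```python
from typing import Callable, Iterator, Sequence, Tuple

import numpy as np


def iter_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    items: Sequence,
    embed_fn: Callable[[Sequence], object],  # hypothetical: batch -> array-like
    item_shape: Tuple[int, ...],             # assumed fixed shape per embedding
    out_path: str,
    batch_size: int = 16,
) -> np.ndarray:
    """Embed `items` batch by batch, writing each result into a .npy memmap."""
    # Pre-allocate the output file on disk instead of growing a list in RAM.
    out = np.lib.format.open_memmap(
        out_path,
        mode="w+",
        dtype=np.float32,
        shape=(len(items),) + tuple(item_shape),
    )
    offset = 0
    for batch in iter_batches(items, batch_size):
        # Only this batch's embeddings exist in memory at this point.
        vecs = np.asarray(embed_fn(batch), dtype=np.float32)
        out[offset:offset + len(vecs)] = vecs
        offset += len(vecs)
    out.flush()
    return out
```

In a real run, `embed_fn` would wrap the model call and move each batch's output off the GPU before returning it. The resulting file can be read back with `np.load(out_path, mmap_mode="r")`, which would also let the indexing step consume it in chunks.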

**To Reproduce**
pipeline parameter files:
```yaml
# courpus_index_parameter.yaml
retriever:
  corpus_path: data/corpus.jsonl
  cuda_devices: 4,5,6,7
  embedding_path: embedding/embedding.npy
  faiss_use_gpu: true
  index_chunk_size: 50000
  index_path: index/index.index
  infinity_kwargs:
    batch_size: 256
    bettertransformer: false
    device: cuda
    model_warmup: false
    pooling_method: auto
    engine: torch
  is_multimodal: true
  overwrite: true
  retriever_path: vidore/colpali-v1.3-merged
```

