CUDA out of memory when embedding and indexing corpus #96

@dsyislearning

Describe the bug
CUDA runs out of memory when the default Retriever Server uses the multimodal embedding model colpali-v1.3-merged to embed 6492 images. I suspect this happens because the retriever reads the entire corpus at once and embeds it in a single batch. Could you please consider optimizing the tools (retriever.retriever_init, retriever.retriever_embed, retriever.retriever_index) so that the corpus is loaded and embedded in smaller chunks? Thanks.
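To illustrate the kind of change being requested, here is a minimal sketch of chunked embedding. It is not the retriever's actual code: `iter_batches`, `embed_in_batches`, and `fake_embed` are hypothetical names, and `fake_embed` is a stand-in for the real model call. The point is only that the corpus is consumed lazily, so at most `batch_size` items are handed to the model at a time.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence


def iter_batches(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size items without materializing the input."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def embed_in_batches(
    items: Iterable[str],
    embed_fn: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int,
) -> List[List[float]]:
    """Embed items batch by batch and collect the results on the host."""
    embeddings: List[List[float]] = []
    for batch in iter_batches(items, batch_size):
        vectors = embed_fn(batch)  # only this batch is passed to the model
        if len(vectors) != len(batch):
            raise RuntimeError("embed_fn returned a wrong number of vectors")
        embeddings.extend(vectors)
    return embeddings


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for the model: one 1-dimensional vector per item."""
    return [[float(len(path))] for path in batch]


if __name__ == "__main__":
    # Hypothetical image paths; 6492 matches the corpus size in this report.
    paths = (f"img_{i}.png" for i in range(6492))
    result = embed_in_batches(paths, fake_embed, batch_size=256)
    print(len(result))
```

In a real implementation, `embed_fn` would presumably wrap the model's forward pass and move each batch's output off the GPU before the next batch starts. For larger corpora the results could also be appended to disk instead of kept in a Python list.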

To Reproduce
pipeline parameter files:

```yaml
# courpus_index_parameter.yaml
retriever:
  corpus_path: data/corpus.jsonl
  cuda_devices: 4,5,6,7
  embedding_path: embedding/embedding.npy
  faiss_use_gpu: true
  index_chunk_size: 50000
  index_path: index/index.index
  infinity_kwargs:
    batch_size: 256
    bettertransformer: false
    device: cuda
    model_warmup: false
    pooling_method: auto
    engine: torch
  is_multimodal: true
  overwrite: true
  retriever_path: vidore/colpali-v1.3-merged
```
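For a rough sense of scale, the sketch below estimates how much memory the raw embedding output alone would take. The numbers are assumptions, not measurements from this run: it assumes a multi-vector model that emits about 1030 vectors of 128 dimensions per image (figures commonly cited for ColPali), stored as 4-byte floats and not pooled. Activation memory during the forward pass is not included and is likely the larger factor at `batch_size: 256`. `embedding_bytes` and `to_gib` are hypothetical helper names.

```python
def embedding_bytes(num_items: int, vectors_per_item: int, dim: int,
                    bytes_per_value: int) -> int:
    """Size in bytes of a dense (num_items, vectors_per_item, dim) array."""
    return num_items * vectors_per_item * dim * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (1 GiB = 1024**3 bytes)."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    # ASSUMED shape: 1030 vectors x 128 dims per image, float32 (4 bytes).
    whole_corpus = embedding_bytes(6492, 1030, 128, 4)
    one_batch = embedding_bytes(256, 1030, 128, 4)
    print(f"whole corpus: {to_gib(whole_corpus):.2f} GiB")
    print(f"one batch of 256: {to_gib(one_batch):.2f} GiB")
```

Under these assumptions the full output is about 3.2 GiB, while a single batch of 256 is about 0.13 GiB, so keeping only one batch of outputs on the GPU at a time would shrink the output footprint considerably.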

Labels: bug
