This document explains how to run the RTEB (Retrieval Embedding Benchmark) application using Docker.
- Docker installed on your system
- Docker Compose installed on your system
- NVIDIA Docker runtime (only if you need GPU support)
Make sure Docker is running on your system.
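One way to confirm that the daemon is reachable before launching is to call `docker info`, which exits with a non-zero status when the daemon cannot be contacted. The sketch below wraps that check in a small helper. The function name `docker_ready` and the `DOCKER_BIN` override are illustrative assumptions and are not part of RTEB.

```shell
#!/bin/sh
# Hypothetical helper (not part of RTEB): succeeds only if the Docker CLI
# can reach a running daemon. DOCKER_BIN is an assumed override that lets
# a different binary stand in for `docker`.
docker_ready() {
    "${DOCKER_BIN:-docker}" info >/dev/null 2>&1
}

if docker_ready; then
    echo "Docker is running."
else
    echo "Docker is not reachable; start the daemon before running ./run_rteb.sh." >&2
fi
```

If the check fails on a machine where Docker is installed, starting the Docker service (or Docker Desktop) and re-running the check is usually enough.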
Run the application with default settings:
./run_rteb.sh
This will use the application's built-in defaults:
- Data path: "data/"
- Save path: "output/"
- CPU mode (no GPUs)
Run with custom arguments:
./run_rteb.sh --gpus 2 --batch_size 32 --save_embds
All arguments supported by the RTEB application can be passed directly to the Docker container. Here are some common ones:
- --gpus <num>: Number of GPUs to use (default: 0, requires NVIDIA Docker runtime)
- --cpus <num>: Number of CPUs to use (default: 1)
- --batch_size <num>: Batch size for encoding (default: 16)
- --data_path <path>: Path to the dataset (default: /app/data)
- --save_path <path>: Path to save output (default: /app/output)
- --save_embds: Save embeddings
- --load_embds: Load pre-computed embeddings
- --overwrite: Overwrite existing results
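To illustrate how these flags combine, the sketch below builds a command line from a GPU count and a batch size and falls back to the documented defaults when either is omitted. The `rteb_cmd` helper is hypothetical and only prints the command; the flags themselves are the ones listed above.

```shell
#!/bin/sh
# Illustrative wrapper (hypothetical, not part of RTEB): prints the command
# line for a run with the given GPU count and batch size.
rteb_cmd() {
    gpus=${1:-0}      # 0 keeps the documented CPU-only default
    batch=${2:-16}    # 16 is the documented default batch size
    echo "./run_rteb.sh --gpus $gpus --batch_size $batch --save_embds --overwrite"
}

# Print the command for a 2-GPU run with batch size 32.
rteb_cmd 2 32
```

Printing the command first makes it easy to review the flags before running it by hand.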
For a complete list of arguments, run:
./run_rteb.sh --help
The Docker setup includes:
- A Docker image with all necessary dependencies
- Volume mounts for data and output
- Optional GPU support for accelerated processing (requires NVIDIA Docker runtime)
- Memory limits to prevent out-of-memory errors
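For readers who want to see what these pieces amount to, the volume mounts, memory limit, and optional GPU access can be approximated with a plain `docker run` command. This is a sketch under stated assumptions, not the project's actual configuration: the image name `rteb`, the `8g` limit, the host directories `data/` and `output/`, and the `USE_GPU` switch are all assumed for illustration. Only the container paths /app/data and /app/output come from the documented defaults.

```shell
#!/bin/sh
# Sketch only: IMAGE and MEM_LIMIT are assumed values, not taken from the
# project's docker-compose.yml.
IMAGE=${IMAGE:-rteb}          # hypothetical image name
MEM_LIMIT=${MEM_LIMIT:-8g}    # hypothetical memory cap

# Prints an approximate `docker run` equivalent of the setup.
# Host paths containing spaces would need extra quoting before execution.
rteb_docker_cmd() {
    gpu_flag=""
    # Add --gpus only when GPU support is requested (needs the NVIDIA runtime).
    if [ "${USE_GPU:-0}" = "1" ]; then
        gpu_flag=" --gpus all"
    fi
    echo "docker run --rm$gpu_flag --memory $MEM_LIMIT" \
         "-v $PWD/data:/app/data -v $PWD/output:/app/output $IMAGE"
}

rteb_docker_cmd
```

Setting `USE_GPU=1` before calling the helper adds `--gpus all` to the printed command; the default output stays CPU-only.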
To modify the Docker environment:
- Edit docker-compose.yml to change resource limits or volume mounts
- Edit Dockerfile to modify the base image or installed dependencies
- Edit docker-entrypoint.sh to change default arguments or startup behavior
After making changes, rebuild the Docker image:
sudo docker-compose build