Home Index Transcribe is an RPC module for Home Index. It uses WhisperX to generate transcripts and subtitle files from audio and video. The module implements the RPC interface described in the Home Index project so that transcripts can be indexed and searched.
The docker-compose.yml in this repository launches a small stack containing Home Index, Meilisearch and this transcribe module. After installing Docker, run:
docker compose up
Drop media files into bind-mounts/files/ and Home Index will automatically invoke the module to produce transcripts. Metadata and cache files are stored under bind-mounts/.
Several environment variables control how WhisperX runs:
NAME – module name (transcribe by default)
DEVICE – compute device (cuda or cpu)
WHISPER_MODEL – Whisper model variant (e.g. medium)
BATCH_SIZE – batch size for transcription
COMPUTE_TYPE – computation precision (int8 by default)
LANGUAGE – language code (en by default)
PYANNOTE_DIARIZATION_AUTH_TOKEN – token for speaker diarization
BASE_TIMESTAMP_PATH – dot-delimited path within the document containing the creation timestamp used when prefixing transcript chunks
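As a rough illustration of how these variables might be consumed, the sketch below reads them into a settings object and resolves a dot-delimited path such as the one BASE_TIMESTAMP_PATH holds. It is not the module's actual code: `TranscribeSettings`, `load_settings` and `resolve_dotted_path` are hypothetical names, and the fallbacks for DEVICE, WHISPER_MODEL and BATCH_SIZE are placeholders, since only the defaults for NAME, COMPUTE_TYPE and LANGUAGE are documented above.

```python
import os
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class TranscribeSettings:
    """Hypothetical container for the variables listed above."""
    name: str
    device: str
    whisper_model: str
    batch_size: int
    compute_type: str
    language: str
    diarization_token: Optional[str]
    base_timestamp_path: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> TranscribeSettings:
    """Read the settings from an environment-like mapping."""
    device = env.get("DEVICE", "cpu")  # assumed fallback, not a documented default
    if device not in ("cuda", "cpu"):
        raise ValueError(f"DEVICE must be 'cuda' or 'cpu', got {device!r}")
    return TranscribeSettings(
        name=env.get("NAME", "transcribe"),            # documented default
        device=device,
        whisper_model=env.get("WHISPER_MODEL", "medium"),  # assumed fallback
        batch_size=int(env.get("BATCH_SIZE", "8")),        # assumed fallback
        compute_type=env.get("COMPUTE_TYPE", "int8"),  # documented default
        language=env.get("LANGUAGE", "en"),            # documented default
        diarization_token=env.get("PYANNOTE_DIARIZATION_AUTH_TOKEN") or None,
        base_timestamp_path=env.get("BASE_TIMESTAMP_PATH") or None,
    )


def resolve_dotted_path(document: Mapping[str, Any], path: str) -> Any:
    """Follow a dot-delimited path through nested mappings.

    Returns None when any segment of the path is missing.
    """
    current: Any = document
    for key in path.split("."):
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current
```

Passing the mapping explicitly, for example `load_settings({"DEVICE": "cuda"})`, keeps the loader easy to exercise without touching the real environment.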
If you want to start the RPC server without Docker:
pip install -r requirements.txt
python packages/home_index_transcribe/main.py
The server listens on 0.0.0.0:9000. Add its endpoint to the MODULES environment variable of Home Index so it is called during indexing.
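Before adding the endpoint to MODULES, it can help to confirm that something is actually accepting connections on port 9000. The following is a minimal sketch using only the Python standard library; `is_port_open` is a hypothetical helper, and it checks only TCP reachability, not that the RPC interface itself responds correctly.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

With the server running locally, `is_port_open("127.0.0.1", 9000)` should return True. The server binds to 0.0.0.0, so a client connects through a concrete address such as 127.0.0.1 or the host's name.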
See the Home Index documentation for details on the RPC module specification and additional configuration options.