nashspence/home-index-read

Home Index Read

This repository provides an OCR module for Home Index. It exposes an XML-RPC service that uses EasyOCR to read text from images and PDF files, then writes the extracted text back to the document so that Home Index can index it.
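To make the OCR step concrete, here is a minimal sketch of reading text from a single image with EasyOCR. It is not taken from packages/home_index_read/main.py: the helper names `make_reader` and `extract_text` are hypothetical, and mapping BATCH_SIZE and WORKERS onto the `batch_size` and `workers` arguments of `readtext` is an assumption. It covers images only; how the module handles PDF files is not shown here.

```python
from typing import Any, Sequence


def make_reader(
    languages: Sequence[str] = ("en",),
    model_dir: str = "/easyocr",
    gpu: bool = False,
) -> Any:
    """Build an EasyOCR reader (hypothetical helper, not the module's code)."""
    # Imported lazily so extract_text() can be exercised without EasyOCR installed.
    import easyocr

    return easyocr.Reader(
        list(languages),
        gpu=gpu,
        model_storage_directory=model_dir,
    )


def extract_text(
    reader: Any,
    image_path: str,
    batch_size: int = 6,
    workers: int = 1,
) -> str:
    """Run OCR on one image and return the recognised lines joined by newlines."""
    # detail=0 asks EasyOCR for plain strings instead of
    # (bounding box, text, confidence) tuples.
    lines = reader.readtext(
        image_path,
        detail=0,
        batch_size=batch_size,
        workers=workers,
    )
    return "\n".join(lines)
```

With EasyOCR installed, `extract_text(make_reader(), "scan.png")` would return the recognised text of a hypothetical scan.png as a single string.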

Quick start

docker compose up

The provided docker-compose.yml launches Meilisearch, Home Index and this module. Once running, Home Index connects to the module at http://home-index-read:9000.
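For readers unfamiliar with XML-RPC, the sketch below shows the shape of a call using only Python's standard library. It does not reproduce the module's real interface, which is defined by the Home Index module specification: the `ping` method and the local stand-in server are hypothetical and exist only so the example is self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def ping() -> str:
    # Hypothetical method; NOT part of the real module interface.
    return "pong"


def start_stand_in_server() -> SimpleXMLRPCServer:
    """Start a throwaway XML-RPC server on a free local port."""
    # Port 0 lets the operating system pick an unused port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(ping)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_ping(url: str) -> str:
    """Call the hypothetical ping method on the XML-RPC endpoint at url."""
    with ServerProxy(url) as proxy:
        return proxy.ping()


server = start_stand_in_server()
host, port = server.server_address
result = call_ping(f"http://{host}:{port}")
server.shutdown()
server.server_close()
print(result)
```

Inside the compose network, a client would point `ServerProxy` at http://home-index-read:9000 instead of the stand-in address and call the methods named in the specification.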

Environment variables

The module's behaviour can be adjusted with the following environment variables (defaults in brackets):

  • NAME – module name [read]
  • LANGUAGES – comma-separated list of languages [en]
  • MODEL_STORAGE_DIRECTORY – where EasyOCR stores models [/easyocr]
  • WORKERS – number of EasyOCR workers [1]
  • BATCH_SIZE – batch size for OCR [6]
  • GPU – use the GPU if available [torch.cuda.is_available()]
  • PYTORCH_CUDA_ALLOC_CONF – PyTorch CUDA allocation settings [expandable_segments:True]
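The variables above could be read along the following lines. This is an illustrative sketch, not the module's actual code: `load_settings` and `_as_bool` are hypothetical names, and the set of strings accepted as true for GPU is an assumption. PYTORCH_CUDA_ALLOC_CONF is left out because PyTorch reads it directly from the environment.

```python
import os
from typing import Callable, Mapping


def _as_bool(value: str) -> bool:
    # Assumed truthy spellings; the real module may accept different ones.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(
    env: Mapping[str, str] = os.environ,
    gpu_available: Callable[[], bool] = lambda: False,
) -> dict:
    """Collect the documented variables, applying the documented defaults.

    gpu_available stands in for torch.cuda.is_available so that this
    sketch does not require PyTorch to be installed.
    """
    gpu_raw = env.get("GPU")
    return {
        "name": env.get("NAME", "read"),
        # Split the comma-separated list and drop empty entries.
        "languages": [
            code.strip()
            for code in env.get("LANGUAGES", "en").split(",")
            if code.strip()
        ],
        "model_storage_directory": env.get("MODEL_STORAGE_DIRECTORY", "/easyocr"),
        "workers": int(env.get("WORKERS", "1")),
        "batch_size": int(env.get("BATCH_SIZE", "6")),
        # Fall back to hardware detection only when GPU is unset.
        "gpu": _as_bool(gpu_raw) if gpu_raw is not None else gpu_available(),
    }
```

Passing the environment as a mapping keeps the function easy to try out: `load_settings({"LANGUAGES": "en,de"})` yields the defaults with two languages, without touching the real process environment.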

Files

  • packages/home_index_read/main.py – module implementation
  • Dockerfile – build the Docker image
  • docker-compose.yml – example compose file for local testing

See the module specification in the Home Index repository for details on the RPC interface.
