nashspence/home-index-scrape

Home Index Scrape

Home Index Scrape is a Home Index RPC module that extracts metadata from files. It collects information with tools such as ExifTool, FFmpeg, MediaInfo, and Apache Tika, and returns structured metadata for indexing in Meilisearch.

Quick start

The repository provides a docker-compose.yml that starts Home Index, Meilisearch, Apache Tika, and this scrape module. After installing Docker, run:

docker compose up

Sample files under bind-mounts/files/ are scanned, and their metadata is stored next to them in a metadata folder.

Running manually

Install the dependencies and launch the module directly:

pip install -r requirements.txt
python packages/home_index_scrape/main.py

By default the server listens on port 9000. Set NAME to change the module name.
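
To confirm that the module is up after launching it, a quick reachability check can help. The sketch below only tests whether something accepts TCP connections on the port; it does not speak the Home Index RPC protocol. The host `localhost` and the helper name `is_listening` are assumptions made for illustration.

```python
import socket


def is_listening(host: str = "localhost", port: int = 9000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper: port 9000 is the documented default, the host is assumed.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("scrape module reachable:", is_listening())
```

A successful connection shows only that the port is open, not that the module has registered with Home Index.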

Configuration

Common environment variables used by the module:

  • NAME – module name (default scrape)
  • DEBUG – set to True for verbose logging
  • WAIT_FOR_DEBUGPY_CLIENT – wait for a debugger to attach before starting
  • TIKA_SERVER_ENDPOINT – URL of the Tika server, e.g. http://tika:9998
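
The variables above could be read along the following lines. This is a minimal sketch, not the module's actual code: `ScrapeConfig`, `load_config`, and `_as_bool` are hypothetical names, and every default except `NAME` (documented as scrape) is an assumption, including the use of the example Tika URL as a fallback.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    # Assumption: "True" (any casing), "1", and "yes" count as enabled.
    return value.strip().lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class ScrapeConfig:
    name: str
    debug: bool
    wait_for_debugpy_client: bool
    tika_server_endpoint: str


def load_config(env: Mapping[str, str] = os.environ) -> ScrapeConfig:
    """Build a config object from environment variables (hypothetical helper)."""
    return ScrapeConfig(
        name=env.get("NAME", "scrape"),  # documented default
        debug=_as_bool(env.get("DEBUG", "False")),  # assumed default
        wait_for_debugpy_client=_as_bool(
            env.get("WAIT_FOR_DEBUGPY_CLIENT", "False")  # assumed default
        ),
        # Assumed fallback: the example URL from the list above
        tika_server_endpoint=env.get("TIKA_SERVER_ENDPOINT", "http://tika:9998"),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the environment in as a mapping keeps the loader easy to exercise with a plain dict, without touching the real process environment.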

For a full description of the RPC interface see the Home Index documentation.
