Home Index Scrape

Home Index Scrape is a Home Index RPC module that extracts metadata from files. It collects information with tools such as ExifTool, FFmpeg, MediaInfo, and Apache Tika, and returns structured metadata for indexing in Meilisearch.

Quick start

The repository provides a docker-compose.yml that starts Home Index, Meilisearch, Apache Tika, and this scrape module. After installing Docker, run:

docker compose up

Sample files under bind-mounts/files/ are scanned, and their metadata is stored next to them in a metadata folder.

Running manually

Install the dependencies and launch the module directly:

pip install -r requirements.txt
python packages/home_index_scrape/main.py

By default the server listens on port 9000. Set NAME to change the module name.
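
To confirm that the server has come up after launching it, a small readiness check can be scripted. The following is a minimal sketch, assuming the module accepts plain TCP connections on the default port 9000; the helper name wait_for_port is hypothetical and not part of Home Index Scrape.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not reachable yet; pause briefly before retrying.
            time.sleep(0.2)
    return False
```

For a local run, wait_for_port("127.0.0.1", 9000) would return True once the module is accepting connections. The check only shows that something is listening on the port; it does not exercise the RPC interface.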

Configuration

Common environment variables used by the module:

NAME – module name (default scrape)
DEBUG – set to True for verbose logging
WAIT_FOR_DEBUGPY_CLIENT – wait for a debugger to attach before starting
TIKA_SERVER_ENDPOINT – URL of the Tika server, e.g. http://tika:9998
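
As an illustration of how these variables could be consumed, the sketch below reads them into a small settings object. It is not the module's actual code: ScrapeConfig, load_config, and the rule that only the literal string True enables a flag are assumptions made for the example. Only the variable names and the default module name scrape come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ScrapeConfig:
    """Hypothetical container for the environment variables listed above."""
    name: str
    debug: bool
    wait_for_debugpy_client: bool
    tika_server_endpoint: Optional[str]


def _is_true(value: Optional[str]) -> bool:
    # Assumption: a flag is on only when set to "True" (case-insensitive).
    return value is not None and value.strip().lower() == "true"


def load_config(env: Mapping[str, str] = os.environ) -> ScrapeConfig:
    """Build a ScrapeConfig from a mapping, defaulting to the process environment."""
    return ScrapeConfig(
        name=env.get("NAME", "scrape"),  # documented default: scrape
        debug=_is_true(env.get("DEBUG")),
        wait_for_debugpy_client=_is_true(env.get("WAIT_FOR_DEBUGPY_CLIENT")),
        tika_server_endpoint=env.get("TIKA_SERVER_ENDPOINT"),  # None if unset
    )
```

Passing a plain dictionary, for example load_config({"TIKA_SERVER_ENDPOINT": "http://tika:9998"}), makes the settings easy to test without touching the real environment.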

For a full description of the RPC interface see the Home Index documentation.
