Epstein Archive

Large-scale document archive system with full-text search capabilities.

Overview

A comprehensive document archive indexing 100,000+ documents with:

Full-text search with OCR fallback for scanned documents
Metadata extraction and entity recognition
Timeline visualization and relationship mapping
PDF generation and document export
RESTful API for programmatic access

Live Instance

URL: https://hmm.raya.li

Features

Full-Text Search: Elasticsearch-powered search across all documents
OCR Support: Tesseract OCR for scanned/image-based documents
Entity Recognition: Automatic extraction of names, dates, locations
Timeline View: Chronological visualization of document events
Export: PDF generation and bulk document download
API: RESTful endpoints for research integration

Tech Stack

Backend: Python, Flask
Search: Elasticsearch
Database: PostgreSQL
OCR: Tesseract
Frontend: Vue.js
Hosting: Self-hosted on OVH infrastructure

API Usage

# Search documents
curl "https://hmm.raya.li/api/search?q=query&limit=20"

# Get document by ID
curl "https://hmm.raya.li/api/documents/{id}"

# Export as PDF
curl "https://hmm.raya.li/api/documents/{id}/pdf"

Data Source

Documents are sourced from public court records, FOIA releases, and other public domain sources.

Disclaimer

This archive is for research and educational purposes only. All documents are publicly available through official channels.

License

MIT License - See LICENSE file

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Epstein Archive

Overview

Live Instance

Features

Tech Stack

API Usage

Data Source

Disclaimer

License

About

Uh oh!

Releases

Packages

License

raya-ac/epstein-archive

Folders and files

Latest commit

History

Repository files navigation

Epstein Archive

Overview

Live Instance

Features

Tech Stack

API Usage

Data Source

Disclaimer

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages