OrbitingBucket/another-ocr

Another OCR - Full Stack Application

Description

Another OCR is a full-stack web application built for performing Optical Character Recognition (OCR) on uploaded documents (PDFs, images). It features user authentication, credit-based usage tracking, multiple OCR engine options (Mistral, Pixtral, Qwen via OpenRouter), and administrative capabilities.

This version is designed for deployment on a standard VPS using Docker Compose.

Architecture Overview

The application utilizes a monorepo structure managed by npm workspaces and runs as containerized services orchestrated by Docker Compose:

  • client: A React frontend built with Vite, providing the user interface. In the production build, Nginx serves the static files.
  • server: An Express.js backend handling OCR processing requests, interacting with external AI services (Mistral, OpenRouter), and managing user credits via the shared library.
  • auth-service: An Express.js backend dedicated to user authentication (registration, login) and token generation, interacting with the database via the shared library.
  • shared-lib: A common library containing shared logic for database interactions (Prisma + Postgres), JWT handling, password hashing, and potentially shared types.
  • postgres: A PostgreSQL database container for storing user data and credits.
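The token flow between auth-service and server can be illustrated with a minimal HS256 sketch. This is not the code in shared-lib (which most likely relies on a JWT library); the payload shape and the function names issueToken and verifyToken are assumptions made for illustration, and only node:crypto is used so the mechanism stays visible.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape; the real claims used by shared-lib may differ.
interface TokenPayload {
  sub: string; // user id
  exp: number; // expiry, in seconds since the Unix epoch
}

const b64url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

// HMAC-SHA256 over "<header>.<payload>", which is what HS256 signs.
const sign = (data: string, secret: string): string =>
  createHmac("sha256", secret).update(data).digest("base64url");

// What auth-service would do after a successful login.
export function issueToken(payload: TokenPayload, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = b64url(JSON.stringify(payload));
  return `${header}.${body}.${sign(`${header}.${body}`, secret)}`;
}

// What server would do on each protected request.
// Returns the payload if the signature matches and the token is not expired, else null.
export function verifyToken(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000),
): TokenPayload | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;
  const expected = Buffer.from(sign(`${header}.${body}`, secret));
  const actual = Buffer.from(signature);
  // Compare in constant time; lengths must match before timingSafeEqual is called.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }
  const payload = JSON.parse(
    Buffer.from(body, "base64url").toString("utf8"),
  ) as TokenPayload;
  return payload.exp > nowSec ? payload : null;
}
```

Both services must share the same JWT_SECRET for this to work, which is why the secret lives in the common .env file.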

Tech Stack

  • Frontend: React, Vite, TypeScript, TailwindCSS, Axios
  • Backend (Server & Auth): Node.js, Express, TypeScript
  • Database: PostgreSQL
  • ORM: Prisma
  • Containerization: Docker, Docker Compose
  • External APIs: Mistral AI, OpenRouter (for Qwen models)

Prerequisites

  • Docker (Install Docker)
  • Docker Compose (usually included with Docker Desktop)
  • Node.js & npm (for local development commands, if not using dev container exclusively)
  • Git

Environment Setup

  1. Clone the Repository:

    git clone <your-repository-url>
    cd another-ocr
  2. Create Environment File:

    • Copy the example environment file (if the repository provides one) or create a .env file in the project root.
    • Populate it with the necessary secrets and configuration:
    # .env
    
    # --- DATABASE Configuration ---
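    # Credentials for the PostgreSQL database container
    # (comments sit on their own lines because some .env parsers treat inline comments as part of the value)
    # Username (or your chosen user)
    POSTGRES_USER=admin
    # CHANGE THIS to a strong password
    POSTGRES_PASSWORD=your_strong_password
    # Database name (or your chosen DB name)
    POSTGRES_DB=another_ocr_db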
    
    # --- SERVICE Configuration ---
    # Full database connection string. It repeats the credentials above, so keep them in sync,
    # and percent-encode any special characters in the password (e.g. "@" becomes "%40").
    # NOTE: Host is 'postgres' (the docker-compose service name) when running inside docker
    DATABASE_URL="postgresql://admin:your_strong_password@postgres:5432/another_ocr_db?schema=public"
    
    # Secret key for signing JWTs (generate a strong random string). CHANGE THIS!
    JWT_SECRET=your_very_strong_random_jwt_secret_string
    
    # URL of the deployed frontend (for CORS configuration in auth-service)
    # For local dev using docker compose, this is usually the host mapping.
    # Browsers omit the default port 80 from the Origin header, so do not append ":80".
    # Use http://localhost:5173 with the Vite dev server, or adjust for a different host mapping.
    FRONTEND_URL="http://localhost"
    
    # API Keys for external services
    MISTRAL_API_KEY=your_mistral_api_key
    OPENROUTER_API_KEY=your_openrouter_api_key
    
    # --- Variables for Client Build (read by Vite) ---
    # Vite inlines VITE_* variables into the client bundle at build time (or at dev-server start),
    # so they must be URLs the browser can reach: use the HOST accessible ports.
    VITE_API_URL=http://localhost:4000
    VITE_AUTH_API_URL=http://localhost:5000

    Important: Replace placeholder values, especially passwords and secrets, with strong, unique values.

  3. Install Dependencies:

    npm install
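For step 2, the secrets should be random and the connection string must stay a valid URL. The following is a small sketch, not part of the repository: generateSecret and buildDatabaseUrl are hypothetical helper names, and the sample password is invented to show the percent-encoding.

```typescript
import { randomBytes } from "node:crypto";

// 32 random bytes, hex-encoded: a 64-character value suitable for JWT_SECRET.
export function generateSecret(bytes: number = 32): string {
  return randomBytes(bytes).toString("hex");
}

// Hypothetical helper: assembles DATABASE_URL from the POSTGRES_* values and
// percent-encodes each part so characters like "@" or "/" do not break the URL.
export function buildDatabaseUrl(
  user: string,
  password: string,
  db: string,
  host: string = "postgres", // the docker-compose service name
  port: number = 5432,
): string {
  const u = encodeURIComponent(user);
  const p = encodeURIComponent(password);
  const d = encodeURIComponent(db);
  return `postgresql://${u}:${p}@${host}:${port}/${d}?schema=public`;
}

// Print lines that can be pasted into .env (the password here is only a sample).
console.log(`JWT_SECRET=${generateSecret()}`);
console.log(`DATABASE_URL="${buildDatabaseUrl("admin", "p@ss/word", "another_ocr_db")}"`);
```

Generating the connection string from the same values as POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_DB avoids the two copies drifting apart.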

Running Locally (Development Mode)

This uses nodemon for backend hot-reloading and Vite's dev server for the client. Source code is mounted into containers.

  1. Modify docker-compose.yml (Optional but Recommended for Dev):

    • Uncomment the volumes: mounts for client, server, auth-service that map ./ to /workspace.
    • Change the command: for client, server, and auth-service back to use npm run dev.
    • Adjust the client port mapping back to 5173:5173 if you switched to the Nginx production setup.
    • Ensure NODE_ENV is not set to production in the environment sections (or set it explicitly to development).
  2. Build Images (if first time or Dockerfiles changed):

    docker compose build
  3. Run Services:

    docker compose up -d

    (The -d flag runs the containers in the background.)

  4. Apply Database Migrations (if first time):

    • Wait until Postgres accepts connections (usually a few seconds after the first start).
    • Run:
      docker compose run --rm shared-lib npx prisma migrate dev --name initial_migration --schema=/workspace/shared-lib/prisma/schema.prisma
      (Adjust --name if needed. The --schema path assumes the source is mounted at /workspace.)
  5. Seed Admin User (if first time):

    docker compose run --rm shared-lib node /workspace/shared-lib/dist/scripts/seedAdmin.js
  6. Access Services:

    • Client: http://localhost:5173 (or your mapped port)
    • Server API: http://localhost:4000
    • Auth API: http://localhost:5000
  7. View Logs:

    docker compose logs -f <service_name> # e.g., client, server, auth-service
  8. Stop Services:

    docker compose down
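Instead of waiting a fixed number of seconds before the migration in step 4, a script can poll until the database port opens. This is a sketch under assumptions: canConnect and waitFor are hypothetical names, and probing localhost:5432 only works if the compose file publishes the Postgres port to the host. An open TCP port is a good hint, not a guarantee, that Postgres is ready for queries.

```typescript
import { connect } from "node:net";

// Resolves true if a TCP connection to host:port opens within timeoutMs.
export function canConnect(
  host: string,
  port: number,
  timeoutMs: number = 1000,
): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = connect({ host, port });
    const finish = (ok: boolean): void => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

// Calls `probe` up to `attempts` times, pausing delayMs between failed tries.
export async function waitFor(
  probe: () => Promise<boolean>,
  attempts: number = 30,
  delayMs: number = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await probe()) return true;
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example (assumes Postgres is published on localhost:5432):
//   const ready = await waitFor(() => canConnect("localhost", 5432));
//   if (!ready) throw new Error("Postgres did not become reachable");
```

Passing the probe as a function keeps the retry loop independent of how readiness is checked, so the TCP probe can later be swapped for a real query.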

Building for Production

This creates optimized images with only runtime dependencies and compiled code.

  1. Ensure Production Settings in docker-compose.yml:

    • Volume mounts for source code should be REMOVED/commented out.
    • command: should NOT be overridden (use default CMD from Dockerfiles).
    • NODE_ENV=production should be set for backend services.
    • Client port mapping should match the production setup (e.g., 80:80 for Nginx).
  2. Build Images:

    docker compose build

Running in Production (VPS Example)

These are general steps; adapt them to your specific VPS provider and setup.

  1. Install Docker & Docker Compose on VPS.
  2. Clone Repository on VPS.
  3. Create .env file on VPS: Populate it with your production secrets and configuration (database URL pointing to postgres:5432, production FRONTEND_URL, strong secrets/keys). Do not commit the production .env file to Git. Secure this file appropriately on the server.
  4. Build Images on VPS (Recommended):
    docker compose build
    (Alternatively, build locally and push to a container registry like Docker Hub or GitHub Container Registry, then pull on the VPS)
  5. Start Services:
    docker compose up -d
  6. Apply Database Migrations:
    docker compose run --rm server npx prisma migrate deploy --schema=/app/prisma/schema.prisma
    (Use migrate deploy for production: it applies the existing migration files without generating new ones. Adjust the --schema path if needed based on the production WORKDIR.)
  7. Seed Admin User (if fresh database):
    docker compose run --rm server node /app/node_modules/@another-ocr/shared-lib/dist/scripts/seedAdmin.js
    (The script is run through one of the service containers, here server, which has shared-lib installed under node_modules.)
  8. Configure Reverse Proxy (Nginx on VPS):
    • Install Nginx on the VPS.
    • Configure Nginx to listen on ports 80/443.
    • Set up SSL/TLS certificates (e.g., using Let's Encrypt / Certbot).
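    • Proxy requests to the appropriate Docker containers based on hostname/path (e.g., /api/auth to http://localhost:5000, /api to http://localhost:4000, / to the client container). Because the host Nginx itself binds port 80, publish the client container on a different host port (for example 8080:80 instead of 80:80) and proxy / to that port.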
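The routing rule in the last bullet depends on prefix precedence: /api/auth must win over /api, and / catches everything else. The sketch below models that rule in TypeScript for illustration only; it is not Nginx configuration, and buildRoutes, resolveUpstream, and the /api/ocr path in the usage comment are hypothetical. The client upstream is a parameter because it depends on which host port the client container is published on.

```typescript
interface Route {
  prefix: string;
  upstream: string;
}

// Backend upstreams come from the bullet above; the client upstream is supplied by the caller.
export function buildRoutes(clientUpstream: string): Route[] {
  return [
    { prefix: "/api/auth", upstream: "http://localhost:5000" },
    { prefix: "/api", upstream: "http://localhost:4000" },
    { prefix: "/", upstream: clientUpstream },
  ];
}

// Picks the route with the longest matching prefix, so /api/auth beats /api.
// A prefix matches on an exact hit or when followed by "/"; "/" matches everything.
export function resolveUpstream(path: string, routes: Route[]): string | undefined {
  const matches = routes.filter(
    (r) => r.prefix === "/" || path === r.prefix || path.startsWith(r.prefix + "/"),
  );
  matches.sort((a, b) => b.prefix.length - a.prefix.length);
  return matches[0]?.upstream;
}

// Usage (8080 is an assumed host port for the client container):
//   const routes = buildRoutes("http://localhost:8080");
//   resolveUpstream("/api/auth/login", routes); // auth-service
//   resolveUpstream("/api/ocr", routes);        // server
//   resolveUpstream("/index.html", routes);     // client
```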

Key Environment Variables

  • POSTGRES_USER: PostgreSQL username.
  • POSTGRES_PASSWORD: PostgreSQL password.
  • POSTGRES_DB: PostgreSQL database name.
  • DATABASE_URL: Full connection string for Prisma (used by server and auth-service).
  • JWT_SECRET: Secret key for signing authentication tokens.
  • FRONTEND_URL: URL of the deployed frontend (for CORS).
  • MISTRAL_API_KEY: API key for Mistral AI.
  • OPENROUTER_API_KEY: API key for OpenRouter.
  • PORT: Listening port of the server and auth-service containers (usually set in the compose file).
  • VITE_API_URL, VITE_AUTH_API_URL: Backend URLs inlined into the client at build time; ensure they resolve correctly from the browser in production.
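A backend service can fail fast at startup when one of these variables is missing. The following is a minimal sketch, not the repository's code: loadConfig and AppConfig are hypothetical names, and treating every listed variable as required by every service is an assumption. It also reduces FRONTEND_URL to its origin, which is the form browsers send in the Origin header.

```typescript
type Env = Record<string, string | undefined>;

// Assumed to be required by the backend services; trim per service as needed.
const REQUIRED = [
  "DATABASE_URL",
  "JWT_SECRET",
  "FRONTEND_URL",
  "MISTRAL_API_KEY",
  "OPENROUTER_API_KEY",
] as const;

export interface AppConfig {
  databaseUrl: string;
  jwtSecret: string;
  frontendOrigin: string;
  mistralApiKey: string;
  openRouterApiKey: string;
  port: number;
}

// Throws with the full list of missing names so all problems show up at once.
export function loadConfig(env: Env, defaultPort: number): AppConfig {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const port = env.PORT ? Number(env.PORT) : defaultPort;
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    databaseUrl: env.DATABASE_URL!,
    jwtSecret: env.JWT_SECRET!,
    // URL.origin drops default ports, e.g. "http://localhost:80" becomes "http://localhost".
    frontendOrigin: new URL(env.FRONTEND_URL!).origin,
    mistralApiKey: env.MISTRAL_API_KEY!,
    openRouterApiKey: env.OPENROUTER_API_KEY!,
    port,
  };
}

// Usage in a service entry point: const config = loadConfig(process.env, 4000);
```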

API Testing

API tests can be run using a tool like Postman, Insomnia, or the VS Code REST Client extension with the api-tests.http file. Remember to obtain a JWT from the /api/auth/login endpoint first and include it as a Bearer token in the Authorization header for protected endpoints.
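The same login-then-call flow can be scripted. This sketch assumes a Node version with a global fetch; the request body fields (email, password) and the token field in the response are assumptions, so check api-tests.http for the actual shapes. bearerHeaders, login, and getProtected are hypothetical helper names.

```typescript
// Builds the Authorization header expected by protected endpoints.
export function bearerHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}

// Logs in against the auth service and returns the JWT.
// Assumed shapes: { email, password } in the request, { token } in the response.
export async function login(
  authBaseUrl: string,
  email: string,
  password: string,
): Promise<string> {
  const res = await fetch(`${authBaseUrl}/api/auth/login`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password }),
  });
  if (!res.ok) throw new Error(`Login failed with status ${res.status}`);
  const data = (await res.json()) as { token?: string };
  if (!data.token) throw new Error("Login response did not contain a token");
  return data.token;
}

// Calls a protected endpoint with the token attached.
export async function getProtected(url: string, token: string): Promise<unknown> {
  const res = await fetch(url, { headers: bearerHeaders(token) });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}

// Usage (URLs follow the local port mappings in this README):
//   const token = await login("http://localhost:5000", "user@example.com", "secret");
//   const result = await getProtected("http://localhost:4000/<protected-path>", token);
```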
