pentaxis93/oracy

Oracy

AI-powered voice memo transcription for mobile.

Overview

Oracy is a full-stack voice transcription system: users record audio on their phone (or in the web and desktop clients) and receive a transcript generated by OpenAI Whisper. The system consists of:

  • Client App (Flutter) - Android/Web/Desktop app for recording and viewing transcripts
  • Server API (FastAPI/Python) - REST API handling transcription, storage, and search
  • PostgreSQL + pgvector - Hybrid search with full-text and semantic search support
┌─────────────────────────────────────────────────────────────────────┐
│                      Client App (Flutter)                           │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐ │
│  │  Recording  │  │  History    │  │  Offline    │  │   Home     │ │
│  │  + Playback │  │  + Search   │  │  Queue      │  │  Widget    │ │
│  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘ │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTPS (Bearer Auth)
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    Caddy (TLS Termination)                          │
└────────────────────────────┬────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    Oracy API (FastAPI)                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐ │
│  │  /transcribe│  │ /transcripts│  │  /search    │  │  /usage    │ │
│  │  POST audio │  │  GET list   │  │  FTS+Vector │  │  stats     │ │
│  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘ │
└────────────────────────────┬────────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          ▼                  ▼                  ▼
┌─────────────────┐ ┌─────────────────┐ ┌──────────────────┐
│  OpenAI Whisper │ │  PostgreSQL 16  │ │  Audit Logging   │
│  Transcription  │ │  + pgvector     │ │                  │
└─────────────────┘ └─────────────────┘ └──────────────────┘

Features

Client App

  • One-tap recording with visual amplitude feedback
  • Background sync queue for offline resilience
  • Local SQLite storage for transcript history
  • Android home screen widget for quick access
  • Secure API key storage (Keychain/Keystore)

Server API

  • Audio transcription via OpenAI Whisper (whisper-1)
  • Bearer token authentication with rate limiting
  • Full-text search with PostgreSQL tsvector
  • Semantic search with pgvector embeddings
  • Comprehensive audit logging
  • Usage tracking and cost reporting
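
This README does not say how the full-text and semantic result lists are combined. One common way to merge two ranked lists is reciprocal rank fusion (RRF); the sketch below illustrates that technique only and is not taken from the Oracy codebase. The function name and the constant `k=60` are assumptions chosen for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) for every list it appears in, so items
    ranked highly by both searches rise to the top. Ties break by ID.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result IDs from a tsvector query and a pgvector query.
full_text_hits = ["a", "b", "c"]
semantic_hits = ["b", "d", "a"]
merged = reciprocal_rank_fusion([full_text_hits, semantic_hits])
```

RRF needs only ranks, not comparable scores, which is why it is often used when one list comes from text ranking and the other from vector distance.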

Supported Audio Formats

mp3, mp4, m4a, wav, webm, opus (max 25MB)
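
A client can check a file against this list before uploading. The sketch below is a minimal example: `check_audio` is a hypothetical helper, and reading "25MB" as 25 * 1024 * 1024 bytes is an assumption, since the server may count megabytes differently.

```python
from pathlib import Path

# Formats and size limit as listed above. The byte value is an assumption.
SUPPORTED_FORMATS = {"mp3", "mp4", "m4a", "wav", "webm", "opus"}
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def check_audio(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

This checks only the file extension and size; the server remains the authority on whether the audio is accepted.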

Quick Start

Prerequisites

  • Python 3.11+
  • Flutter 3.10+
  • Docker and Docker Compose
  • OpenAI API key

1. Clone the Repository and Install Dependencies

git clone https://github.com/pentaxis93/oracy.git
cd oracy/server
uv sync

2. Start the Server

Option A: Production Mode (Recommended)

One command installs everything, runs migrations, and starts the server:

uv run oracy server install

You'll be prompted for your OpenAI API key. Once installed, the server starts automatically on boot.

Management commands:

uv run oracy server status     # Check server status
uv run oracy server logs -f    # View logs (follow mode)
uv run oracy server stop       # Stop the server
uv run oracy server start      # Start the server
uv run oracy server update     # Pull latest code and rebuild
uv run oracy server uninstall  # Remove server and data

The API will be available at http://localhost:8001

Option B: Development Mode

For local development with hot-reload:

# Start just PostgreSQL
docker compose up -d postgres

# Configure environment
cp .env.example .env
# Edit .env - set OPENAI_API_KEY

# Run migrations
uv run alembic -c database/alembic.ini upgrade head

# Start dev server (hot-reload, auto-kills stale processes)
uv run oracy dev

The API will be available at http://localhost:8000

3. Generate an API Key

uv run oracy admin generate-key -n "My API Key"

Save the displayed key! It's shown only once and cannot be retrieved later.
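
A key that cannot be retrieved later usually means the server stores only a hash of it (the project structure lists `core/security.py` as "Auth and key hashing"). The sketch below shows that general pattern using the standard library. The use of SHA-256, the token length, and the function names are assumptions; only the `oracy_sk_` prefix comes from the examples in this README.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX = "oracy_sk_"  # prefix as it appears in the curl examples


def generate_key() -> tuple[str, str]:
    """Return (plaintext_key, sha256_hex_digest).

    The plaintext is shown to the user once; only the digest would be stored.
    """
    plaintext = KEY_PREFIX + secrets.token_urlsafe(32)
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, digest


def verify_key(presented: str, stored_digest: str) -> bool:
    """Hash the presented key and compare with the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Because only the digest is kept, a lost key cannot be recovered; the practical fix is to generate a new one.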

4. Set Up the Client App

# From the repository root (use ../client/flutter if you are still in server/)
cd client/flutter
flutter pub get

# Run on your platform
flutter run                    # Auto-detect platform
flutter run -d chrome          # Web
flutter run -d linux           # Linux desktop
flutter run -d windows         # Windows desktop
flutter run -d <device_id>     # Specific Android device

On first launch, go to Settings and enter your API key.

Platform-specific prerequisites:

  • Android: Android Studio with SDK
  • Linux: sudo apt install clang cmake ninja-build libgtk-3-dev
  • Windows: Visual Studio 2022 with "Desktop development with C++" workload
  • Web: Chrome browser

Project Structure

oracy/
├── server/                     # Python FastAPI server
│   ├── oracy/                  # Main application package
│   │   ├── api/                # REST API routes
│   │   │   ├── routes/         # Endpoint handlers
│   │   │   ├── schemas.py      # Pydantic models
│   │   │   ├── deps.py         # Dependency injection
│   │   │   └── audit.py        # Audit logging middleware
│   │   ├── core/               # Core utilities
│   │   │   ├── config.py       # Configuration management
│   │   │   ├── security.py     # Auth and key hashing
│   │   │   ├── logging.py      # Structured logging
│   │   │   └── rate_limit.py   # Rate limiting
│   │   ├── db/                 # Database layer
│   │   │   ├── models.py       # SQLAlchemy models
│   │   │   └── session.py      # Session management
│   │   ├── services/           # Business logic
│   │   │   └── whisper.py      # OpenAI Whisper integration
│   │   └── main.py             # FastAPI app entry point
│   ├── database/               # Schema and migrations
│   │   ├── schema.sql          # Raw SQL schema
│   │   ├── alembic/            # Migration scripts
│   │   └── README.md           # Database documentation
│   ├── tests/                  # pytest test suite
│   ├── Dockerfile              # Production container
│   └── pyproject.toml          # Dependencies (uv/pip)
│
├── client/                     # Client implementations
│   └── flutter/                # Flutter app (Android/Web/Linux/Windows)
│       ├── lib/
│       │   ├── main.dart           # App entry point
│       │   ├── app.dart            # Root widget and home page
│       │   ├── screens/            # Screen widgets
│       │   │   ├── history_screen.dart
│       │   │   ├── settings_screen.dart
│       │   │   └── transcript_result_screen.dart
│       │   ├── services/           # Business logic
│       │   │   ├── api_client.dart     # HTTP client + auth
│       │   │   ├── recording_service.dart
│       │   │   ├── transcription_service.dart
│       │   │   ├── history_service.dart
│       │   │   ├── upload_queue_service.dart
│       │   │   └── background_sync_service.dart
│       │   ├── db/                 # Local Drift database
│       │   └── widgets/            # Reusable components
│       ├── android/                # Android platform code
│       ├── linux/                  # Linux desktop platform code
│       ├── windows/                # Windows desktop platform code
│       ├── web/                    # Web platform code
│       └── pubspec.yaml            # Flutter dependencies
│
├── deploy/                     # Deployment configuration
│   ├── Caddyfile               # Reverse proxy config
│   ├── oracy.service           # systemd service file
│   ├── DEPLOYMENT.md           # Full deployment guide
│   └── DNS-SETUP.md            # DNS configuration
│
├── docker-compose.yml          # Development infrastructure
└── README.md                   # This file

API Reference

Authentication

All API endpoints (except /health) require Bearer token authentication:

curl -H "Authorization: Bearer oracy_sk_xxx" https://api.oracy.app/api/v1/transcripts
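
The same request can be built from Python with the standard library. This is a minimal sketch: `build_request` and `get_json` are hypothetical helper names, and the base URL is the default API URL given in this README.

```python
import json
import urllib.request

BASE_URL = "https://api.oracy.app"  # default API URL from this README


def build_request(path: str, api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request carrying the Bearer token header."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def get_json(path: str, api_key: str, base_url: str = BASE_URL):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, api_key, base_url), timeout=30) as resp:
        return json.load(resp)
```

For a self-hosted instance, pass its address as `base_url`, for example `get_json("/api/v1/me", key, "http://localhost:8000")`.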

Endpoints

| Method | Path | Description | Rate Limit |
|--------|------|-------------|------------|
| GET | /health | Health check | - |
| GET | /api/v1/me | Current API key info | 100/min |
| POST | /api/v1/transcribe | Upload audio for transcription | 10/min |
| GET | /api/v1/transcripts | List transcripts (paginated) | 100/min |
| GET | /api/v1/transcripts/{id} | Get single transcript | 100/min |
| GET | /api/v1/usage | Usage statistics | 100/min |
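
A client that uploads in batches can stay under these limits by throttling itself. The sketch below is a client-side sliding-window limiter; it is not the server's implementation (this README does not say which algorithm `rate_limit.py` uses), and the class name and injectable clock are illustrative choices.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds interval."""

    def __init__(self, max_calls: int, window_seconds: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.calls = deque()  # timestamps of calls still inside the window

    def try_acquire(self) -> bool:
        """Record a call and return True if one is allowed now, else return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


# 10 uploads per minute, matching the /api/v1/transcribe row above.
transcribe_limiter = SlidingWindowLimiter(max_calls=10)
```

Call `try_acquire()` before each upload and defer the upload when it returns `False`.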

Interactive Documentation

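When the server is running, FastAPI can serve interactive API documentation. FastAPI's default paths are /docs (Swagger UI) and /redoc (ReDoc); the exact paths depend on how the app is configured.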

Example: Transcribe Audio

curl -X POST https://api.oracy.app/api/v1/transcribe \
  -H "Authorization: Bearer oracy_sk_xxx" \
  -F "file=@recording.m4a"

Response:

{
  "id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "transcript": "Hello, this is a test recording...",
  "audio_duration_seconds": 12.5,
  "audio_format": "m4a",
  "cost_cents": 1,
  "created_at": "2026-01-20T12:00:00Z"
}
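
The upload can also be done from Python without third-party packages. The sketch below builds the multipart body by hand and posts it with `urllib`. The helper names are hypothetical, and sending the part as `application/octet-stream` is an assumption about what the server accepts.

```python
import json
import os
import urllib.request
import uuid


def encode_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, api_key: str, base_url: str = "https://api.oracy.app") -> dict:
    """POST an audio file to /api/v1/transcribe and return the decoded JSON."""
    with open(path, "rb") as fh:
        body, content_type = encode_multipart("file", os.path.basename(path), fh.read())
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/transcribe",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

The returned dictionary carries the fields shown in the response above, so `transcribe("recording.m4a", key)["transcript"]` would give the transcript text.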

Configuration

Server Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| ORACY_DATABASE_URL | PostgreSQL connection string | Required |
| ORACY_OPENAI_API_KEY | OpenAI API key | Required |
| ORACY_LOG_LEVEL | Logging level (DEBUG, INFO, etc.) | INFO |
| ORACY_JSON_LOGS | Use JSON log format | true |
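
The table above can be read as: two required values and two optional ones with defaults. The sketch below loads them with the standard library. It is not the project's `core/config.py`, which may use a different mechanism; `Settings`, `load_settings`, and the accepted boolean spellings are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    openai_api_key: str
    log_level: str = "INFO"
    json_logs: bool = True


def load_settings(env=None) -> Settings:
    """Read ORACY_* variables, failing fast when a required one is missing."""
    env = os.environ if env is None else env
    required = ("ORACY_DATABASE_URL", "ORACY_OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    truthy = {"1", "true", "yes", "on"}  # assumed spellings for a true value
    return Settings(
        database_url=env["ORACY_DATABASE_URL"],
        openai_api_key=env["ORACY_OPENAI_API_KEY"],
        log_level=env.get("ORACY_LOG_LEVEL", "INFO").upper(),
        json_logs=env.get("ORACY_JSON_LOGS", "true").strip().lower() in truthy,
    )
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without touching the real environment.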

Client App Configuration

The client app is configured at runtime:

  1. Open Settings
  2. Enter your API key
  3. (Optional) Change API base URL for self-hosted instances

Default API URL: https://api.oracy.app

Deployment

Documentation

| Component | Guide |
|-----------|-------|
| Server | deploy/DEPLOYMENT.md |
| DNS Setup | deploy/DNS-SETUP.md |
| Flutter Build | client/flutter/BUILD.md |
| API Contract | docs/API_CONTRACT.md |

Server Deployment

  1. Server: Docker Compose with PostgreSQL + pgvector
  2. TLS: Caddy for automatic HTTPS via Let's Encrypt
  3. DNS: A record pointing api.oracy.app to your server
# On production server
git clone https://github.com/pentaxis93/oracy.git /opt/oracy
cd /opt/oracy

# Configure environment
cp server/.env.example .env
# Edit .env with production values

# Start services
docker compose up -d --build

# Run migrations
docker compose exec server alembic upgrade head

Client App Release

cd client/flutter

# Android APK
flutter build apk --release

# Web (static files)
flutter build web --release

# Linux desktop
flutter build linux --release

# Windows desktop
flutter build windows --release

See client/flutter/BUILD.md for full build and release instructions.

Development

Running Tests

cd server
uv run pytest

Code Style

Server (Python):

cd server
uv run ruff check .
uv run ruff format .

Client App (Flutter):

cd client/flutter
dart analyze
dart format .

Database Migrations

cd server

# Create a new migration
uv run alembic -c database/alembic.ini revision --autogenerate -m "Description"

# Apply migrations
uv run alembic -c database/alembic.ini upgrade head

# Rollback one migration
uv run alembic -c database/alembic.ini downgrade -1

Testing

Server Unit Tests

cd server
uv run pytest

E2E Integration Tests

cd server
uv run pytest tests/test_e2e_integration.py -v

Manual E2E Test

# Against local server
./scripts/e2e_test.sh

# Against production
BASE_URL=https://api.oracy.app ORACY_API_KEY=xxx ./scripts/e2e_test.sh

Troubleshooting

Client App: "API Key Required" error

Go to Settings and enter a valid API key generated from the server.

Server: Database connection failed

Ensure PostgreSQL is running: docker compose ps postgres

Transcription returns 502

Check that the OpenAI API key is valid and that the account has available credit. View server logs: docker compose logs -f server

Rate limit exceeded

Limits are enforced per minute (see the Endpoints table), so wait for the window to reset before retrying. If you hit the limit regularly, consider raising the rate limit on your API key.

Client App: Build errors

flutter clean
flutter pub get

Debugging

Server logs:

docker compose logs -f server

Client app debug mode:

flutter run --debug

License

MIT
