Squirrel

Enterprise-Grade LLM Gateway

Unified API Proxy for OpenAI, Anthropic, and Compatible LLM Providers

English · 中文

Python 3.12+ FastAPI Next.js MIT License

Overview

Squirrel is a high-performance, production-ready proxy service that unifies access to multiple Large Language Model (LLM) providers. Acting as an intelligent gateway between your applications and LLM services, it provides seamless failover, load balancing, comprehensive observability, and a modern management dashboard, now with first-class OpenAI Responses support and smooth protocol conversion across OpenAI Chat, OpenAI Responses, and Anthropic Messages.

Why Squirrel?

  • Single Integration Point: Connect once, access multiple LLM providers through a unified API
  • Zero Code Changes: Drop-in replacement compatible with OpenAI and Anthropic SDKs
  • Cost Optimization: Route requests intelligently across providers based on rules, priority, or cost
  • Production Ready: Built-in retry logic, failover mechanisms, and detailed request logging
  • Full Visibility: Track every request with token usage, latency metrics, and cost analytics

Key Features

Unified API Interface

  • OpenAI Compatible: Full support for /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/audio/*, /v1/images/*
  • OpenAI Responses Compatible: /v1/responses with streaming and tool calls
  • Anthropic Compatible: Native support for /v1/messages endpoint
  • Protocol Conversion: Smoothly convert between OpenAI Chat ↔ OpenAI Responses ↔ Anthropic Messages (requests, responses, and streaming), powered by the built-in llm_api_converter
  • Streaming Support: Full Server-Sent Events (SSE) support for real-time responses
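
Streamed responses follow the standard OpenAI-style SSE wire format: each event is a `data:` line carrying a JSON chunk, terminated by a `data: [DONE]` sentinel. Squirrel handles this server-side; the sketch below only illustrates what a client sees on the wire and is not part of the gateway itself:

```python
import json

def parse_sse_chunks(raw_stream: str):
    """Yield parsed JSON payloads from an OpenAI-style SSE body.

    Illustrative only: real clients read the stream incrementally
    rather than from a complete string.
    """
    for line in raw_stream.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        yield json.loads(payload)

# A hypothetical two-chunk stream as produced by /v1/chat/completions.
sample = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n'
    "data: [DONE]\n\n"
)
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse_chunks(sample))
print(text)  # → Hello
```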

Intelligent Routing

  • Rule-Based Routing: Route requests based on model name, headers, message content, or token count
  • Load Balancing Strategies:
    • Round-Robin: Distribute requests evenly across providers
    • Priority-Based: Use preferred providers first, fallback to others
    • Weight-Based: Distribute by custom weight ratios
    • Cost-Based: Automatically select the lowest-priced model based on API pricing
  • Model Mapping: Map virtual model names to multiple backend providers
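
To illustrate how weight-based balancing behaves (a sketch of the concept, not Squirrel's actual routing code), a weighted pick distributes traffic in proportion to each provider's configured weight:

```python
import random

def pick_provider(weights: dict[str, int], rng: random.Random) -> str:
    """Pick a provider name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical weights: provider-a should receive ~3x the traffic of provider-b.
weights = {"provider-a": 3, "provider-b": 1}
rng = random.Random(0)
picks = [pick_provider(weights, rng) for _ in range(1000)]
print(picks.count("provider-a"))  # roughly 750 of 1000
```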

High Availability

  • Automatic Retries: Configurable retry attempts for server errors (HTTP 500+)
  • Provider Failover: Seamlessly switch to backup providers on failure
  • Timeout Management: Configurable request timeouts with long streaming support (default: 30 minutes)
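
The retry-then-failover behavior described above (retry HTTP 500+ responses up to RETRY_MAX_ATTEMPTS times, wait RETRY_DELAY_MS between attempts, then move to the next provider) can be sketched as follows; function names and the fake upstreams are hypothetical:

```python
import time

RETRY_MAX_ATTEMPTS = 3  # mirrors the RETRY_MAX_ATTEMPTS setting
RETRY_DELAY_MS = 0      # mirrors RETRY_DELAY_MS (0 here to keep the demo fast)

def call_with_failover(providers, send):
    """Try each provider in priority order, retrying 5xx responses."""
    last_status = None
    for provider in providers:
        for _ in range(RETRY_MAX_ATTEMPTS):
            status, body = send(provider)
            if status < 500:  # success or client error: do not retry
                return provider, status, body
            last_status = status
            time.sleep(RETRY_DELAY_MS / 1000)
    raise RuntimeError(f"all providers failed, last status {last_status}")

# Simulated upstreams: "primary" always returns 500, "backup" succeeds.
def fake_send(provider):
    return (500, "") if provider == "primary" else (200, "ok")

result = call_with_failover(["primary", "backup"], fake_send)
print(result)  # → ('backup', 200, 'ok')
```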

Comprehensive Observability

  • Full Request/Response Capture: Complete logging of request and response bodies (including streaming) to help debug issues and optimize AI system performance
  • Token Tracking: Automatic token counting using tiktoken
  • Latency Metrics: First-byte delay and total response time
  • Cost Analytics: Aggregated statistics by time, model, provider, and API key
  • Data Sanitization: Automatic redaction of sensitive information in logs
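
Sanitizing logs typically means masking bearer tokens and API keys before a request body is persisted. The patterns below are a hedged illustration of the idea, not the gateway's actual redaction rules:

```python
import re

# Illustrative patterns: OpenAI-style keys and Authorization header values.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{10,}"),
    re.compile(r"(Bearer\s+)[A-Za-z0-9._-]+"),
]

def sanitize(text: str) -> str:
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(
            lambda m: (m.group(1) if m.lastindex else "") + "***REDACTED***",
            text,
        )
    return text

redacted = sanitize('{"Authorization": "Bearer sk-abcdef1234567890"}')
print(redacted)  # → {"Authorization": "Bearer ***REDACTED***"}
```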

Modern Dashboard

Built with Next.js 16 + TypeScript + shadcn/ui:

  • Provider management with connection testing
  • Model mapping configuration with rule editor
  • API key generation and lifecycle management
  • Advanced log viewer with multi-dimensional filtering
  • Cost statistics and usage analytics

Quick Start

Docker Compose (Recommended)

The fastest way to get started with PostgreSQL:

# Clone the repository
git clone https://github.com/mylxsw/llm-gateway.git
cd llm-gateway
# Start services
docker compose -f docker-compose.prod.yml up -d

Access the dashboard at http://localhost:8000 (or the port you set in LLM_GATEWAY_PORT).

Docker (Single Container)

Run with SQLite for simple deployments:

docker run -d \
  -p 8000:8000 \
  -v $(pwd)/data:/data \
  --name llm-gateway \
  ghcr.io/mylxsw/llm-gateway:latest

Manual Installation

Prerequisites

  • Python 3.12+
  • Node.js 18+
  • npm (for frontend)

Backend Setup

cd backend

# Install dependencies (choose one)
uv sync          # Recommended: using uv
pip install -r requirements.txt  # Or using pip

# Initialize database
alembic upgrade head

# Start server
uvicorn app.main:app --host 0.0.0.0 --port 8000

Frontend Setup

cd frontend

# Install dependencies
npm install

# Development
npm run dev

# Production build
npm run build && npm run start

Usage

Basic Configuration

  1. Add a Provider: Navigate to Providers page and add your LLM provider (e.g., OpenAI)

    • Set the base URL (e.g., https://api.openai.com/v1)
    • Add your API key
    • Select the protocol (OpenAI, OpenAI Responses, or Anthropic)
  2. Create Model Mapping: Go to Models page and create a mapping

    • Define a model name (e.g., gpt-4)
    • Associate it with one or more providers
    • Set routing priority/weight
  3. Generate API Key: Create a gateway API key in API Keys page

  4. Connect Your Application:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="lgw-your-gateway-api-key"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

OpenAI Responses example

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="lgw-your-gateway-api-key"
)

response = client.responses.create(
    model="gpt-4.1-mini",
    input="Summarize this in one sentence."
)
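
Anthropic-protocol clients can use the same gateway key against /v1/messages. A minimal sketch using the official anthropic SDK (the model name is illustrative; use a model you have mapped in the dashboard):

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:8000",
    api_key="lgw-your-gateway-api-key",
)

message = client.messages.create(
    model="claude-sonnet-4",  # illustrative model name
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
```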

API Endpoints

Proxy Endpoints (OpenAI Compatible)

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /v1/models | List available models |
| POST | /v1/chat/completions | Chat completions |
| POST | /v1/completions | Text completions |
| POST | /v1/embeddings | Generate embeddings |
| POST | /v1/audio/speech | Text-to-speech |
| POST | /v1/audio/transcriptions | Speech-to-text |
| POST | /v1/audio/translations | Speech-to-text (translation) |
| POST | /v1/images/generations | Image generation |
| POST | /v1/responses | Responses API |

Proxy Endpoints (Anthropic Compatible)

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /v1/messages | Messages API |

Admin Endpoints

| Resource | Endpoints |
|----------|-----------|
| Providers | GET/POST /api/admin/providers, GET/PUT/DELETE /api/admin/providers/{id} |
| Models | GET/POST /api/admin/models, GET/PUT/DELETE /api/admin/models/{model} |
| API Keys | GET/POST /api/admin/api-keys, GET/PUT/DELETE /api/admin/api-keys/{id} |
| Logs | GET /api/admin/logs, GET /api/admin/logs/stats |

See docs/api.md for complete API documentation.


Configuration

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| APP_NAME | LLM Gateway | Application name |
| DEBUG | false | Enable debug mode |
| DATABASE_TYPE | sqlite | Database type: sqlite or postgresql |
| DATABASE_URL | sqlite+aiosqlite:///./llm_gateway.db | Database connection string |
| RETRY_MAX_ATTEMPTS | 3 | Max retry attempts for HTTP 500+ errors |
| RETRY_DELAY_MS | 1000 | Delay between retries (milliseconds) |
| HTTP_TIMEOUT | 1800 | Upstream request timeout (seconds) |
| API_KEY_PREFIX | lgw- | Prefix for generated API keys |
| API_KEY_LENGTH | 32 | Length of generated API keys |
| ADMIN_USERNAME | - | Admin login username (optional) |
| ADMIN_PASSWORD | - | Admin login password (optional) |
| ADMIN_TOKEN_TTL_SECONDS | 86400 | Admin session TTL (24 hours) |
| LOG_RETENTION_DAYS | 7 | Log retention period (days) |
| LOG_CLEANUP_HOUR | 4 | Log cleanup time (UTC hour) |
| LLM_GATEWAY_PORT | 8000 | Host port for Docker Compose |
| KV_STORE_TYPE | database | KV store backend: database or redis |
| REDIS_URL | - | Redis connection URL (when using the Redis KV store) |
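
Putting a few of these together, a minimal .env might look like the following (all variables are from the table above; the values are placeholders to adapt to your deployment):

```shell
# Admin login for the dashboard (optional but recommended)
ADMIN_USERNAME=admin
ADMIN_PASSWORD=change-me        # placeholder: pick a strong password

# Retry and timeout tuning
RETRY_MAX_ATTEMPTS=3
RETRY_DELAY_MS=1000
HTTP_TIMEOUT=1800               # 30 minutes, for long streaming responses

# Log lifecycle
LOG_RETENTION_DAYS=7
LOG_CLEANUP_HOUR=4              # UTC hour
```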

Database Configuration

SQLite (default, simple deployments):

DATABASE_TYPE=sqlite
DATABASE_URL=sqlite+aiosqlite:///./llm_gateway.db

PostgreSQL (recommended for production):

DATABASE_TYPE=postgresql
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/llm_gateway

Supported Providers

Squirrel can proxy requests to any OpenAI- or Anthropic-compatible API:

| Provider | Protocol | Notes |
|----------|----------|-------|
| OpenAI | OpenAI | Full support including GPT-4, GPT-3.5, embeddings, audio, images |
| OpenAI | OpenAI Responses | Responses API via /v1/responses |
| Anthropic | Anthropic | Claude models via Messages API |
| Azure OpenAI | OpenAI | Use Azure endpoint URL |
| Local Models | OpenAI | Ollama, vLLM, LocalAI, etc. |
| Other Providers | OpenAI/Anthropic | Any compatible API endpoint |

Development

Project Structure

llm-gateway/
├── backend/
│   ├── app/
│   │   ├── api/           # API routes (proxy, admin)
│   │   ├── services/      # Business logic
│   │   ├── providers/     # Protocol adapters
│   │   ├── repositories/  # Data access layer
│   │   ├── db/            # Database models
│   │   ├── domain/        # DTOs and domain models
│   │   ├── rules/         # Rule evaluation engine
│   │   └── common/        # Utilities
│   ├── migrations/        # Alembic migrations
│   └── tests/             # Test suite
├── llm_api_converter/     # Protocol conversion SDK (OpenAI/Responses/Anthropic)
├── frontend/
│   └── src/
│       ├── app/           # Next.js App Router pages
│       ├── components/    # React components
│       └── lib/           # Utilities and API client
├── docker-compose.yml
└── Dockerfile

Running Tests

cd backend
pytest

Database Migrations

cd backend

# Create new migration
alembic revision --autogenerate -m "description"

# Apply migrations
alembic upgrade head


License

MIT


Made with care for the LLM community
