
Brighteye


A lightweight, pluggable proxy layer for managing access, policies, and limits across LLM providers

Features • Installation • Quick Start • Configuration • Integrations • Documentation

🚀 Features

  • 🔐 Advanced Access Control: API key-based authentication and fine-grained access policies
  • 📊 Usage Tracking: Monitor token consumption with per-user and per-team quotas
  • ⚡ Rate Limiting: Define rate limits for users, teams, and LLM models
  • 🦦 Lightweight: Minimal footprint (~38MB Docker image)
  • 🧹 Prompt Filtering: Blocklist and regex-based input sanitation
  • 📈 Real-time Metrics: Prometheus-compatible metrics for observability
  • 🔌 Provider Agnostic: Supports OpenAI, Anthropic, and custom LLM APIs
  • 🔄 Zero-Downtime Dynamic Configuration: Hot-reloads YAML configuration at runtime, no restart required

🛠 Installation

Using Docker

# Pull the API container from GitHub Container Registry
docker pull ghcr.io/mehmetymw/brighteye:latest

# Run the API container with configuration file mounted
docker run -d \
  --name brighteye \
  -p 6789:6789 \
  -v $(pwd)/brighteye.yaml:/app/brighteye.yaml \
  ghcr.io/mehmetymw/brighteye:latest

Manual Build (Go)

# Clone the repository
git clone https://github.com/mehmetymw/brighteye
cd brighteye

# Build the backend binary
cd api
go mod download
go build -o brighteye

# Run the binary (expects a brighteye.yaml next to it, mirroring the Docker setup)
./brighteye

▶️ Quick Start

  1. Define your runtime configuration in brighteye.yaml
  2. Launch Brighteye with Docker or binary
  3. Proxy your LLM API calls through Brighteye
curl -X POST "http://localhost:6789/chat/completions?provider=openai" \
  -H "Authorization: Bearer sk-frontend-123" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
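The same request can be assembled programmatically. A minimal Python sketch, reusing the key and port from the curl example above (actually sending it assumes a running Brighteye instance):

```python
import json
import urllib.request

# Build the proxied chat-completions request; the provider is selected
# with the ?provider= query parameter, exactly as in the curl example.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}
req = urllib.request.Request(
    "http://localhost:6789/chat/completions?provider=openai",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer sk-frontend-123",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(req) would dispatch it through the proxy.
```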

⚙️ Configuration: brighteye.yaml

Here’s a complete example brighteye.yaml, followed by a breakdown of each section:

version: "1.0.0"

entryPoints:
  web:
    address: 6789
  metrics:
    address: 6790

defaultPolicy:
  rateLimit:
    rpm: 100
    burst: 10
  tokenQuota:
    daily: 10000
  promptFilter:
    blocklist:
      - "harmful content"
      - "inappropriate request"
    regex:
      - "(?i)password"
      - "(?i)secret"
  maxPromptLength: 4000

providers:
  default: "openai"
  services:
    - name: "openai"
      base_url: "https://api.openai.com/v1"
      models:
        - name: "gpt-3.5-turbo"
        - name: "gpt-4"
        - name: "gpt-4-turbo"
      allowedPaths:
        - "/chat/completions"
        - "/completions"
    - name: "anthropic"
      base_url: "https://api.anthropic.com/v1"
      models:
        - name: "claude-3-haiku"
        - name: "claude-3-sonnet"
        - name: "claude-3-opus"
      allowedPaths:
        - "/messages"

keyGroups:
  - name: "development"
    apiKeys:
      - key: "YOUR-DEV-OPENAI-KEY"
        label: "Development Team"
      - key: "YOUR-DEV-CLAUDE-KEY"
        label: "Testing Environment"
    permissions:
      - provider: "openai"
        models: ["gpt-3.5-turbo", "gpt-4"]
      - provider: "anthropic"
        models: ["claude-3-haiku"]
    rateLimit:
      rpm: 50
      burst: 5
    tokenQuota:
      daily: 5000

  - name: "production"
    apiKeys:
      - key: "YOUR-PROD-OPENAI-KEY"
        label: "Production OPENAI KEY"
      - key: "YOUR-PROD-CLAUDE-KEY"
        label: "Production CLAUDE KEY"
    permissions:
      - provider: "openai"
        models: ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo"]
      - provider: "anthropic"
        models: ["claude-3-haiku", "claude-3-sonnet", "claude-3-opus"]
    rateLimit:
      rpm: 200
      burst: 20
    tokenQuota:
      daily: 20000
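The defaultPolicy's promptFilter and maxPromptLength amount to a pre-flight check on every prompt. A minimal Python sketch of that logic, using the blocklist, regexes, and length cap from the config above (function name and return shape are illustrative, not Brighteye internals):

```python
import re

# Values taken from defaultPolicy in the example brighteye.yaml.
BLOCKLIST = ["harmful content", "inappropriate request"]
REGEX_PATTERNS = [re.compile(p) for p in [r"(?i)password", r"(?i)secret"]]
MAX_PROMPT_LENGTH = 4000

def check_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a prompt under the default policy."""
    if len(prompt) > MAX_PROMPT_LENGTH:
        return False, "prompt exceeds maxPromptLength"
    lowered = prompt.lower()
    for term in BLOCKLIST:
        if term in lowered:
            return False, f"blocklisted term: {term}"
    for pattern in REGEX_PATTERNS:
        if pattern.search(prompt):
            return False, f"regex match: {pattern.pattern}"
    return True, "ok"
```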

Section Breakdown

  • version: Config file version for compatibility checks
  • entryPoints: Ports Brighteye listens on (web: API endpoint port; metrics: Prometheus metrics port)
  • defaultPolicy: Fallback policy when no group override applies (rateLimit: requests per minute and burst; tokenQuota: daily token cap; promptFilter: blocklist and regex patterns; maxPromptLength: max characters per prompt)
  • providers: LLM provider definitions (default: default service name; services: each service’s name, base_url, models, and allowed paths)
  • keyGroups: API key groups with scoped settings (name: group label; apiKeys: keys and labels; permissions: allowed provider/model combos; rateLimit and tokenQuota: per-group overrides)
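The rateLimit pair (rpm plus burst) describes token-bucket-style limiting: capacity refills at rpm requests per minute, and up to burst requests can be spent back to back. A minimal sketch of that semantic (illustrative only, not Brighteye's implementation):

```python
import time

class TokenBucket:
    """Token bucket: refills at rpm/60 tokens per second, holds at most `burst`."""

    def __init__(self, rpm: int, burst: int):
        self.rate = rpm / 60.0
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at the burst size.
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# With the development group's limits (rpm: 50, burst: 5), five back-to-back
# requests pass and the sixth is throttled.
bucket = TokenBucket(rpm=50, burst=5)
results = [bucket.allow() for _ in range(6)]
```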

Mount this file when running the container:

docker run -v $(pwd)/brighteye.yaml:/app/brighteye.yaml ...

🤝 Integrations

NGINX Reverse Proxy

location /llm/ {
  proxy_pass http://localhost:6789/;
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
}

Traefik Integration

http:
  routers:
    brighteye:
      rule: "PathPrefix(`/llm`)"
      service: brighteye

  services:
    brighteye:
      loadBalancer:
        servers:
          - url: "http://localhost:6789"

🔐 Security Features

  • API Key Management: Associate keys with roles, teams, models, and quotas
  • Prompt Filtering: Block unsafe or injection-prone prompts with regex and terms
  • Quotas & Limits: Define per-user and per-model usage caps
  • Rate Limiting: Defend against abuse with custom RPM and burst configs
  • Audit Logs: Track usage patterns and violations
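At its core, API key management maps each key group to the provider/model pairs it may call. A hedged sketch of such a lookup, mirroring the keyGroups permissions from the example config (the dict layout and function name are illustrative):

```python
# Permissions as declared per key group in the example brighteye.yaml.
GROUP_PERMISSIONS = {
    "development": {
        "openai": {"gpt-3.5-turbo", "gpt-4"},
        "anthropic": {"claude-3-haiku"},
    },
    "production": {
        "openai": {"gpt-3.5-turbo", "gpt-4", "gpt-4-turbo"},
        "anthropic": {"claude-3-haiku", "claude-3-sonnet", "claude-3-opus"},
    },
}

def is_allowed(group: str, provider: str, model: str) -> bool:
    """True if the key group may call this provider/model combination."""
    return model in GROUP_PERMISSIONS.get(group, {}).get(provider, set())
```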

📊 Observability

Brighteye exposes comprehensive Prometheus-compatible metrics on the /metrics endpoint (served on the metrics entry point, port 6790 in the example config). It tracks:

  • RequestsTotal (brighteye_requests_total): total requests received, labeled by provider & model
  • BlockedPrompts (brighteye_blocked_prompt_total): count of prompts blocked by filters
  • TokenUsage (brighteye_token_usage_total): tokens consumed per API key, model, and group
  • UpstreamErrors (brighteye_upstream_errors_total): upstream API error counts by provider
  • ResponseTime (brighteye_response_time_seconds): histogram of request processing times
  • RateLimitHits (brighteye_rate_limit_hits_total): count of rate-limit violations per API key
  • QuotaUsage (brighteye_quota_usage): gauge of current quota usage percentage per API key
  • ActiveConnections (brighteye_active_connections): gauge of concurrent connections
  • RequestSize (brighteye_request_size_bytes): histogram of incoming request sizes
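These counters are served in the standard Prometheus text exposition format. A small sketch of parsing one such sample line (the sample value is made up for illustration):

```python
import re

# One line of Prometheus text exposition, as /metrics would emit it.
SAMPLE = 'brighteye_requests_total{provider="openai",model="gpt-4"} 1234'

LINE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')

def parse_sample(line: str) -> tuple[str, dict, float]:
    """Split a metric line into (name, labels, value)."""
    name, raw_labels, value = LINE_RE.match(line).groups()
    labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))
    return name, labels, float(value)
```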

🌟 Contributing

We welcome contributions of all kinds:

  1. Fork the repo
  2. Create a new branch: git checkout -b feature/my-feature
  3. Commit changes: git commit -m 'Implement my feature'
  4. Push to GitHub: git push origin feature/my-feature
  5. Open a Pull Request 🚀

📜 License

Brighteye is released under the MIT License. See LICENSE for full details.
