A lightweight, pluggable proxy layer for managing access, policies, and limits across LLM providers
Features • Installation • Quick Start • Configuration • Integrations • Documentation
- 🔐 Advanced Access Control: API key-based authentication and fine-grained access policies
- 📊 Usage Tracking: Monitor token consumption with per-user and per-team quotas
- ⚡ Rate Limiting: Define rate limits for users, teams, and LLM models
- 🦦 Lightweight: Minimal footprint (~38MB Docker image)
- 🧹 Prompt Filtering: Blocklist and regex-based input sanitation
- 📈 Real-time Metrics: Prometheus-compatible metrics for observability
- 🔌 Provider Agnostic: Supports OpenAI, Anthropic, and custom LLM APIs
- 🛠️ Dynamic Configuration: Hot-reloads YAML configuration without a restart
- 🔄 Zero-Downtime Reloads: Seamless config updates at runtime
```bash
# Pull the image from the GitHub Container Registry
docker pull ghcr.io/mehmetymw/brighteye:latest

# Run the container with the configuration file mounted
docker run -d \
  --name brighteye \
  -p 6789:6789 \
  -v $(pwd)/brighteye.yaml:/app/brighteye.yaml \
  ghcr.io/mehmetymw/brighteye:latest
```

Or build from source:

```bash
# Clone the repository
git clone https://github.com/mehmetymw/brighteye
cd brighteye

# Build the backend binary
cd api
go mod download
go build -o brighteye
```

Getting started takes three steps:

- Define your runtime configuration in brighteye.yaml
- Launch Brighteye with Docker or the binary
- Proxy your LLM API calls through Brighteye
```bash
curl -X POST "http://localhost:6789/chat/completions?provider=openai" \
  -H "Authorization: Bearer sk-frontend-123" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Below is a complete example brighteye.yaml, followed by an explanation of each section:
```yaml
version: "1.0.0"

entryPoints:
  web:
    address: 6789
  metrics:
    address: 6790

defaultPolicy:
  rateLimit:
    rpm: 100
    burst: 10
  tokenQuota:
    daily: 10000
  promptFilter:
    blocklist:
      - "harmful content"
      - "inappropriate request"
    regex:
      - "(?i)password"
      - "(?i)secret"
  maxPromptLength: 4000

providers:
  default: "openai"
  services:
    - name: "openai"
      base_url: "https://api.openai.com/v1"
      models:
        - name: "gpt-3.5-turbo"
        - name: "gpt-4"
        - name: "gpt-4-turbo"
      allowedPaths:
        - "/chat/completions"
        - "/completions"
    - name: "anthropic"
      base_url: "https://api.anthropic.com/v1"
      models:
        - name: "claude-3-haiku"
        - name: "claude-3-sonnet"
        - name: "claude-3-opus"
      allowedPaths:
        - "/messages"

keyGroups:
  - name: "development"
    apiKeys:
      - key: "YOUR-DEV-OPENAI-KEY"
        label: "Development Team"
      - key: "YOUR-DEV-CLAUDE-KEY"
        label: "Testing Environment"
    permissions:
      - provider: "openai"
        models: ["gpt-3.5-turbo", "gpt-4"]
      - provider: "anthropic"
        models: ["claude-3-haiku"]
    rateLimit:
      rpm: 50
      burst: 5
    tokenQuota:
      daily: 5000
  - name: "production"
    apiKeys:
      - key: "YOUR-PROD-OPENAI-KEY"
        label: "Production OPENAI KEY"
      - key: "YOUR-PROD-CLAUDE-KEY"
        label: "Production CLAUDE KEY"
    permissions:
      - provider: "openai"
        models: ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo"]
      - provider: "anthropic"
        models: ["claude-3-haiku", "claude-3-sonnet", "claude-3-opus"]
    rateLimit:
      rpm: 200
      burst: 20
    tokenQuota:
      daily: 20000
```

| Section | Description |
|---|---|
| `version` | Config file version for compatibility checks |
| `entryPoints` | Ports Brighteye listens on: `web` is the API endpoint port, `metrics` the Prometheus metrics port |
| `defaultPolicy` | Fallback policy when no group override applies: `rateLimit` (requests per minute and burst), `tokenQuota` (daily token cap), `promptFilter` (blocklist and regex patterns), `maxPromptLength` (max characters per prompt) |
| `providers` | LLM provider definitions: `default` names the default service; `services` lists each service's name, base_url, models, and allowed paths |
| `keyGroups` | API key groups with scoped settings: `name` (group label, e.g. development or production), `apiKeys` (tokens and labels), `permissions` (allowed provider/model combos), `rateLimit` and `tokenQuota` (per-group overrides) |
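To make the `keyGroups` semantics concrete, here is a minimal sketch of how a group's `permissions` could gate a provider/model pair. This is an illustration of the config's meaning, not Brighteye's actual implementation; the type and method names are hypothetical.

```go
package main

import "fmt"

// Permission mirrors one entry under a key group's `permissions`
// in brighteye.yaml: a provider plus its allowed models.
type Permission struct {
	Provider string
	Models   []string
}

// KeyGroup mirrors one entry under `keyGroups`.
type KeyGroup struct {
	Name        string
	Permissions []Permission
}

// Allows reports whether this group may call the given model on
// the given provider. Anything not explicitly listed is denied.
func (g KeyGroup) Allows(provider, model string) bool {
	for _, p := range g.Permissions {
		if p.Provider != provider {
			continue
		}
		for _, m := range p.Models {
			if m == model {
				return true
			}
		}
	}
	return false
}

func main() {
	// The "development" group from the example configuration.
	dev := KeyGroup{
		Name: "development",
		Permissions: []Permission{
			{Provider: "openai", Models: []string{"gpt-3.5-turbo", "gpt-4"}},
			{Provider: "anthropic", Models: []string{"claude-3-haiku"}},
		},
	}
	fmt.Println(dev.Allows("openai", "gpt-4"))            // true
	fmt.Println(dev.Allows("anthropic", "claude-3-opus")) // false
}
```

Under this reading, the development group can reach gpt-4 but not claude-3-opus, which is reserved for the production group.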
Mount this file when running the container:

```bash
docker run -v $(pwd)/brighteye.yaml:/app/brighteye.yaml ...
```

Behind Nginx, proxy a location to Brighteye:

```nginx
location /llm/ {
    proxy_pass http://localhost:6789/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
}
```

With Traefik, route a path prefix to the Brighteye service:

```yaml
http:
  routers:
    brighteye:
      rule: "PathPrefix(`/llm`)"
      service: brighteye
  services:
    brighteye:
      loadBalancer:
        servers:
          - url: "http://localhost:6789"
```

- API Key Management: Associate keys with roles, teams, models, and quotas
- Prompt Filtering: Block unsafe or injection-prone prompts with regex and terms
- Quotas & Limits: Define per-user and per-model usage caps
- Rate Limiting: Defend against abuse with custom per-minute rate and burst configs
- Audit Logs: Track usage patterns and violations
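The `rpm`/`burst` pair in the configuration follows standard token-bucket semantics: `burst` is the bucket capacity and `rpm` the refill rate. A minimal sketch of that behavior (illustrative only, not Brighteye's actual limiter):

```go
package main

import (
	"fmt"
	"time"
)

// Bucket is a token bucket: capacity corresponds to `burst`,
// refill to `rpm` converted to tokens per second.
type Bucket struct {
	capacity float64
	tokens   float64
	refill   float64
	last     time.Time
}

func NewBucket(rpm, burst int) *Bucket {
	return &Bucket{
		capacity: float64(burst),
		tokens:   float64(burst),
		refill:   float64(rpm) / 60.0,
		last:     time.Now(),
	}
}

// Allow refills the bucket based on elapsed time, then spends one
// token if available; a request with no token is throttled.
func (b *Bucket) Allow() bool {
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.refill
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := NewBucket(100, 10) // defaultPolicy: rpm 100, burst 10
	allowed := 0
	for i := 0; i < 20; i++ { // 20 back-to-back requests
		if b.Allow() {
			allowed++
		}
	}
	// The burst of 10 is absorbed; the rest are throttled until refill.
	fmt.Println(allowed)
}
```

With the default policy, a quiet client can fire 10 requests instantly, then sustain 100 requests per minute.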
Brighteye exposes comprehensive metrics via Prometheus on the /metrics endpoint. Under the hood, it tracks:
- RequestsTotal (`brighteye_requests_total`): total requests received, labeled by provider and model
- BlockedPrompts (`brighteye_blocked_prompt_total`): count of prompts blocked by filters
- TokenUsage (`brighteye_token_usage_total`): tokens consumed per API key, model, and group
- UpstreamErrors (`brighteye_upstream_errors_total`): upstream API error counts by provider
- ResponseTime (`brighteye_response_time_seconds`): histogram of request processing times
- RateLimitHits (`brighteye_rate_limit_hits_total`): count of rate-limit violations per API key
- QuotaUsage (`brighteye_quota_usage`): gauge of current quota usage percentage per API key
- ActiveConnections (`brighteye_active_connections`): gauge of concurrent connections
- RequestSize (`brighteye_request_size_bytes`): histogram of incoming request sizes
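To collect these metrics, point a Prometheus scrape job at the `metrics` entry point (port 6790 in the example configuration). A minimal fragment for prometheus.yml, assuming Brighteye runs on the same host:

```yaml
scrape_configs:
  - job_name: "brighteye"
    static_configs:
      - targets: ["localhost:6790"]
```

Prometheus scrapes `/metrics` by default, which matches the endpoint Brighteye exposes.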
We welcome contributions of all kinds:
- Fork the repo
- Create a new branch: `git checkout -b feature/my-feature`
- Commit your changes: `git commit -m 'Implement my feature'`
- Push to GitHub: `git push origin feature/my-feature`
- Open a Pull Request 🚀
Brighteye is released under the MIT License. See LICENSE for full details.
