Join our Discord for discussions, feature requests, and community support.
modelrelay is an OpenAI-compatible local router that benchmarks free coding models across top providers and automatically forwards your requests to the best available model.
- 💸 Completely Free: Stop paying for API usage. We seamlessly provide access to robust free models.
- 🧠 State-of-the-Art (SOTA) Models: Out-of-the-box availability for top-tier models including Kimi K2.5, Minimax M2.5, GLM 5, Deepseek V3.2, and more.
- 🏢 Reliable Providers: We route requests securely through trusted, high-performance platforms like NVIDIA, Groq, OpenRouter, and Google.
- ⚡ Lightning Fast: The built-in benchmark continually evaluates metrics to pick the fastest and most capable LLM for your request.
- 🔌 OpenAI-Compatible: A perfect drop-in replacement that works seamlessly with your existing tools, scripts, and workflows.
```bash
npm install -g modelrelay

# Start it
modelrelay
```

Once started, modelrelay is accessible at http://localhost:7352/.
Router endpoint:

- Base URL: `http://127.0.0.1:7352/v1`
- API key: any string
- Model: `auto-fastest` (router picks the actual backend)
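Anything that speaks the OpenAI API can use these settings directly. A minimal stdlib-only Python sketch (the helper names are illustrative; only the base URL, model ID, and "any string" API key come from the settings above):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7352/v1"  # local router endpoint

def build_payload(prompt: str, model: str = "auto-fastest") -> dict:
    """OpenAI-style chat completion request body."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str, api_key: str = "dummy-key") -> str:
    """POST a chat completion to the router; the API key can be any string."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# With the router running locally:
#   print(chat("Write a one-line hello world."))
```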
To run with Docker instead, you need:

- Docker Engine
- Docker Compose (the `docker compose` command)
```bash
mkdir modelrelay
cd modelrelay
curl -fsSL -o Dockerfile https://raw.githubusercontent.com/ellipticmarketing/modelrelay/master/Dockerfile
curl -fsSL -o docker-compose.yml https://raw.githubusercontent.com/ellipticmarketing/modelrelay/master/docker-compose.yml
docker compose up -d --build
```

Once running, modelrelay is accessible at http://localhost:7352/.
Use `modelrelay onboard` to save provider keys and auto-configure integrations for OpenClaw or OpenCode.

```bash
modelrelay onboard
```

If you prefer manual setup, use the examples below.
`modelrelay onboard` can auto-configure OpenCode.
If you want manual setup, put this in `~/.config/opencode/opencode.json`:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "router": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "modelrelay",
      "options": {
        "baseURL": "http://127.0.0.1:7352/v1",
        "apiKey": "dummy-key"
      },
      "models": {
        "auto-fastest": {
          "name": "Auto Fastest"
        }
      }
    }
  },
  "model": "router/auto-fastest"
}
```

`modelrelay onboard` can auto-configure OpenClaw.
If you want manual setup, merge this into `~/.openclaw/openclaw.json`:

```json
{
  "models": {
    "providers": {
      "modelrelay": {
        "baseUrl": "http://127.0.0.1:7352/v1",
        "api": "openai-completions",
        "apiKey": "no-key",
        "models": [
          { "id": "auto-fastest", "name": "Auto Fastest" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "modelrelay/auto-fastest"
      },
      "models": {
        "modelrelay/auto-fastest": {}
      }
    }
  }
}
```

```bash
modelrelay [--port <number>] [--log] [--ban <model1,model2>]
modelrelay onboard [--port <number>]
modelrelay install --autostart
modelrelay start --autostart
modelrelay uninstall --autostart
modelrelay status --autostart
modelrelay update
modelrelay autoupdate [--enable|--disable|--status] [--interval <hours>]
modelrelay autostart [--install|--start|--uninstall|--status]
modelrelay config export
modelrelay config import <token>
```

Request terminal logging is disabled by default. Use `--log` to enable it.
`modelrelay install --autostart` also triggers an immediate start attempt, so you do not need a separate command after install.
During `modelrelay onboard`, you will also be prompted to enable auto-start on login.
`modelrelay update` upgrades the global npm package and, when autostart is configured, stops the background service first and starts it again after the update.
Auto-update is enabled by default. While the router is running, modelrelay checks npm periodically (default: every 24 hours) and applies updates automatically.
Use `modelrelay autoupdate --status` to inspect state, `modelrelay autoupdate --disable` to turn it off, and `modelrelay autoupdate --enable --interval 12` to re-enable with a custom interval.
Use `modelrelay config export` to print a transferable config token (base64url-encoded JSON), and `modelrelay config import <token>` to load it on another machine.
You can also import via stdin:

```bash
modelrelay config export | modelrelay config import
```

`POST /v1/chat/completions` is an OpenAI-compatible chat completions endpoint.
- Use `model: "auto-fastest"` to route to the best model overall
- Use a grouped model ID such as `minimax-m2.5`, `kimi-k2.5`, or `glm4.7` to route within that model group
- For grouped IDs, modelrelay selects the provider with the best current QoS for that group
- Streaming and non-streaming requests are both supported
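Streaming responses from an OpenAI-compatible endpoint arrive as server-sent events (`data: …` lines, terminated by `data: [DONE]`). A hedged stdlib-only sketch of a streaming client, assuming the router follows that standard framing (helper names are illustrative):

```python
import json
import urllib.request

def parse_sse_line(line: str):
    """Return the content delta from one SSE line, or None for non-data lines."""
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):].strip()
    if data == "[DONE]":
        return None
    chunk = json.loads(data)
    return chunk["choices"][0]["delta"].get("content")

def stream_chat(prompt: str, model: str = "auto-fastest",
                base_url: str = "http://127.0.0.1:7352/v1"):
    """Yield content deltas from a streaming chat completion."""
    payload = {"model": model, "stream": True,
               "messages": [{"role": "user", "content": prompt}]}
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer any-string"},
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            delta = parse_sse_line(raw.decode("utf-8"))
            if delta:
                yield delta

# With the router running locally:
#   for piece in stream_chat("Say hello"):
#       print(piece, end="", flush=True)
```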
`GET /v1/models` returns the models exposed by the router.

- Model IDs are grouped slugs such as `minimax-m2.5`, `kimi-k2.5`, and `glm4.7`
- Each grouped ID can represent the same model across multiple providers
- When you select one of these IDs in `/v1/chat/completions`, modelrelay routes the request to the provider with the best current QoS for that model group
- `auto-fastest` is also exposed and routes to the best model overall
Example:

```json
{
  "object": "list",
  "data": [
    { "id": "auto-fastest", "object": "model", "owned_by": "router" },
    { "id": "minimax-m2.5", "object": "model", "owned_by": "relay" },
    { "id": "kimi-k2.5", "object": "model", "owned_by": "relay" },
    { "id": "glm4.7", "object": "model", "owned_by": "relay" }
  ]
}
```

- Router config file: `~/.modelrelay.json`
- API key env overrides: `NVIDIA_API_KEY`, `GROQ_API_KEY`, `CEREBRAS_API_KEY`, `SAMBANOVA_API_KEY`, `OPENROUTER_API_KEY`, `CODESTRAL_API_KEY`, `HYPERBOLIC_API_KEY`, `SCALEWAY_API_KEY`, `QWEN_CODE_API_KEY` (or `DASHSCOPE_API_KEY`), `GOOGLE_API_KEY`
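Given a `/v1/models` response shaped like the example above, a client can discover the grouped model IDs programmatically; a small illustrative sketch:

```python
import json

# Response shaped like the /v1/models example above.
listing = json.loads("""{
  "object": "list",
  "data": [
    {"id": "auto-fastest", "object": "model", "owned_by": "router"},
    {"id": "minimax-m2.5", "object": "model", "owned_by": "relay"},
    {"id": "kimi-k2.5", "object": "model", "owned_by": "relay"},
    {"id": "glm4.7", "object": "model", "owned_by": "relay"}
  ]
}""")

# Grouped IDs are the relay-owned entries; "auto-fastest" is the router's own alias.
grouped = [m["id"] for m in listing["data"] if m["owned_by"] == "relay"]
print(grouped)  # → ['minimax-m2.5', 'kimi-k2.5', 'glm4.7']
```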
For Qwen Code, modelrelay supports both API keys and Qwen OAuth cached credentials (`~/.qwen/oauth_creds.json`).
If OAuth credentials exist, modelrelay will use them and refresh access tokens automatically.
You can also start OAuth directly from the Web UI Providers tab using "Login with Qwen Code".
- In the Web UI, open Settings -> Configuration Transfer to export, copy, or import a token.
- The token includes your full config (including API keys, provider toggles, bans, filter rules, and auto-update settings).
- Treat tokens as secrets. Anyone with the token can import your keys and settings.
- Alternative: copy the config file `~/.modelrelay.json` directly to the same path on the other machine.
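Since the token is base64url-encoded JSON (per `modelrelay config export` above), it can be inspected before importing. A sketch, noting that the fields inside are modelrelay-internal and the `autoUpdate` key below is invented for the demo:

```python
import base64
import json

def decode_config_token(token: str) -> dict:
    """Decode a modelrelay config token (base64url-encoded JSON)."""
    pad = "=" * (-len(token) % 4)  # restore any stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(token + pad))

# Round-trip with a made-up config; real tokens contain API keys -- keep them secret.
demo = {"autoUpdate": {"enabled": True}}
token = base64.urlsafe_b64encode(json.dumps(demo).encode()).decode().rstrip("=")
assert decode_config_token(token) == demo
```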
To trigger a manual npm update and restart the service, run:

```bash
npm i -g modelrelay@latest
modelrelay autostart --start
```

You can point the updater at a local tarball instead of the npm registry:
```bash
npm pack
MODELRELAY_UPDATE_TARBALL=./modelrelay-1.8.3.tgz pnpm start
```

If you want the Web UI to always show an update while testing, set a higher forced version:
```bash
MODELRELAY_FORCE_UPDATE_VERSION=9.9.9
```

If the tarball filename does not contain a semantic version, also set:
```bash
MODELRELAY_UPDATE_VERSION=1.8.3
```

When `MODELRELAY_UPDATE_TARBALL` is set, the Web UI update flow and `modelrelay update` install from that tarball and bypass the normal Git-checkout update block. This is for local testing only. `MODELRELAY_FORCE_UPDATE_VERSION` only affects version detection; the actual install still comes from the tarball path.
⭐️ If you find modelrelay useful, please consider starring the repo!
