OpenRouter MCP Multimodal Server

The all-in-one MCP server for 300+ LLMs — text, vision, audio, and video in a single package.

3,800+ installs across npm + Docker Hub · ~950 npm installs/month and accelerating

Install · Tools · Quick Start · Config · Examples · Architecture · Changelog

Access 300+ LLMs through OpenRouter via the Model Context Protocol. Analyze images, audio, and video. Generate images, audio, and video. Chat with any model. Every tool returns structured _meta.code errors so MCP clients can switch on failure modes without parsing strings.

One-Click Install

Kiro
Cursor
VS Code
VS Code Insiders
Claude Desktop	Install Guide — Add to `claude_desktop_config.json`
Windsurf	Install Guide — Add to `~/.codeium/windsurf/mcp_config.json`
Cline	Install Guide — Add via Cline MCP settings
Smithery	`npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude`

After clicking, the target client opens a confirmation prompt. You'll need to paste your OPENROUTER_API_KEY — the deeplink ships a placeholder so no secrets end up in shared links.

Why This One?

Feature	Status
Text chat with 300+ models	✅
Image analysis (vision)	✅ Native with sharp optimization
Audio analysis	✅ Transcription + analysis, base64 auto-encoded
Audio generation	✅ Conversational, speech, and music with format auto-detection
Image generation	✅ Path-sandboxed disk output
Video understanding	✅ v3 — mp4, mpeg, mov, webm from files, URLs, or data URLs
Video generation	✅ v3 — Veo 3.1 / Sora 2 Pro / Seedance / Wan via async API with progress notifications
Auto image resize + compress	✅ Configurable (defaults 800px max, JPEG 80%)
Model search + validation	✅ Filter by vision / audio / video modality
Free model support	✅ Default: free Nemotron VL
Docker support	✅ Multi-arch (amd64 + arm64), ~345 MB Alpine
Retry-After + jitter	✅ Honors `Retry-After` header, avoids thundering herd
IPv4 + IPv6 SSRF blocklist	✅ Covers mapped, compat, multicast, 6to4, Teredo, ORCHID
Structured error taxonomy	✅ Closed `_meta.code` so clients can switch on failure modes
Reasoning-model awareness	✅ Detects `max_tokens` cutoff during CoT, guides the caller
MCP 2025 tool annotations	✅ `readOnlyHint` / `destructiveHint` / `idempotentHint` on every tool
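
The "Retry-After + jitter" row can be illustrated with a minimal sketch. It is not taken from the server's source: the function name `retryDelayMs`, the 500 ms base, the 30 s cap, and the 25% jitter factor are all assumptions chosen for illustration. It honors both forms of the `Retry-After` header (delay in seconds, or an HTTP date) and adds random jitter so that many clients do not retry at the same instant.

```typescript
// Sketch only: names and constants are illustrative assumptions,
// not the server's actual implementation.
function retryDelayMs(
  retryAfterHeader: string | null,
  attempt: number,
  random: () => number = Math.random,
  now: number = Date.now(),
): number {
  const baseMs = 500; // assumed base delay for exponential backoff
  const capMs = 30_000; // assumed upper bound for the backoff term

  let serverMs: number | null = null;
  if (retryAfterHeader !== null) {
    const trimmed = retryAfterHeader.trim();
    if (/^\d+$/.test(trimmed)) {
      // delay-seconds form, e.g. "Retry-After: 5"
      serverMs = Number(trimmed) * 1000;
    } else {
      // HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
      const at = Date.parse(trimmed);
      if (!Number.isNaN(at)) serverMs = Math.max(0, at - now);
    }
  }

  // Use the server's hint when present, otherwise capped exponential backoff.
  const floorMs = serverMs ?? Math.min(capMs, baseMs * 2 ** attempt);
  // Add up to 25% random jitter on top so clients do not retry in lockstep.
  const jitterMs = Math.floor(random() * floorMs * 0.25);
  return floorMs + jitterMs;
}
```

Passing `random` and `now` as parameters keeps the function deterministic under test; a caller would normally omit both.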

Tools

Tool	Description
`chat_completion`	Send messages to any OpenRouter model. Detects reasoning-model cutoffs.
`analyze_image`	Analyze images from local files, URLs, or data URIs. Auto-optimized with sharp.
`analyze_audio`	Analyze/transcribe audio (WAV, MP3, FLAC, OGG, etc.) from files, URLs, or data URIs.
`analyze_video`	Analyze/transcribe video (mp4, mpeg, mov, webm) from files, URLs, or data URIs.
`generate_image`	Generate images from text prompts. Optional path-sandboxed disk save.
`generate_audio`	Generate audio from text. Auto-detects format, wraps raw PCM in WAV.
`generate_video`	Generate video via OpenRouter's async API (Veo 3.1 / Sora 2 Pro / Seedance / Wan). Submits, polls, downloads, saves.
`get_video_status`	Resume polling a `generate_video` job by id. Downloads and saves the result when complete.
`search_models`	Search/filter models by name, provider, or capabilities (vision / audio / video).
`get_model_info`	Get pricing, context length, and capabilities for any model.
`validate_model`	Check if a model ID exists on OpenRouter.
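
The `generate_audio` row mentions wrapping raw PCM in WAV. A minimal sketch of that idea follows. The function name `wrapPcmInWav` and the defaults (24 kHz, mono, 16-bit) are assumptions for illustration, not values confirmed by the project. The 44-byte header layout itself is the standard RIFF/WAVE layout for linear PCM.

```typescript
// Sketch only: prepend a 44-byte RIFF/WAVE header to raw PCM bytes.
// The default sample rate, channel count, and bit depth are assumptions.
function wrapPcmInWav(
  pcm: Uint8Array,
  sampleRate = 24_000,
  channels = 1,
  bitsPerSample = 16,
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign; // bytes per second
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const ascii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  ascii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  ascii(8, "WAVE");
  ascii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  ascii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data that follows
  out.set(pcm, 44);
  return out;
}
```

All multi-byte fields are little-endian, which is why every `setUint16`/`setUint32` call passes `true`.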

All error responses carry _meta.code from a closed taxonomy: INVALID_INPUT · UNSAFE_PATH · UPSTREAM_HTTP · UPSTREAM_TIMEOUT · UPSTREAM_REFUSED · UNSUPPORTED_FORMAT · RESOURCE_TOO_LARGE · ZDR_INCOMPATIBLE · MODEL_NOT_FOUND · JOB_FAILED · JOB_STILL_RUNNING · INTERNAL
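
Because the taxonomy is closed, a client can branch on it with a plain `switch`. The sketch below shows one way to do that. The `ToolResultLike` shape, the `NextStep` labels, and the choice of which codes count as retryable are assumptions made for illustration; only the code names come from the list above.

```typescript
// The twelve codes listed in the taxonomy above.
type ErrorCode =
  | "INVALID_INPUT" | "UNSAFE_PATH" | "UPSTREAM_HTTP" | "UPSTREAM_TIMEOUT"
  | "UPSTREAM_REFUSED" | "UNSUPPORTED_FORMAT" | "RESOURCE_TOO_LARGE"
  | "ZDR_INCOMPATIBLE" | "MODEL_NOT_FOUND" | "JOB_FAILED"
  | "JOB_STILL_RUNNING" | "INTERNAL";

// Assumed minimal shape of a tool result; a real MCP result has more fields.
interface ToolResultLike {
  _meta?: { code?: ErrorCode };
}

// Hypothetical client-side actions, not part of the server.
type NextStep = "ok" | "retry" | "poll_again" | "fix_request" | "give_up";

function nextStep(result: ToolResultLike): NextStep {
  const code = result._meta?.code;
  if (code === undefined) return "ok"; // no error code attached
  switch (code) {
    case "UPSTREAM_TIMEOUT":
    case "UPSTREAM_HTTP":
      return "retry"; // assumed transient
    case "JOB_STILL_RUNNING":
      return "poll_again"; // e.g. call get_video_status later
    case "INVALID_INPUT":
    case "UNSAFE_PATH":
    case "UNSUPPORTED_FORMAT":
    case "RESOURCE_TOO_LARGE":
    case "MODEL_NOT_FOUND":
    case "ZDR_INCOMPATIBLE":
      return "fix_request"; // the caller has to change something
    case "UPSTREAM_REFUSED":
    case "JOB_FAILED":
    case "INTERNAL":
      return "give_up";
  }
}
```

With every member of the union handled, the TypeScript compiler will flag the `switch` if a new code is added to `ErrorCode` and not handled.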

Quick Start

Prerequisites

Get a free API key from openrouter.ai/keys.

Option 1: npx (no install)

{
  "mcpServers": {
    "openrouter": {
      "command": "npx",
      "args": ["-y", "@stabgan/openrouter-mcp-multimodal"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-v1-..."
      }
    }
  }
}

Option 2: Docker

{
  "mcpServers": {
    "openrouter": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "OPENROUTER_API_KEY=sk-or-v1-...",
        "stabgan/openrouter-mcp-multimodal:latest"
      ]
    }
  }
}

Option 3: Global install

npm install -g @stabgan/openrouter-mcp-multimodal

{
  "mcpServers": {
    "openrouter": {
      "command": "openrouter-multimodal",
      "env": { "OPENROUTER_API_KEY": "sk-or-v1-..." }
    }
  }
}

Option 4: Smithery

npx -y @smithery/cli install @stabgan/openrouter-mcp-multimodal --client claude

Configuration

Environment variables

Variable	Required	Default	Description
`OPENROUTER_API_KEY`	Yes	—	Your OpenRouter API key
`OPENROUTER_DEFAULT_MODEL`	No	`nvidia/nemotron-nano-12b-v2-vl:free`	Default model for chat + analyze tools
`DEFAULT_MODEL`	No	—	Alias for `OPENROUTER_DEFAULT_MODEL`
`OPENROUTER_MODEL_CACHE_TTL_MS`	No	`3600000`	Model cache TTL (ms)
`OPENROUTER_IMAGE_MAX_DIMENSION`	No	`800`	Longest edge for resize (px)
`OPENROUTER_IMAGE_JPEG_QUALITY`	No	`80`	JPEG quality (1–100)
`OPENROUTER_IMAGE_FETCH_TIMEOUT_MS`	No	`30000`	Image URL fetch timeout (ms)
`OPENROUTER_IMAGE_MAX_DOWNLOAD_BYTES`	No	`26214400`	Image URL size cap (~25 MB)
`OPENROUTER_IMAGE_MAX_REDIRECTS`	No	`8`	Image URL redirect cap
`OPENROUTER_IMAGE_MAX_DATA_URL_BYTES`	No	`20971520`	Image data URL size cap (~20 MB)
`OPENROUTER_AUDIO_FETCH_TIMEOUT_MS`	No	`30000`	Audio URL fetch timeout (ms)
`OPENROUTER_AUDIO_MAX_DOWNLOAD_BYTES`	No	`26214400`	Audio URL size cap (~25 MB)
`OPENROUTER_AUDIO_MAX_REDIRECTS`	No	`8`	Audio URL redirect cap
`OPENROUTER_AUDIO_MAX_DATA_URL_BYTES`	No	`20971520`	Audio data URL size cap (~20 MB)
`OPENROUTER_DEFAULT_VIDEO_MODEL`	No	`google/gemini-2.5-flash`	Default for `analyze_video`
`OPENROUTER_DEFAULT_VIDEO_GEN_MODEL`	No	`google/veo-3.1`	Default for `generate_video`
`OPENROUTER_VIDEO_FETCH_TIMEOUT_MS`	No	`60000`	Video URL fetch timeout (ms)
`OPENROUTER_VIDEO_MAX_DOWNLOAD_BYTES`	No	`104857600`	Video URL size cap (~100 MB)
`OPENROUTER_VIDEO_MAX_REDIRECTS`	No	`8`	Video URL redirect cap
`OPENROUTER_VIDEO_MAX_DATA_URL_BYTES`	No	`104857600`	Video data URL size cap (~100 MB)
`OPENROUTER_VIDEO_POLL_INTERVAL_MS`	No	`15000`	Async video poll interval (ms)
`OPENROUTER_VIDEO_MAX_WAIT_MS`	No	`600000`	Max wait (ms) before returning a resumable handle
`OPENROUTER_VIDEO_GEN_MAX_BYTES`	No	`268435456`	Generated video download cap (~256 MB)
`OPENROUTER_VIDEO_INLINE_MAX_BYTES`	No	`10485760`	Inline video ceiling (~10 MB)
`OPENROUTER_OUTPUT_DIR`	No	`process.cwd()`	Sandbox root for `save_path`
`OPENROUTER_ALLOW_UNSAFE_PATHS`	No	—	`1` disables the sandbox
`OPENROUTER_LOG_LEVEL`	No	`info`	`error` / `warn` / `info` / `debug`
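
As a rough illustration of how the numeric settings and their defaults fit together, the sketch below reads a few of them from an environment-like map. The helper `intFromEnv` is hypothetical, and falling back to the default on a malformed value is an assumption; the server may instead reject bad values.

```typescript
// Sketch only: read a positive-integer setting, falling back to a default.
// Silently falling back on malformed input is an assumption for illustration.
function intFromEnv(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Example environment: one valid override, one malformed value, one unset.
const exampleEnv: Record<string, string | undefined> = {
  OPENROUTER_IMAGE_MAX_DIMENSION: "1024",
  OPENROUTER_IMAGE_JPEG_QUALITY: "abc",
};

const imageConfig = {
  maxDimension: intFromEnv(exampleEnv, "OPENROUTER_IMAGE_MAX_DIMENSION", 800),
  jpegQuality: intFromEnv(exampleEnv, "OPENROUTER_IMAGE_JPEG_QUALITY", 80),
  fetchTimeoutMs: intFromEnv(exampleEnv, "OPENROUTER_IMAGE_FETCH_TIMEOUT_MS", 30_000),
};
```

In a real process the first argument would be `process.env`; a plain object is used here so the example runs anywhere.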

Security notes

Analyze tools can read local files and fetch HTTP(S) URLs. URL fetches block private/link-local/reserved IPv4 and IPv6 targets (SSRF mitigation) and cap response size.
Generate tools write to disk through a path sandbox: save_path is resolved against OPENROUTER_OUTPUT_DIR and any traversal attempt is rejected. Override with OPENROUTER_ALLOW_UNSAFE_PATHS=1.
IPv6 SSRF blocklist covers loopback, unspecified, IPv4-mapped, IPv4-compatible, link-local, site-local, ULA, multicast, documentation, Teredo, ORCHID, and 6to4 of private IPv4.
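
The path sandbox described above can be sketched as follows. This is a simplified illustration, not the contents of `path-safety.ts`: the function name `resolveInSandbox` is hypothetical, and the sketch does not follow symlinks, which a production check would likely also need to handle.

```typescript
import * as path from "node:path";

// Sketch only: resolve save_path against a sandbox root and reject any
// result that lands outside it. Symlinks are not resolved here.
function resolveInSandbox(root: string, savePath: string): string {
  const absRoot = path.resolve(root);
  const target = path.resolve(absRoot, savePath);
  const rel = path.relative(absRoot, target);
  const escapes =
    rel === ".." ||
    rel.startsWith(".." + path.sep) || // climbs above the root
    path.isAbsolute(rel); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error("UNSAFE_PATH: " + savePath + " resolves outside " + absRoot);
  }
  return target;
}
```

Comparing the relative path, rather than checking whether the target string starts with the root, avoids accepting a sibling such as `/data/out-evil` when the root is `/data/out`.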

Usage Examples

# Chat
Use chat_completion to explain quantum computing in simple terms.

# Vision
Use analyze_image on /path/to/photo.jpg and tell me what you see.

# Audio transcription
Use analyze_audio on /path/to/recording.mp3 to transcribe it.

# Video understanding
Use analyze_video on /path/to/clip.mp4 — what happens at 00:15?

# Generate audio
Use generate_audio with prompt "Explain neural networks" and voice "alloy", save to ./response.wav

# Generate music
Use generate_audio with model "google/lyria-3-clip-preview" and prompt "upbeat jazz piano trio"

# Generate image
Use generate_image with prompt "a cat astronaut on mars" and save to ./cat.png

# Generate video
Use generate_video with model "google/veo-3.1", prompt "a calm river at sunrise",
resolution 720p, duration 4, save to ./river.mp4

# Resume a video job
Use get_video_status with video_id "vid_abc123" and save_path "./river.mp4"
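
The last two examples rely on a submit, poll, and resume flow: `generate_video` polls until the job finishes or a maximum wait elapses, and `get_video_status` picks the job up again later. A minimal sketch of such a loop is shown below. The names `pollUntilDone`, `JobState`, and `PollOutcome` are hypothetical; the default interval and maximum wait mirror the documented `OPENROUTER_VIDEO_POLL_INTERVAL_MS` and `OPENROUTER_VIDEO_MAX_WAIT_MS` defaults.

```typescript
// Sketch only: type and function names are illustrative assumptions.
type JobState = "pending" | "completed" | "failed";

interface PollOutcome {
  state: JobState | "still_running"; // "still_running" means: resume later
  polls: number;
}

async function pollUntilDone(
  getStatus: () => Promise<JobState>,
  intervalMs = 15_000,
  maxWaitMs = 600_000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<PollOutcome> {
  let waitedMs = 0;
  let polls = 0;
  while (true) {
    const state = await getStatus();
    polls++;
    if (state !== "pending") return { state, polls };
    // Stop before the next sleep would exceed the wait budget and hand
    // control back so the caller can resume with the job id.
    if (waitedMs + intervalMs > maxWaitMs) return { state: "still_running", polls };
    await sleep(intervalMs);
    waitedMs += intervalMs;
  }
}
```

Injecting `sleep` lets tests run the loop without real delays; in normal use the default timer-based sleep applies.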

Architecture

src/
├── index.ts                    # Entry, env validation, graceful shutdown
├── tool-handlers.ts            # 11 tools (annotated) + dispatch
├── model-cache.ts              # TTL + in-flight coalescing
├── openrouter-api.ts           # REST client (chat + /videos)
├── errors.ts                   # Closed ErrorCode enum
├── logger.ts                   # JSON-line structured logger
└── tool-handlers/
    ├── fetch-utils.ts          # SSRF, bounded fetch, data-URL parser
    ├── openrouter-errors.ts    # SDK/HTTP → ErrorCode classifier
    ├── completion-utils.ts     # Reasoning-model cutoff detection
    ├── path-safety.ts          # save_path sandbox
    ├── chat-completion.ts      # Text + multimodal chat
    ├── analyze-image.ts        # Vision analysis
    ├── analyze-audio.ts        # Audio transcription
    ├── analyze-video.ts        # Video understanding
    ├── generate-image.ts       # Image generation
    ├── generate-audio.ts       # Audio generation + streaming
    ├── generate-video.ts       # Video generation (async)
    ├── image-utils.ts          # Sharp optimization, MIME sniffing
    ├── audio-utils.ts          # Audio format detection
    ├── video-utils.ts          # Video format detection
    ├── search-models.ts        # Model search
    ├── get-model-info.ts       # Model detail lookup
    └── validate-model.ts       # Model existence check
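
The `model-cache.ts` entry mentions a TTL plus in-flight coalescing. The sketch below illustrates that pattern in general terms; it is not the project's code, and the class name `CoalescingCache` is hypothetical. The default TTL matches the documented `OPENROUTER_MODEL_CACHE_TTL_MS` default. Concurrent callers that arrive while a load is running share the same promise, so the upstream endpoint is hit once.

```typescript
// Sketch only: a single-value cache with a TTL and in-flight coalescing.
class CoalescingCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;
  private inFlight: Promise<T> | null = null;
  private readonly load: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => Promise<T>, ttlMs = 3_600_000, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): Promise<T> {
    // Fresh value: serve from cache.
    if (this.value !== undefined && this.now() < this.expiresAt) {
      return Promise.resolve(this.value);
    }
    // A load is already running: join it instead of starting another.
    if (this.inFlight) return this.inFlight;
    this.inFlight = this.load()
      .then((fresh) => {
        this.value = fresh;
        this.expiresAt = this.now() + this.ttlMs;
        return fresh;
      })
      .finally(() => {
        this.inFlight = null; // allow a new load after success or failure
      });
    return this.inFlight;
  }
}
```

Clearing `inFlight` in `finally` means a failed load is not cached: the next caller triggers a fresh attempt.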

Development

git clone https://github.com/stabgan/openrouter-mcp-multimodal.git
cd openrouter-mcp-multimodal
npm install
cp .env.example .env  # Add your API key
npm run build
npm start

npm test                    # 163 unit tests, <1s
npm run test:integration    # Live API tests
npm run lint
node scripts/live-e2e.mjs  # 16 live E2E scenarios

Upgrading from v2

v3 is additive — no tool schemas or env vars were removed.

Three new tools: analyze_video, generate_video, get_video_status
Structured _meta.code on every error response (text messages preserved)
save_path sandboxed by default — set OPENROUTER_OUTPUT_DIR or OPENROUTER_ALLOW_UNSAFE_PATHS=1
Reasoning-model awareness: content: null + finish_reason: length now returns INVALID_INPUT with a preview instead of an empty string
IPv6 SSRF coverage extended to mapped, compat, multicast, 6to4, Teredo, ORCHID
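
The reasoning-model change above can be pictured with a small sketch. It assumes an OpenAI-style choice object and an optional `reasoning` field on the message from which the preview is taken. Those shapes, the function name `interpretChoice`, and the wording of the hint are assumptions; only the trigger condition (`content: null` with `finish_reason: length`) and the `INVALID_INPUT` code come from the notes above.

```typescript
// Assumed minimal shape of one completion choice.
interface ChoiceLike {
  finish_reason: string | null;
  message: { content: string | null; reasoning?: string | null };
}

type ChoiceResult =
  | { ok: true; text: string }
  | { ok: false; code: "INVALID_INPUT"; message: string; preview: string };

// Sketch only: detect a reply that ran out of tokens before producing content.
function interpretChoice(choice: ChoiceLike, previewChars = 200): ChoiceResult {
  const { content, reasoning } = choice.message;
  if (content === null && choice.finish_reason === "length") {
    return {
      ok: false,
      code: "INVALID_INPUT",
      // Hypothetical guidance text for the caller.
      message: "Model hit max_tokens before producing an answer; raise max_tokens and retry.",
      preview: (reasoning ?? "").slice(0, previewChars),
    };
  }
  return { ok: true, text: content ?? "" };
}
```

Returning a tagged result instead of an empty string lets the caller tell a truncated reasoning run apart from a genuinely empty reply.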

Compatibility

Works with any MCP-compatible client, including Kiro, Claude Desktop, Cursor, Windsurf, and Cline.

License

MIT

Contributing

Issues and PRs welcome. Please open an issue first for major changes.
