Openrouter.ex


A production-ready Elixir SDK for OpenRouter, bringing the best of Pydantic AI and FastAPI patterns to Elixir AI development.

Why OpenRouter?

OpenRouter provides unified access to 200+ AI models through a single API:

  • OpenAI: GPT-4, GPT-3.5, o1, o1-mini, etc.
  • Anthropic: Claude 3.5 Sonnet, Claude 4, etc.
  • Google: Gemini 2.0 Flash, Gemini Pro, etc.
  • Meta: Llama 3.3 70B, Llama 3.1, etc.
  • Mistral: Mistral Large, Mixtral, etc.
  • And many more

One SDK, all models. No need to build separate clients for each provider.

Features

🎯 Backend-First Design

  • Production-ready with proper OTP supervision
  • GenServer-based stateful conversations
  • Connection pooling via Req/Finch
  • Built-in telemetry and observability
  • Exponential backoff retry logic

🤖 Agentic Workflows

  • Tool/Function Calling: LLMs can call Elixir functions
  • Automatic Tool Execution: Agent loops handle tool calls automatically
  • RunContext: Type-safe dependency injection (inspired by Pydantic AI)
  • Conversation Management: Both stateless and stateful APIs
  • Max Iterations Safety: Prevents infinite loops

🔒 Type Safety & Validation

  • Structured Outputs: Extract validated data with Ecto schemas
  • Automatic Retry: Retry with error feedback when validation fails
  • JSON Schema Generation: From your Ecto schemas
  • Compile-time Safety: Full typespec coverage

🎨 Multimodal Support

  • Images: URLs or base64-encoded (JPEG, PNG, GIF, WebP)
  • Video: MP4, WebM support
  • PDFs: Document processing
  • Audio: Audio file support
  • Content Builders: Ergonomic helpers for complex content

🔌 Phoenix Integration Ready

  • LiveView streaming support
  • Phoenix Channels integration
  • Oban background job examples
  • Supervision tree compatible

📊 Production Features

  • Cost Tracking & Budgeting: Track spending, set budgets, estimate costs
  • Token Counting: Estimate tokens and costs before API calls
  • Response Caching: Cache LLM responses and embeddings with TTL support
  • Prompt Templates: Reusable templates with variables and conditionals
  • Comprehensive telemetry events
  • Retry logic with exponential backoff
  • Rate limit handling
  • Error types and recovery
  • 1700+ tests (unit + integration)

Quick Start

Installation

Add to your mix.exs:

def deps do
  [
    {:openrouter, "~> 0.1.0"}
  ]
end

Configure your API key:

# config/config.exs
config :openrouter,
  api_key: System.get_env("OPENROUTER_API_KEY")

# Or in config/runtime.exs (recommended for production)
config :openrouter,
  api_key: System.fetch_env!("OPENROUTER_API_KEY")

Simple Chat

# Basic question
{:ok, response} = Openrouter.chat(
  "What is the capital of France?",
  model: "anthropic/claude-3.5-sonnet"
)

IO.puts(response.content)
# => "The capital of France is Paris."

# With conversation history
messages = [
  %{role: :system, content: "You are a helpful assistant"},
  %{role: :user, content: "Hello!"},
  %{role: :assistant, content: "Hi! How can I help?"},
  %{role: :user, content: "What's the weather?"}
]

{:ok, response} = Openrouter.chat(messages, model: "openai/gpt-4")

Streaming Responses

{:ok, stream} = Openrouter.chat_stream(
  "Write me a story about a robot",
  model: "openai/gpt-4"
)

stream
|> Stream.each(fn chunk -> IO.write(chunk.content) end)
|> Stream.run()

Structured Data Extraction

Extract validated, typed data from unstructured text:

defmodule RecipeSchema do
  use Openrouter.Schema

  embedded_schema do
    field :name, :string
    field :ingredients, {:array, :string}
    field :steps, {:array, :string}
    field :prep_time, :integer
    field :difficulty, :string
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:name, :ingredients, :steps, :prep_time, :difficulty])
    |> validate_required([:name, :ingredients, :steps])
    |> validate_inclusion(:difficulty, ["easy", "medium", "hard"])
  end
end

{:ok, recipe} = Openrouter.extract(
  "Give me a recipe for chocolate chip cookies",
  schema: RecipeSchema,
  model: "openai/gpt-4"
)

# recipe is a validated RecipeSchema struct!
IO.inspect(recipe.name)
IO.inspect(recipe.ingredients)

The library automatically:

  • ✅ Generates JSON schema from your Ecto schema
  • ✅ Uses OpenRouter's native response_format for structured outputs
  • ✅ Validates the LLM response against your schema
  • ✅ Retries with error feedback if validation fails
  • ✅ Returns a properly typed struct

Multimodal extraction:

# Extract structured data from images
defmodule ImageAnalysisSchema do
  use Openrouter.Schema

  embedded_schema do
    field :description, :string
    field :objects, {:array, :string}
    field :colors, {:array, :string}
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:description, :objects, :colors])
    |> validate_required([:description])
  end
end

messages = [
  %{role: :user, content: [
    Openrouter.Content.text("Analyze this image"),
    Openrouter.Content.image_url("https://example.com/photo.jpg")
  ]}
]

{:ok, analysis} = Openrouter.extract(
  messages,
  schema: ImageAnalysisSchema,
  model: "openai/gpt-4o"
)
# Returns validated struct with description, objects, and colors

Nested schemas with embeds_one and embeds_many:

defmodule AddressSchema do
  use Openrouter.Schema

  embedded_schema do
    field :street, :string
    field :city, :string
    field :country, :string
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:street, :city, :country])
    |> validate_required([:city, :country])
  end
end

defmodule PersonSchema do
  use Openrouter.Schema

  embedded_schema do
    field :name, :string
    field :age, :integer
    embeds_one :address, AddressSchema  # Single nested object
    embeds_many :phone_numbers, PhoneSchema  # Array of objects (PhoneSchema defined like AddressSchema)
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:name, :age])
    |> cast_embed(:address)
    |> cast_embed(:phone_numbers)
    |> validate_required([:name])
  end
end

{:ok, person} = Openrouter.extract(
  "John Doe, 30 years old, lives at 123 Main St, New York, USA",
  schema: PersonSchema,
  model: "openai/gpt-4"
)

IO.puts(person.address.city)  # "New York"

Tool Calling & Agentic Workflows

Define tools that LLMs can call:

# Define a tool
weather_tool = Openrouter.Tool.new(
  :get_weather,
  "Get current weather for a location",
  fn %{location: location} ->
    # Your implementation
    {:ok, "Sunny, 72°F in #{location}"}
  end,
  parameters: %{
    location: [type: :string, required: true, description: "City name"]
  }
)

# Agent automatically handles tool execution loop
{:ok, result} = Openrouter.Agent.run(
  "What's the weather in Paris?",
  model: "openai/gpt-4",
  tools: [weather_tool]
)

IO.puts(result.content)
# => "The weather in Paris is sunny and 72°F"
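Conceptually, the agent loop keeps calling the model, executing any requested tools, and feeding the results back until the model answers without tool calls (or the iteration cap is hit). A simplified sketch of that loop (illustrative only, not the actual `Openrouter.Agent` source; `chat_fn` and `exec_fn` stand in for the model call and tool dispatch):

```elixir
defmodule AgentLoopSketch do
  # chat_fn.(messages) returns a map with :tool_calls and :content;
  # exec_fn.(call) runs one tool and returns its result.
  def run(messages, chat_fn, exec_fn, max_iterations \\ 10)

  def run(_messages, _chat_fn, _exec_fn, 0), do: {:error, :max_iterations_reached}

  def run(messages, chat_fn, exec_fn, n) do
    case chat_fn.(messages) do
      %{tool_calls: []} = response ->
        # Model answered directly: the loop terminates
        {:ok, response}

      %{tool_calls: calls} = response ->
        # One tool-result message per call, appended after the assistant turn
        results = Enum.map(calls, fn call -> %{role: :tool, content: exec_fn.(call)} end)
        run(messages ++ [response | results], chat_fn, exec_fn, n - 1)
    end
  end
end
```

The `max_iterations` counter is what the "Max Iterations Safety" feature above refers to: a model that keeps requesting tools can never loop forever.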

Dependency Injection with RunContext

Pass dependencies to context-aware tools:

# Define your dependencies
defmodule AppDeps do
  defstruct [:db_conn, :user_id, :api_key]
end

# Context-aware tool
balance_tool = Openrouter.Tool.new(
  :get_balance,
  "Get user account balance",
  fn ctx, _params ->
    # ctx.deps contains your AppDeps struct
    balance = MyApp.DB.get_balance(ctx.deps.db_conn, ctx.deps.user_id)
    {:ok, balance}
  end,
  context_aware: true
)

# Run with dependencies
deps = %AppDeps{db_conn: MyApp.Repo, user_id: 123, api_key: "secret"}

{:ok, result} = Openrouter.Agent.run(
  "What's my balance?",
  model: "openai/gpt-4",
  tools: [balance_tool],
  deps: deps
)

Conversation Management

Stateless (Functional):

{:ok, conv} = Openrouter.Conversation.start(
  model: "openai/gpt-4",
  system: "You are a helpful assistant"
)

conv = Openrouter.Conversation.user(conv, "Hello!")
{:ok, conv, response} = Openrouter.Conversation.complete(conv)

conv = Openrouter.Conversation.user(conv, "Tell me more")
{:ok, conv, response} = Openrouter.Conversation.complete(conv)

# Save for later
:ok = Openrouter.Conversation.save(conv, to: :ets)

Stateful (GenServer):

{:ok, pid} = Openrouter.ConversationServer.start_link(
  model: "openai/gpt-4",
  system: "You are a helpful assistant"
)

{:ok, response1} = Openrouter.ConversationServer.send_message(pid, "Hello!")
{:ok, response2} = Openrouter.ConversationServer.send_message(pid, "Tell me more")

# State is automatically maintained!

Multimodal Content

# Image from URL
{:ok, response} = Openrouter.chat([
  %{
    role: :user,
    content: [
      Openrouter.Content.text("What's in this image?"),
      Openrouter.Content.image_url("https://example.com/image.jpg")
    ]
  }
], model: "anthropic/claude-3.5-sonnet")

# Local image (base64 encoded)
image_data = File.read!("photo.jpg")
{:ok, response} = Openrouter.chat([
  %{
    role: :user,
    content: [
      Openrouter.Content.text("Describe this image"),
      Openrouter.Content.image(image_data, format: :jpeg)
    ]
  }
], model: "openai/gpt-4o")

# Content builder
content = Openrouter.Content.build([
  text: "Analyze these files",
  image_url: "https://example.com/chart.png",
  pdf: "https://example.com/report.pdf"
])

Embeddings

# Single text
{:ok, [embedding]} = Openrouter.embed(
  "The quick brown fox",
  model: "openai/text-embedding-3-small"
)

# Batch embeddings
texts = ["Hello world", "Goodbye world", "How are you?"]
{:ok, embeddings} = Openrouter.embed(texts, model: "openai/text-embedding-3-small")

# Calculate similarity between two embedding vectors
[emb1, emb2, _] = embeddings
similarity = Openrouter.embed_similarity(emb1, emb2)
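Embedding similarity is typically cosine similarity: the dot product of the two vectors divided by the product of their magnitudes. A pure-Elixir sketch of that computation (`CosineSketch` is illustrative, not part of the SDK):

```elixir
defmodule CosineSketch do
  # cosine(a, b) = dot(a, b) / (|a| * |b|), ranging from -1.0 to 1.0
  def cosine_similarity(a, b) when length(a) == length(b) do
    dot = Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    norm = fn v -> :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end)) end
    dot / (norm.(a) * norm.(b))
  end
end

CosineSketch.cosine_similarity([1.0, 0.0], [1.0, 0.0])
# => 1.0 (identical direction)
CosineSketch.cosine_similarity([1.0, 0.0], [0.0, 1.0])
# => 0.0 (orthogonal vectors)
```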

Retry Logic

# Automatic retry with exponential backoff
{:ok, response} = Openrouter.Retry.with_retry(
  fn -> Openrouter.chat("Hello", model: "openai/gpt-4") end,
  max_attempts: 5,
  base_delay: 1000,
  retry_on: [:rate_limit, :server_error, :timeout]
)
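With `base_delay: 1000`, exponential backoff means the wait before attempt n is roughly `base_delay * 2^(n - 1)`, usually with a cap (and often jitter). A sketch of just that arithmetic (`BackoffSketch` is illustrative, not the SDK's internal module; the cap value is an assumption):

```elixir
defmodule BackoffSketch do
  # Delay before attempt n: base * 2^(n - 1), capped at max_delay (ms)
  def delay(attempt, base_delay \\ 1_000, max_delay \\ 30_000) do
    min(base_delay * Integer.pow(2, attempt - 1), max_delay)
  end
end

BackoffSketch.delay(1)  # => 1000
BackoffSketch.delay(4)  # => 8000
BackoffSketch.delay(10) # => 30000 (capped)
```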

Telemetry & Observability

# Attach default handler
Openrouter.Telemetry.attach_default_handler(level: :info)

# Or custom handler
:telemetry.attach(
  "my-handler",
  [:openrouter, :request, :stop],
  fn _event, measurements, metadata, _config ->
    Logger.info("Request completed",
      duration: measurements.duration,
      model: metadata.model,
      tokens: metadata.tokens
    )
  end,
  nil
)

Cost Tracking & Budgeting

Track and control AI spending with built-in cost tracking utilities:

# Enable cost tracking in API requests
{:ok, response} = Openrouter.chat(
  "Hello",
  model: "openai/gpt-4",
  usage: %{include: true}  # ← Enable cost tracking
)

# Cost information in response
IO.puts("Cost: $#{response.usage.total_cost}")
IO.puts("Tokens: #{response.usage.total_tokens}")
IO.puts("Native tokens (billing): #{response.usage.native_tokens_prompt + response.usage.native_tokens_completion}")

Estimate costs before making requests:

# Estimate token count and cost
messages = [%{role: :user, content: "Write a poem about Elixir"}]

{:ok, estimate} = Openrouter.TokenCounter.estimate_cost(
  messages,
  model: "openai/gpt-4",
  max_tokens: 500
)

IO.puts("Estimated cost: $#{estimate.total_cost}")
IO.puts("Input tokens: #{estimate.input_tokens}")
IO.puts("Output tokens: #{estimate.output_tokens}")
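Without a model-specific tokenizer, token counts are commonly estimated with a characters-per-token heuristic (roughly 4 characters per token for English), and cost is tokens times the per-token price. A rough sketch of the arithmetic; the heuristic and the prices below are illustrative assumptions, not real OpenRouter figures:

```elixir
defmodule CostSketch do
  # Heuristic: ~4 characters per token for English text
  def estimate_tokens(text), do: div(String.length(text), 4) + 1

  # in_per_1k / out_per_1k are USD prices per 1000 tokens (made-up values below)
  def estimate_cost(input_tokens, output_tokens, in_per_1k, out_per_1k) do
    input_tokens / 1000 * in_per_1k + output_tokens / 1000 * out_per_1k
  end
end

tokens = CostSketch.estimate_tokens("Write a poem about Elixir")
# => 7
CostSketch.estimate_cost(tokens, 500, 0.03, 0.06)
# input cost + output cost, dominated by the 500 output tokens
```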

Track spending with CostTracker:

# Start tracker with budget
{:ok, tracker} = Openrouter.CostTracker.start_link(budget: 50.00)

# Check budget before making requests
case Openrouter.CostTracker.check_budget(tracker) do
  :ok ->
    {:ok, response} = Openrouter.chat("Hello", usage: %{include: true})
    :ok = Openrouter.CostTracker.track(tracker, response)

  {:warning, remaining} ->
    Logger.warning("Only $#{remaining} remaining in budget!")

  {:exceeded, amount} ->
    Logger.error("Budget exceeded by $#{amount}")
end

# Get usage statistics
stats = Openrouter.CostTracker.get_stats(tracker)
IO.puts("Total spent: $#{stats.total_cost}")
IO.puts("By model: #{inspect(stats.by_model)}")
IO.puts("By session: #{inspect(stats.by_session)}")

# Generate detailed report
report = Openrouter.CostTracker.format_report(tracker)
IO.puts(report)

Per-session cost tracking:

{:ok, tracker} = Openrouter.CostTracker.start_link()

# Track costs per conversation
{:ok, response} = Openrouter.chat("Hello", usage: %{include: true})
:ok = Openrouter.CostTracker.track(tracker, response, session_id: "conv-123")

# Get session-specific costs
session_stats = Openrouter.CostTracker.get_session_stats(tracker, "conv-123")
IO.puts("Conversation cost: $#{session_stats.cost}")

Response Caching

Reduce costs and latency by caching LLM responses and embeddings:

# Start cache server
{:ok, cache} = Openrouter.Cache.start_link(
  backend: :ets,
  max_size: 10_000,
  default_ttl: :timer.hours(24),
  eviction_policy: :lru
)

# Cache chat responses automatically
cache_key = Openrouter.Cache.chat_key(messages, model: "openai/gpt-4", temperature: 0)

result = Openrouter.Cache.fetch(cache, cache_key, fn ->
  Openrouter.chat(messages, model: "openai/gpt-4", temperature: 0)
end, ttl: :timer.hours(24))

# Cache embeddings (deterministic, so use infinite TTL)
embedding = Openrouter.Cache.fetch_embedding(cache, "hello world",
  model: "openai/text-embedding-3-small",
  ttl: :infinity,
  compute_fn: fn -> Openrouter.embed("hello world", model: "openai/text-embedding-3-small") end
)

# View cache statistics
stats = Openrouter.Cache.stats(cache)
IO.puts("Hit rate: #{stats.hit_rate * 100}%")
IO.puts("Cache size: #{stats.size} entries")

Features:

  • In-memory (map) and ETS backends
  • TTL-based expiration
  • LRU and FIFO eviction policies
  • Automatic cache key generation
  • Statistics tracking (hits, misses, memory usage)
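Automatic cache key generation usually works by hashing every request parameter that can change the response (messages, model, temperature, and so on). One way to sketch deterministic key generation (illustrative only, not the SDK's actual algorithm):

```elixir
defmodule CacheKeySketch do
  # Sort the options so key order doesn't change the hash, then
  # serialize and hash everything that shapes the response.
  def chat_key(messages, opts) do
    {messages, Enum.sort(opts)}
    |> :erlang.term_to_binary()
    |> then(&:crypto.hash(:sha256, &1))
    |> Base.encode16(case: :lower)
  end
end

k1 = CacheKeySketch.chat_key([%{role: :user, content: "hi"}], model: "openai/gpt-4", temperature: 0)
k2 = CacheKeySketch.chat_key([%{role: :user, content: "hi"}], temperature: 0, model: "openai/gpt-4")
k1 == k2  # => true (option order does not matter)
```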

Prompt Templates

Create reusable prompt templates with variable substitution and conditionals:

# Define a template
template = Openrouter.PromptTemplate.new("""
You are a {{role}} expert in {{domain}}.

User Question: {{question}}

{{#if context}}
Relevant Context:
{{context}}
{{/if}}

Please provide a {{style}} answer.
""",
  defaults: %{style: "detailed", role: "helpful assistant"}
)

# Render with variables
{:ok, prompt} = Openrouter.PromptTemplate.render(template,
  domain: "Elixir programming",
  question: "How do GenServers work?",
  context: "The user is building a real-time chat application",
  style: "concise"
)

# Use in API call
{:ok, response} = Openrouter.chat(prompt, model: "openai/gpt-4")

Features:

  • Variable substitution with {{variable}} syntax
  • Conditional blocks with {{#if var}}...{{/if}}
  • Default values for optional variables
  • Template composition (combine multiple templates)
  • Load templates from files
  • Validation for missing required variables
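The {{variable}} substitution itself can be sketched as a regex replace over an assigns map (a minimal illustration; the SDK's real renderer also handles {{#if}} blocks, defaults, and missing-variable validation):

```elixir
defmodule TemplateSketch do
  # Replace each {{name}} with the corresponding value from assigns;
  # Map.fetch! raises if a referenced variable is missing.
  def render(template, assigns) do
    Regex.replace(~r/\{\{(\w+)\}\}/, template, fn _whole, name ->
      assigns |> Map.fetch!(String.to_existing_atom(name)) |> to_string()
    end)
  end
end

TemplateSketch.render("Hello, {{name}}! You are {{age}}.", %{name: "Ada", age: 36})
# => "Hello, Ada! You are 36."
```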

Examples

The examples/ directory contains comprehensive examples:

Core Features

  • basic_usage.exs - Chat, streaming, embeddings
  • structured_outputs.exs - Data extraction with Ecto schemas
  • nested_schemas.exs - Complex nested data structures (embeds_one, embeds_many)
  • production_features.exs - Multimodal, retry, telemetry
  • tool_calling.exs - Tool/function calling with agents
  • run_context.exs - Dependency injection patterns
  • conversation.exs - Stateless and stateful conversations
  • cost_tracking.exs - Cost tracking, budgeting, and token estimation
  • caching.exs - Response and embedding caching with TTL and eviction
  • prompt_templates.exs - Reusable prompts with variables and conditionals

Advanced Patterns

  • rag.exs - RAG (Retrieval Augmented Generation) with vector search
  • web_search.exs - Web search integration and multi-tool agents
  • phoenix_liveview.exs - Complete Phoenix LiveView chat application
  • multi_agent.exs - Multi-agent collaboration and coordination

Run with:

mix run examples/basic_usage.exs
mix run examples/cost_tracking.exs
mix run examples/caching.exs
mix run examples/prompt_templates.exs
mix run examples/rag.exs
mix run examples/phoenix_liveview.exs

Phoenix Integration

LiveView Streaming

defmodule MyAppWeb.ChatLive do
  use Phoenix.LiveView

  def handle_event("send_message", %{"message" => msg}, socket) do
    lv = self()

    # Consume the stream inside a Task so the LiveView process stays responsive
    task =
      Task.async(fn ->
        {:ok, stream} = Openrouter.chat_stream(msg, model: "openai/gpt-4")
        Enum.each(stream, fn chunk -> send(lv, {:chunk, chunk}) end)
        :done
      end)

    {:noreply, assign(socket, task: task, streaming: true)}
  end

  def handle_info({:chunk, %{content: text}}, socket) do
    # Update UI with new text
    {:noreply, stream_insert(socket, :chunks, %{text: text})}
  end

  def handle_info({ref, :done}, socket) when socket.assigns.task.ref == ref do
    Process.demonitor(ref, [:flush])
    {:noreply, assign(socket, streaming: false)}
  end
end

With ConversationServer

defmodule MyApp.ChatSession do
  use Openrouter.ConversationServer

  def start_link(user_id) do
    Openrouter.ConversationServer.start_link(__MODULE__,
      name: via_tuple(user_id),
      model: "openai/gpt-4",
      system: "You are a helpful assistant"
    )
  end

  defp via_tuple(user_id) do
    {:via, Registry, {MyApp.Registry, {__MODULE__, user_id}}}
  end
end

# In your application supervisor
children = [
  {Registry, keys: :unique, name: MyApp.Registry},
  # ... other children
]

# Usage
{:ok, _pid} = MyApp.ChatSession.start_link(user.id)
{:ok, response} = Openrouter.ConversationServer.send_message(
  {:via, Registry, {MyApp.Registry, {MyApp.ChatSession, user.id}}},
  "Hello!"
)

Testing

The library includes 1700+ tests covering:

  • Unit tests: All modules with comprehensive coverage
  • Integration tests: Real API calls (with Reqord support for record/replay)

Run tests:

# Unit tests only (no API key needed)
mix test

# Integration tests (requires API key)
OPENROUTER_API_KEY=your_key mix test --only integration

# Specific test suites
mix test --only chat
mix test --only streaming
mix test --only structured_outputs
mix test --only tool_calling
mix test --only conversation

Using Reqord for Record/Replay

# Record API interactions
REQORD_MODE=record OPENROUTER_API_KEY=your_key mix test --only integration

# Replay recorded interactions (no API key needed)
REQORD_MODE=replay mix test --only integration

Configuration

# config/config.exs
config :openrouter,
  api_key: System.get_env("OPENROUTER_API_KEY"),
  base_url: "https://openrouter.ai/api/v1",
  default_model: "anthropic/claude-3.5-sonnet",
  app_name: "my-app",  # Optional: for OpenRouter tracking
  site_url: "https://myapp.com"  # Optional

# config/runtime.exs (recommended for production)
config :openrouter,
  api_key: System.fetch_env!("OPENROUTER_API_KEY")

Architecture & Design

This library is heavily inspired by Pydantic AI and adopts many of its best patterns:

  • Dependency Injection via RunContext - Type-safe context passing
  • Generic Agent Types - Agents typed over dependencies
  • Structured Outputs - Validation with automatic retry
  • Progressive Disclosure - Simple for basic use, powerful for advanced
  • Testing-First - Easy mocking with behaviors
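The "easy mocking with behaviors" point follows the standard Elixir pattern: production code depends on a behaviour, and tests swap in a double via configuration. A minimal illustration (module names here are hypothetical, not the SDK's actual behaviour modules):

```elixir
# The contract that both the real client and any test double implement
defmodule LLMClient do
  @callback chat(String.t(), keyword()) :: {:ok, map()} | {:error, term()}
end

defmodule FakeLLMClient do
  @behaviour LLMClient

  # Canned response: no network, no API key, safe for CI
  @impl true
  def chat(_prompt, _opts), do: {:ok, %{content: "stubbed reply"}}
end

# Production code would resolve the module from config, e.g.
#   client = Application.get_env(:my_app, :llm_client, Openrouter)
{:ok, response} = FakeLLMClient.chat("Hello", model: "openai/gpt-4")
response.content  # => "stubbed reply"
```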

See DESIGN.md for the complete design document.

Development Status

✅ Phase 1: Core Foundation - COMPLETE

  • Core types (Message, Response, Error, Usage)
  • HTTP layer with Req
  • OpenRouter provider
  • Basic streaming
  • Configuration & validation

✅ Phase 2: Structured Outputs - COMPLETE

  • Ecto schema integration
  • JSON schema generation
  • Automatic validation
  • Retry with error feedback
  • Complex type support

✅ Phase 3: Production Features - COMPLETE

  • Multimodal content (images, video, PDFs)
  • Retry logic with exponential backoff
  • Comprehensive telemetry
  • Content builders
  • Production observability

✅ Phase 4: Agentic Workflows - COMPLETE

  • RunContext & dependency injection
  • Tool/function calling
  • Agent framework with automatic execution
  • Conversation management (stateless)
  • ConversationServer (stateful GenServer)

🚧 Phase 5: Polish & Documentation - IN PROGRESS

  • Comprehensive documentation
  • More examples
  • Performance optimization
  • Production guides

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for your changes
  4. Ensure all tests pass
  5. Submit a pull request

Roadmap

  • ✅ More examples (RAG, web search, multi-agent, Phoenix LiveView)
  • ✅ Cost tracking and budgeting
  • ✅ Token counting utilities
  • ✅ Prompt template management
  • ✅ Response and embedding caching
  • Production deployment guide
  • Additional persistence backends (Postgres, Mnesia)
  • Performance benchmarks
  • Prompt caching optimization (OpenRouter native)
  • More model provider support (Ollama, local models)
  • Circuit breaker and rate limiting utilities

License

MIT License - see LICENSE for details

Credits

Inspired by Pydantic AI and FastAPI. Built with Req, Finch, and Ecto.
