Openrouter.ex


A production-ready Elixir SDK for OpenRouter, bringing the best of Pydantic AI and FastAPI patterns to Elixir AI development.

Why OpenRouter?

OpenRouter provides unified access to 200+ AI models through a single API:

  • OpenAI: GPT-4, GPT-3.5, o1, o1-mini, etc.
  • Anthropic: Claude 3.5 Sonnet, Claude 4, etc.
  • Google: Gemini 2.0 Flash, Gemini Pro, etc.
  • Meta: Llama 3.3 70B, Llama 3.1, etc.
  • Mistral: Mistral Large, Mixtral, etc.
  • And many more

One SDK, all models. No need to build separate clients for each provider.

Features

🎯 Backend-First Design

  • Production-ready with proper OTP supervision
  • GenServer-based stateful conversations
  • Connection pooling via Req/Finch
  • Built-in telemetry and observability
  • Exponential backoff retry logic

🤖 Agentic Workflows

  • Tool/Function Calling: LLMs can call Elixir functions
  • Automatic Tool Execution: Agent loops handle tool calls automatically
  • RunContext: Type-safe dependency injection (inspired by Pydantic AI)
  • Conversation Management: Both stateless and stateful APIs
  • Max Iterations Safety: Prevents infinite loops

🔒 Type Safety & Validation

  • Structured Outputs: Extract validated data with Ecto schemas
  • Automatic Retry: Retry with error feedback when validation fails
  • JSON Schema Generation: From your Ecto schemas
  • Compile-time Safety: Full typespec coverage

🎨 Multimodal Support

  • Images: URLs or base64-encoded (JPEG, PNG, GIF, WebP)
  • Video: MP4, WebM support
  • PDFs: Document processing
  • Audio: Audio file support
  • Content Builders: Ergonomic helpers for complex content

🔌 Phoenix Integration Ready

  • LiveView streaming support
  • Phoenix Channels integration
  • Oban background job examples
  • Supervision tree compatible

📊 Production Features

  • Cost Tracking & Budgeting: Track spending, set budgets, estimate costs
  • Token Counting: Estimate tokens and costs before API calls
  • Response Caching: Cache LLM responses and embeddings with TTL support
  • Prompt Templates: Reusable templates with variables and conditionals
  • Comprehensive telemetry events
  • Retry logic with exponential backoff
  • Rate limit handling
  • Error types and recovery
  • 1700+ tests (unit + integration)

Quick Start

Installation

Add to your mix.exs:

def deps do
  [
    {:openrouter, "~> 0.1.0"}
  ]
end

Configure your API key:

# config/config.exs
config :openrouter,
  api_key: System.get_env("OPENROUTER_API_KEY")

# Or in config/runtime.exs (recommended for production)
config :openrouter,
  api_key: System.fetch_env!("OPENROUTER_API_KEY")

Simple Chat

# Basic question
{:ok, response} = Openrouter.chat(
  "What is the capital of France?",
  model: "anthropic/claude-3.5-sonnet"
)

IO.puts(response.content)
# => "The capital of France is Paris."

# With conversation history
messages = [
  %{role: :system, content: "You are a helpful assistant"},
  %{role: :user, content: "Hello!"},
  %{role: :assistant, content: "Hi! How can I help?"},
  %{role: :user, content: "What's the weather?"}
]

{:ok, response} = Openrouter.chat(messages, model: "openai/gpt-4")

Streaming Responses

{:ok, stream} = Openrouter.chat_stream(
  "Write me a story about a robot",
  model: "openai/gpt-4"
)

stream
|> Stream.each(fn chunk -> IO.write(chunk.content) end)
|> Stream.run()

Structured Data Extraction

Extract validated, typed data from unstructured text:

defmodule RecipeSchema do
  use Openrouter.Schema

  embedded_schema do
    field :name, :string
    field :ingredients, {:array, :string}
    field :steps, {:array, :string}
    field :prep_time, :integer
    field :difficulty, :string
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:name, :ingredients, :steps, :prep_time, :difficulty])
    |> validate_required([:name, :ingredients, :steps])
    |> validate_inclusion(:difficulty, ["easy", "medium", "hard"])
  end
end

{:ok, recipe} = Openrouter.extract(
  "Give me a recipe for chocolate chip cookies",
  schema: RecipeSchema,
  model: "openai/gpt-4"
)

# recipe is a validated RecipeSchema struct!
IO.inspect(recipe.name)
IO.inspect(recipe.ingredients)

The library automatically:

  • ✅ Generates JSON schema from your Ecto schema
  • ✅ Uses OpenRouter's native response_format for structured outputs
  • ✅ Validates the LLM response against your schema
  • ✅ Retries with error feedback if validation fails
  • ✅ Returns a properly typed struct

Multimodal extraction:

# Extract structured data from images
defmodule ImageAnalysisSchema do
  use Openrouter.Schema

  embedded_schema do
    field :description, :string
    field :objects, {:array, :string}
    field :colors, {:array, :string}
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:description, :objects, :colors])
    |> validate_required([:description])
  end
end

messages = [
  %{role: :user, content: [
    Openrouter.Content.text("Analyze this image"),
    Openrouter.Content.image_url("https://example.com/photo.jpg")
  ]}
]

{:ok, analysis} = Openrouter.extract(
  messages,
  schema: ImageAnalysisSchema,
  model: "openai/gpt-4o"
)
# Returns validated struct with description, objects, and colors

Nested schemas with embeds_one and embeds_many:

defmodule AddressSchema do
  use Openrouter.Schema

  embedded_schema do
    field :street, :string
    field :city, :string
    field :country, :string
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:street, :city, :country])
    |> validate_required([:city, :country])
  end
end

defmodule PersonSchema do
  use Openrouter.Schema

  embedded_schema do
    field :name, :string
    field :age, :integer
    embeds_one :address, AddressSchema  # Single nested object
    embeds_many :phone_numbers, PhoneSchema  # Array of objects (PhoneSchema defined like AddressSchema)
  end

  def changeset(schema, attrs) do
    schema
    |> cast(attrs, [:name, :age])
    |> cast_embed(:address)
    |> cast_embed(:phone_numbers)
    |> validate_required([:name])
  end
end

{:ok, person} = Openrouter.extract(
  "John Doe, 30 years old, lives at 123 Main St, New York, USA",
  schema: PersonSchema,
  model: "openai/gpt-4"
)

IO.puts(person.address.city)  # "New York"

Tool Calling & Agentic Workflows

Define tools that LLMs can call:

# Define a tool
weather_tool = Openrouter.Tool.new(
  :get_weather,
  "Get current weather for a location",
  fn %{location: location} ->
    # Your implementation
    {:ok, "Sunny, 72°F in #{location}"}
  end,
  parameters: %{
    location: [type: :string, required: true, description: "City name"]
  }
)

# Agent automatically handles tool execution loop
{:ok, result} = Openrouter.Agent.run(
  "What's the weather in Paris?",
  model: "openai/gpt-4",
  tools: [weather_tool]
)

IO.puts(result.content)
# => "The weather in Paris is sunny and 72°F"
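Conceptually, the agent loop keeps calling the model, executing any requested tools, and feeding the results back until the model answers without tool calls (or the iteration cap is hit). A simplified sketch of that loop (illustrative only, not the actual `Openrouter.Agent` source; `chat_fn` and `exec_fn` stand in for the model call and tool dispatch):

```elixir
defmodule AgentLoopSketch do
  # chat_fn.(messages) returns a map with :tool_calls and :content;
  # exec_fn.(call) runs one tool and returns its result.
  def run(messages, chat_fn, exec_fn, max_iterations \\ 10)

  def run(_messages, _chat_fn, _exec_fn, 0), do: {:error, :max_iterations_reached}

  def run(messages, chat_fn, exec_fn, n) do
    case chat_fn.(messages) do
      %{tool_calls: []} = response ->
        # Model answered directly: the loop terminates
        {:ok, response}

      %{tool_calls: calls} = response ->
        # One tool-result message per call, appended after the assistant turn
        results = Enum.map(calls, fn call -> %{role: :tool, content: exec_fn.(call)} end)
        run(messages ++ [response | results], chat_fn, exec_fn, n - 1)
    end
  end
end
```

The `max_iterations` counter is what the "Max Iterations Safety" feature above refers to: a model that keeps requesting tools can never loop forever.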

Dependency Injection with RunContext

Pass dependencies to context-aware tools:

# Define your dependencies
defmodule AppDeps do
  defstruct [:db_conn, :user_id, :api_key]
end

# Context-aware tool
balance_tool = Openrouter.Tool.new(
  :get_balance,
  "Get user account balance",
  fn ctx, _params ->
    # ctx.deps contains your AppDeps struct
    balance = MyApp.DB.get_balance(ctx.deps.db_conn, ctx.deps.user_id)
    {:ok, balance}
  end,
  context_aware: true
)

# Run with dependencies
deps = %AppDeps{db_conn: MyApp.Repo, user_id: 123, api_key: "secret"}

{:ok, result} = Openrouter.Agent.run(
  "What's my balance?",
  model: "openai/gpt-4",
  tools: [balance_tool],
  deps: deps
)

Conversation Management

Stateless (Functional):

{:ok, conv} = Openrouter.Conversation.start(
  model: "openai/gpt-4",
  system: "You are a helpful assistant"
)

conv = Openrouter.Conversation.user(conv, "Hello!")
{:ok, conv, response} = Openrouter.Conversation.complete(conv)

conv = Openrouter.Conversation.user(conv, "Tell me more")
{:ok, conv, response} = Openrouter.Conversation.complete(conv)

# Save for later
:ok = Openrouter.Conversation.save(conv, to: :ets)

Stateful (GenServer):

{:ok, pid} = Openrouter.ConversationServer.start_link(
  model: "openai/gpt-4",
  system: "You are a helpful assistant"
)

{:ok, response1} = Openrouter.ConversationServer.send_message(pid, "Hello!")
{:ok, response2} = Openrouter.ConversationServer.send_message(pid, "Tell me more")

# State is automatically maintained!

Multimodal Content

# Image from URL
{:ok, response} = Openrouter.chat([
  %{
    role: :user,
    content: [
      Openrouter.Content.text("What's in this image?"),
      Openrouter.Content.image_url("https://example.com/image.jpg")
    ]
  }
], model: "anthropic/claude-3.5-sonnet")

# Local image (base64 encoded)
image_data = File.read!("photo.jpg")
{:ok, response} = Openrouter.chat([
  %{
    role: :user,
    content: [
      Openrouter.Content.text("Describe this image"),
      Openrouter.Content.image(image_data, format: :jpeg)
    ]
  }
], model: "openai/gpt-4o")

# Content builder
content = Openrouter.Content.build([
  text: "Analyze these files",
  image_url: "https://example.com/chart.png",
  pdf: "https://example.com/report.pdf"
])

Embeddings

# Single text
{:ok, [embedding]} = Openrouter.embed(
  "The quick brown fox",
  model: "openai/text-embedding-3-small"
)

# Batch embeddings
texts = ["Hello world", "Goodbye world", "How are you?"]
{:ok, embeddings} = Openrouter.embed(texts, model: "openai/text-embedding-3-small")

# Calculate similarity between two embedding vectors
[emb1, emb2, _] = embeddings
similarity = Openrouter.embed_similarity(emb1, emb2)
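Embedding similarity is typically cosine similarity: the dot product of the two vectors divided by the product of their magnitudes. A pure-Elixir sketch of that computation (`CosineSketch` is illustrative, not part of the SDK):

```elixir
defmodule CosineSketch do
  # cosine(a, b) = dot(a, b) / (|a| * |b|), ranging from -1.0 to 1.0
  def cosine_similarity(a, b) when length(a) == length(b) do
    dot = Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    norm = fn v -> :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end)) end
    dot / (norm.(a) * norm.(b))
  end
end

CosineSketch.cosine_similarity([1.0, 0.0], [1.0, 0.0])
# => 1.0 (identical direction)
CosineSketch.cosine_similarity([1.0, 0.0], [0.0, 1.0])
# => 0.0 (orthogonal vectors)
```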

Retry Logic

# Automatic retry with exponential backoff
{:ok, response} = Openrouter.Retry.with_retry(
  fn -> Openrouter.chat("Hello", model: "openai/gpt-4") end,
  max_attempts: 5,
  base_delay: 1000,
  retry_on: [:rate_limit, :server_error, :timeout]
)
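With `base_delay: 1000`, exponential backoff means the wait before attempt n is roughly `base_delay * 2^(n - 1)`, usually with a cap (and often jitter). A sketch of just that arithmetic (`BackoffSketch` is illustrative, not the SDK's internal module; the cap value is an assumption):

```elixir
defmodule BackoffSketch do
  # Delay before attempt n: base * 2^(n - 1), capped at max_delay (ms)
  def delay(attempt, base_delay \\ 1_000, max_delay \\ 30_000) do
    min(base_delay * Integer.pow(2, attempt - 1), max_delay)
  end
end

BackoffSketch.delay(1)  # => 1000
BackoffSketch.delay(4)  # => 8000
BackoffSketch.delay(10) # => 30000 (capped)
```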

Telemetry & Observability

# Attach default handler
Openrouter.Telemetry.attach_default_handler(level: :info)

# Or custom handler
:telemetry.attach(
  "my-handler",
  [:openrouter, :request, :stop],
  fn _event, measurements, metadata, _config ->
    Logger.info("Request completed",
      duration: measurements.duration,
      model: metadata.model,
      tokens: metadata.tokens
    )
  end,
  nil
)

Cost Tracking & Budgeting

Track and control AI spending with built-in cost tracking utilities:

# Enable cost tracking in API requests
{:ok, response} = Openrouter.chat(
  "Hello",
  model: "openai/gpt-4",
  usage: %{include: true}  # ← Enable cost tracking
)

# Cost information in response
IO.puts("Cost: $#{response.usage.total_cost}")
IO.puts("Tokens: #{response.usage.total_tokens}")
IO.puts("Native tokens (billing): #{response.usage.native_tokens_prompt + response.usage.native_tokens_completion}")

Estimate costs before making requests:

# Estimate token count and cost
messages = [%{role: :user, content: "Write a poem about Elixir"}]

{:ok, estimate} = Openrouter.TokenCounter.estimate_cost(
  messages,
  model: "openai/gpt-4",
  max_tokens: 500
)

IO.puts("Estimated cost: $#{estimate.total_cost}")
IO.puts("Input tokens: #{estimate.input_tokens}")
IO.puts("Output tokens: #{estimate.output_tokens}")
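Without a model-specific tokenizer, token counts are commonly estimated with a characters-per-token heuristic (roughly 4 characters per token for English), and cost is tokens times the per-token price. A rough sketch of the arithmetic; the heuristic and the prices below are illustrative assumptions, not real OpenRouter figures:

```elixir
defmodule CostSketch do
  # Heuristic: ~4 characters per token for English text
  def estimate_tokens(text), do: div(String.length(text), 4) + 1

  # in_per_1k / out_per_1k are USD prices per 1000 tokens (made-up values below)
  def estimate_cost(input_tokens, output_tokens, in_per_1k, out_per_1k) do
    input_tokens / 1000 * in_per_1k + output_tokens / 1000 * out_per_1k
  end
end

tokens = CostSketch.estimate_tokens("Write a poem about Elixir")
# => 7
CostSketch.estimate_cost(tokens, 500, 0.03, 0.06)
# input cost + output cost, dominated by the 500 output tokens
```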

Track spending with CostTracker:

# Start tracker with budget
{:ok, tracker} = Openrouter.CostTracker.start_link(budget: 50.00)

# Check budget before making requests
case Openrouter.CostTracker.check_budget(tracker) do
  :ok ->
    {:ok, response} = Openrouter.chat("Hello", usage: %{include: true})
    :ok = Openrouter.CostTracker.track(tracker, response)

  {:warning, remaining} ->
    Logger.warning("Only $#{remaining} remaining in budget!")

  {:exceeded, amount} ->
    Logger.error("Budget exceeded by $#{amount}")
end

# Get usage statistics
stats = Openrouter.CostTracker.get_stats(tracker)
IO.puts("Total spent: $#{stats.total_cost}")
IO.puts("By model: #{inspect(stats.by_model)}")
IO.puts("By session: #{inspect(stats.by_session)}")

# Generate detailed report
report = Openrouter.CostTracker.format_report(tracker)
IO.puts(report)

Per-session cost tracking:

{:ok, tracker} = Openrouter.CostTracker.start_link()

# Track costs per conversation
{:ok, response} = Openrouter.chat("Hello", usage: %{include: true})
:ok = Openrouter.CostTracker.track(tracker, response, session_id: "conv-123")

# Get session-specific costs
session_stats = Openrouter.CostTracker.get_session_stats(tracker, "conv-123")
IO.puts("Conversation cost: $#{session_stats.cost}")

Response Caching

Reduce costs and latency by caching LLM responses and embeddings:

# Start cache server
{:ok, cache} = Openrouter.Cache.start_link(
  backend: :ets,
  max_size: 10_000,
  default_ttl: :timer.hours(24),
  eviction_policy: :lru
)

# Cache chat responses automatically
cache_key = Openrouter.Cache.chat_key(messages, model: "openai/gpt-4", temperature: 0)

result = Openrouter.Cache.fetch(cache, cache_key, fn ->
  Openrouter.chat(messages, model: "openai/gpt-4", temperature: 0)
end, ttl: :timer.hours(24))

# Cache embeddings (deterministic, so use infinite TTL)
embedding = Openrouter.Cache.fetch_embedding(cache, "hello world",
  model: "openai/text-embedding-3-small",
  ttl: :infinity,
  compute_fn: fn -> Openrouter.embed("hello world", model: "openai/text-embedding-3-small") end
)

# View cache statistics
stats = Openrouter.Cache.stats(cache)
IO.puts("Hit rate: #{stats.hit_rate * 100}%")
IO.puts("Cache size: #{stats.size} entries")

Features:

  • In-memory (map) and ETS backends
  • TTL-based expiration
  • LRU and FIFO eviction policies
  • Automatic cache key generation
  • Statistics tracking (hits, misses, memory usage)
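Automatic cache key generation usually works by hashing every request parameter that can change the response (messages, model, temperature, and so on). One way to sketch deterministic key generation (illustrative only, not the SDK's actual algorithm):

```elixir
defmodule CacheKeySketch do
  # Sort the options so key order doesn't change the hash, then
  # serialize and hash everything that shapes the response.
  def chat_key(messages, opts) do
    {messages, Enum.sort(opts)}
    |> :erlang.term_to_binary()
    |> then(&:crypto.hash(:sha256, &1))
    |> Base.encode16(case: :lower)
  end
end

k1 = CacheKeySketch.chat_key([%{role: :user, content: "hi"}], model: "openai/gpt-4", temperature: 0)
k2 = CacheKeySketch.chat_key([%{role: :user, content: "hi"}], temperature: 0, model: "openai/gpt-4")
k1 == k2  # => true (option order does not matter)
```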

Prompt Templates

Create reusable prompt templates with variable substitution and conditionals:

# Define a template
template = Openrouter.PromptTemplate.new("""
You are a {{role}} expert in {{domain}}.

User Question: {{question}}

{{#if context}}
Relevant Context:
{{context}}
{{/if}}

Please provide a {{style}} answer.
""",
  defaults: %{style: "detailed", role: "helpful assistant"}
)

# Render with variables
{:ok, prompt} = Openrouter.PromptTemplate.render(template,
  domain: "Elixir programming",
  question: "How do GenServers work?",
  context: "The user is building a real-time chat application",
  style: "concise"
)

# Use in API call
{:ok, response} = Openrouter.chat(prompt, model: "openai/gpt-4")

Features:

  • Variable substitution with {{variable}} syntax
  • Conditional blocks with {{#if var}}...{{/if}}
  • Default values for optional variables
  • Template composition (combine multiple templates)
  • Load templates from files
  • Validation for missing required variables
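The {{variable}} substitution itself can be sketched as a regex replace over an assigns map (a minimal illustration; the SDK's real renderer also handles {{#if}} blocks, defaults, and missing-variable validation):

```elixir
defmodule TemplateSketch do
  # Replace each {{name}} with the corresponding value from assigns;
  # Map.fetch! raises if a referenced variable is missing.
  def render(template, assigns) do
    Regex.replace(~r/\{\{(\w+)\}\}/, template, fn _whole, name ->
      assigns |> Map.fetch!(String.to_existing_atom(name)) |> to_string()
    end)
  end
end

TemplateSketch.render("Hello, {{name}}! You are {{age}}.", %{name: "Ada", age: 36})
# => "Hello, Ada! You are 36."
```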

Examples

The examples/ directory contains comprehensive examples:

Core Features

  • basic_usage.exs - Chat, streaming, embeddings
  • structured_outputs.exs - Data extraction with Ecto schemas
  • nested_schemas.exs - Complex nested data structures (embeds_one, embeds_many)
  • production_features.exs - Multimodal, retry, telemetry
  • tool_calling.exs - Tool/function calling with agents
  • run_context.exs - Dependency injection patterns
  • conversation.exs - Stateless and stateful conversations
  • cost_tracking.exs - Cost tracking, budgeting, and token estimation
  • caching.exs - Response and embedding caching with TTL and eviction
  • prompt_templates.exs - Reusable prompts with variables and conditionals

Advanced Patterns

  • rag.exs - RAG (Retrieval Augmented Generation) with vector search
  • web_search.exs - Web search integration and multi-tool agents
  • phoenix_liveview.exs - Complete Phoenix LiveView chat application
  • multi_agent.exs - Multi-agent collaboration and coordination

Run with:

mix run examples/basic_usage.exs
mix run examples/cost_tracking.exs
mix run examples/caching.exs
mix run examples/prompt_templates.exs
mix run examples/rag.exs
mix run examples/phoenix_liveview.exs

Phoenix Integration

LiveView Streaming

defmodule MyAppWeb.ChatLive do
  use Phoenix.LiveView

  def handle_event("send_message", %{"message" => msg}, socket) do
    lv = self()

    # Consume the stream inside a Task so the LiveView process stays responsive
    task =
      Task.async(fn ->
        {:ok, stream} = Openrouter.chat_stream(msg, model: "openai/gpt-4")
        Enum.each(stream, fn chunk -> send(lv, {:chunk, chunk}) end)
        :done
      end)

    {:noreply, assign(socket, task: task, streaming: true)}
  end

  def handle_info({:chunk, %{content: text}}, socket) do
    # Update UI with new text
    {:noreply, stream_insert(socket, :chunks, %{text: text})}
  end

  def handle_info({ref, :done}, socket) when socket.assigns.task.ref == ref do
    Process.demonitor(ref, [:flush])
    {:noreply, assign(socket, streaming: false)}
  end
end

With ConversationServer

defmodule MyApp.ChatSession do
  use Openrouter.ConversationServer

  def start_link(user_id) do
    Openrouter.ConversationServer.start_link(__MODULE__,
      name: via_tuple(user_id),
      model: "openai/gpt-4",
      system: "You are a helpful assistant"
    )
  end

  defp via_tuple(user_id) do
    {:via, Registry, {MyApp.Registry, {__MODULE__, user_id}}}
  end
end

# In your application supervisor
children = [
  {Registry, keys: :unique, name: MyApp.Registry},
  # ... other children
]

# Usage
{:ok, _pid} = MyApp.ChatSession.start_link(user.id)
{:ok, response} = Openrouter.ConversationServer.send_message(
  {:via, Registry, {MyApp.Registry, {MyApp.ChatSession, user.id}}},
  "Hello!"
)

Testing

The library includes 1700+ tests covering:

  • Unit tests: All modules with comprehensive coverage
  • Integration tests: Real API calls (with Reqord support for record/replay)

Run tests:

# Unit tests only (no API key needed)
mix test

# Integration tests (requires API key)
OPENROUTER_API_KEY=your_key mix test --only integration

# Specific test suites
mix test --only chat
mix test --only streaming
mix test --only structured_outputs
mix test --only tool_calling
mix test --only conversation

Using Reqord for Record/Replay

# Record API interactions
REQORD_MODE=record OPENROUTER_API_KEY=your_key mix test --only integration

# Replay recorded interactions (no API key needed)
REQORD_MODE=replay mix test --only integration

Configuration

# config/config.exs
config :openrouter,
  api_key: System.get_env("OPENROUTER_API_KEY"),
  base_url: "https://openrouter.ai/api/v1",
  default_model: "anthropic/claude-3.5-sonnet",
  app_name: "my-app",  # Optional: for OpenRouter tracking
  site_url: "https://myapp.com"  # Optional

# config/runtime.exs (recommended for production)
config :openrouter,
  api_key: System.fetch_env!("OPENROUTER_API_KEY")

Architecture & Design

This library is heavily inspired by Pydantic AI and adopts many of its best patterns:

  • Dependency Injection via RunContext - Type-safe context passing
  • Generic Agent Types - Agents typed over dependencies
  • Structured Outputs - Validation with automatic retry
  • Progressive Disclosure - Simple for basic use, powerful for advanced
  • Testing-First - Easy mocking with behaviors
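The "easy mocking with behaviors" point follows the standard Elixir pattern: production code depends on a behaviour, and tests swap in a double via configuration. A minimal illustration (module names here are hypothetical, not the SDK's actual behaviour modules):

```elixir
# The contract that both the real client and any test double implement
defmodule LLMClient do
  @callback chat(String.t(), keyword()) :: {:ok, map()} | {:error, term()}
end

defmodule FakeLLMClient do
  @behaviour LLMClient

  # Canned response: no network, no API key, safe for CI
  @impl true
  def chat(_prompt, _opts), do: {:ok, %{content: "stubbed reply"}}
end

# Production code would resolve the module from config, e.g.
#   client = Application.get_env(:my_app, :llm_client, Openrouter)
{:ok, response} = FakeLLMClient.chat("Hello", model: "openai/gpt-4")
response.content  # => "stubbed reply"
```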

See DESIGN.md for the complete design document.

Development Status

✅ Phase 1: Core Foundation - COMPLETE

  • Core types (Message, Response, Error, Usage)
  • HTTP layer with Req
  • OpenRouter provider
  • Basic streaming
  • Configuration & validation

✅ Phase 2: Structured Outputs - COMPLETE

  • Ecto schema integration
  • JSON schema generation
  • Automatic validation
  • Retry with error feedback
  • Complex type support

✅ Phase 3: Production Features - COMPLETE

  • Multimodal content (images, video, PDFs)
  • Retry logic with exponential backoff
  • Comprehensive telemetry
  • Content builders
  • Production observability

✅ Phase 4: Agentic Workflows - COMPLETE

  • RunContext & dependency injection
  • Tool/function calling
  • Agent framework with automatic execution
  • Conversation management (stateless)
  • ConversationServer (stateful GenServer)

🚧 Phase 5: Polish & Documentation - IN PROGRESS

  • Comprehensive documentation
  • More examples
  • Performance optimization
  • Production guides

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for your changes
  4. Ensure all tests pass
  5. Submit a pull request

Roadmap

  • ✅ More examples (RAG, web search, multi-agent, Phoenix LiveView)
  • ✅ Cost tracking and budgeting
  • ✅ Token counting utilities
  • ✅ Prompt template management
  • ✅ Response and embedding caching
  • Production deployment guide
  • Additional persistence backends (Postgres, Mnesia)
  • Performance benchmarks
  • Prompt caching optimization (OpenRouter native)
  • More model provider support (Ollama, local models)
  • Circuit breaker and rate limiting utilities

License

MIT License - see LICENSE for details

Credits

Inspired by Pydantic AI and FastAPI. Built with Req, Finch, and Ecto.
