
Add Ollama provider with model lifecycle management #24

@djthorpe

Description


Summary

Implement an Ollama provider (pkg/provider/ollama) for local model inference, including full model lifecycle management (download, load, unload, delete).

Requirements

Core Provider

  • Implement the llm.Client interface for the Ollama API (default endpoint http://localhost:11434)
  • Endpoint configurable via OLLAMA_ENDPOINT environment variable or CLI flag
  • No API key required (local service)
  • Support streaming and non-streaming chat completions
  • Support tool/function calling (Ollama supports this for compatible models)
  • Support embeddings

Model Management

  • Pull/Download: Pull models from the Ollama registry (equivalent to ollama pull)
  • List: List locally available models with size, quantization, and modification date
  • Load: Load a model into memory (warm up for faster inference)
  • Unload: Unload a model from memory to free GPU/RAM
  • Delete: Remove a model from local storage
  • Show: Get model details (parameters, template, license, system prompt)
  • Expose model management operations via the API (new endpoints or extend existing model API)

API Endpoints

  • POST /api/model/pull — pull a model by name/tag
  • POST /api/model/{name}/load — load model into memory
  • POST /api/model/{name}/unload — unload model from memory
  • DELETE /api/model/{name} — delete a model
  • Pull progress should be streamable (SSE or chunked response) for UI feedback

Models

  • Any model available in the Ollama registry (llama, mistral, gemma, phi, qwen, etc.)
  • Model names use the Ollama format: model:tag (e.g. llama3.2:latest, mistral:7b-instruct-q4_0)

Notes

Motivation

Ollama enables local/private model inference with no API costs. Model management is a key differentiator — users need to download, load, and manage models on their GPU servers. This complements the cloud providers (Gemini, Anthropic, Mistral) with self-hosted options.

