Langfuse + Ollama (Local LLM Evaluation Setup)

This repository provides a local, free, self-hosted setup for LLM evaluation using Langfuse and Ollama.

The goal is to:

Experiment with prompts
Compare multiple LLMs
Run evaluations (scores, LLM-as-judge, datasets)
Keep everything local and offline

This setup uses the official Langfuse OSS (version v3.148.0) with a local Ollama instance via the OpenAI-compatible API.

No Langfuse code changes, forks, or plugins are required.

This follows the recommended integration approach documented in: https://github.com/langfuse/langfuse

PREREQUISITES

Docker Desktop (WSL2 enabled on Windows)
16 GB RAM recommended
Optional: NVIDIA GPU (for faster inference)

STARTING THE STACK

From the project root directory, run:

docker compose up -d

This starts:

Langfuse (UI + workers)
PostgreSQL, ClickHouse, Redis, MinIO
Ollama (CPU-only by default)

Tip: the docker-compose.yml file includes an ollama service by default. If you already have Ollama installed natively on your PC and want to use that instead, simply comment out the ollama service block in docker-compose.yml (see example further below) and make sure your native Ollama is running and listening on port 11434. Only one instance may bind to the port at a time.

ACCESS LANGFUSE

Open your browser:

http://localhost:3000

First-time setup:

Create a user
Create an organization
Create a project

All Langfuse data is stored locally in Docker volumes.

OLLAMA (LOCAL LLM RUNTIME)

Ollama runs inside Docker and exposes an OpenAI-compatible API.

If you're using the container, other services (like Langfuse) can reach it via the hostname http://ollama:11434/v1. When using a native installation, point at http://host.docker.internal:11434/v1 instead and make sure the Docker service is commented out or stopped.

You can interact with Ollama in TWO ways:

Simple host command (recommended for most users)
Explicit Docker command (useful if you also have native Ollama)

Both are shown below.

LIST AVAILABLE MODELS

Simple command: ollama list

Docker-specific command: docker exec -it ollama ollama list

NOTE:

If you have native Ollama installed AND Docker Ollama running, the docker command guarantees you are talking to the container.

PULL A MODEL (EXAMPLES)

Recommended models for evaluation:

Gemma 3 (4B) – primary eval / judge: docker exec -it ollama ollama pull gemma3:4b

Mistral 7B – comparison model: docker exec -it ollama ollama pull mistral

LLaMA 3.2 (3B) – reasoning comparison: docker exec -it ollama ollama pull llama3.2:3b

After pulling, verify:

ollama list or docker exec -it ollama ollama list

CONNECT OLLAMA TO LANGFUSE (UI)

In Langfuse UI:

Settings → LLM Connections → Add LLM Connection

Fill in exactly:

LLM adapter: openai

Provider name: OpenAI

API Key: ollama

API Base URL: http://ollama:11434/v1

Alternative: if you run into networking issues from within containers, use http://host.docker.internal:11434/v1 which resolves back to your host.
(use http://localhost:11434/v1 when pointing Langfuse at a native Ollama install.)

Custom models: Add model name as in Ollama: For example, gemma3:1b

Save the connection.

You can now select Ollama models in:

Prompt Playground
Evaluations
Datasets
LLM-as-Judge

NOTE

Ollama must be running and reachable from Langfuse
The model must already be pulled in Ollama
Model names must match exactly (case-sensitive)
This works for both CPU-only and GPU-enabled Ollama
No external scripts or SDKs are required for UI usage

PROMPT PLAYGROUND TRACING

By default, Prompt Playground runs are NOT traced.

To enable tracing:

Go to Settings → Tracing
Enable "Trace Prompt Playground runs"
Save

This allows:

Viewing traces
Running evaluations
Adding runs to datasets

OPTIONAL: ENABLE GPU SUPPORT (NVIDIA ONLY)

GPU support is DISABLED by default.

Requirements:

NVIDIA GPU
Latest NVIDIA drivers
Docker Desktop (WSL2)
NVIDIA Container Toolkit installed

Steps:

Open docker-compose.yml
Find the ollama service
Uncomment the GPU block
Restart Docker services:

docker compose down docker compose up -d

VERIFY GPU USAGE

Run:

docker exec -it ollama ollama ps

If GPU is enabled, GPU-related information will appear. If running on CPU, this command still works but shows CPU usage.

IMPORTANT NOTES

CPU-only mode is sufficient for learning and evaluations
GPU is recommended for larger models (7B+)
Models are NOT auto-downloaded
Each user chooses which models to pull
No prompts or data leave your machine

REPOSITORY PHILOSOPHY

Infrastructure is versioned in Git
Runtime data and models are local
No secrets are committed
Users have freedom to experiment
This is a learning & evaluation environment, not production

COMPATIBILITY

Langfuse OSS: v3.148.0
Ollama: OpenAI-compatible API
Deployment: Local Docker
Licensing: Fully open source

Name		Name	Last commit message	Last commit date
Latest commit History 6,128 Commits
.claude		.claude
.cursor		.cursor
.devcontainer		.devcontainer
.github		.github
.husky		.husky
ee		ee
fern		fern
packages		packages
patches		patches
scripts		scripts
web		web
worker		worker
.codespellrc		.codespellrc
.dockerignore		.dockerignore
.env.dev-azure.example		.env.dev-azure.example
.env.dev-redis-cluster.example		.env.dev-redis-cluster.example
.env.dev.example		.env.dev.example
.env.prod.example		.env.prod.example
.env.test.example		.env.test.example
.gitignore		.gitignore
.npmrc		.npmrc
.nvmrc		.nvmrc
.prettierignore		.prettierignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.cn.md		README.cn.md
README.ja.md		README.ja.md
README.kr.md		README.kr.md
README.md		README.md
REVIEW.md		REVIEW.md
SECURITY.md		SECURITY.md
docker-compose.build.yml		docker-compose.build.yml
docker-compose.dev-azure.yml		docker-compose.dev-azure.yml
docker-compose.dev-redis-cluster.yml		docker-compose.dev-redis-cluster.yml
docker-compose.dev.yml		docker-compose.dev.yml
docker-compose.yml		docker-compose.yml
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
prettier.config.cjs		prettier.config.cjs
turbo.json		turbo.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Langfuse + Ollama (Local LLM Evaluation Setup)

PREREQUISITES

STARTING THE STACK

ACCESS LANGFUSE

OLLAMA (LOCAL LLM RUNTIME)

LIST AVAILABLE MODELS

PULL A MODEL (EXAMPLES)

CONNECT OLLAMA TO LANGFUSE (UI)

PROMPT PLAYGROUND TRACING

OPTIONAL: ENABLE GPU SUPPORT (NVIDIA ONLY)

VERIFY GPU USAGE

IMPORTANT NOTES

REPOSITORY PHILOSOPHY

COMPATIBILITY

END

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Langfuse + Ollama (Local LLM Evaluation Setup)

PREREQUISITES

STARTING THE STACK

ACCESS LANGFUSE

OLLAMA (LOCAL LLM RUNTIME)

LIST AVAILABLE MODELS

PULL A MODEL (EXAMPLES)

CONNECT OLLAMA TO LANGFUSE (UI)

PROMPT PLAYGROUND TRACING

OPTIONAL: ENABLE GPU SUPPORT (NVIDIA ONLY)

VERIFY GPU USAGE

IMPORTANT NOTES

REPOSITORY PHILOSOPHY

COMPATIBILITY

END

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages