ory/lumen: Cut Claude Code costs and latency with precise, local semantic search across your entire codebase. Best for teams working on large codebases. Enterprise-ready and compliant. No subscription, payment, or sign-up needed.

Claude reads entire files to find what it needs. Lumen gives it a map.

Lumen is a 100% local semantic code search engine for AI coding agents. It needs no API keys, no cloud, and no external database: just open-source embedding models served by Ollama or LM Studio, SQLite, and your CPU. It ships as a single static binary with no runtime required.

The payoff is measurable: tasks complete 2.1–2.3× faster and API costs drop by 63–81%, with reproducible benchmarks and answers that a blind judge preferred in 5 of 5 comparisons.

| Metric | With lumen | Baseline (no MCP) |
| --- | --- | --- |
| Task completion | 2.1-2.3x faster | baseline |
| API cost | 63-81% cheaper | baseline |
| Answer quality (blind judge) | 5/5 wins | 0/5 wins |

Demo

Claude Code asking about the Prometheus codebase. Lumen's semantic_search finds the relevant code without reading entire files.

Quick Start

Prerequisites:

Ollama installed and running, then pull the default embedding model:
```
ollama pull ordis/jina-embeddings-v2-base-code
```
Claude Code installed

Install:

```
/plugin marketplace add ory/claude-plugins
/plugin install lumen@ory
```

That's it. On first session start, Lumen:

Downloads the binary automatically from the latest GitHub release
Indexes your project in the background using Merkle tree change detection
Registers a semantic_search MCP tool that Claude uses automatically

Two skills are also available: /lumen:doctor (health check) and /lumen:reindex (forced re-indexing).

What You Get

Semantic vector search — Claude finds relevant functions, types, and modules by meaning, not keyword matching
Auto-indexing — indexes on session start, only re-processes changed files via Merkle tree diffing
Incremental updates — re-indexes only what changed; large codebases re-index in seconds after the first run
12 language families — Go, Python, TypeScript, JavaScript, Rust, Ruby, Java, PHP, C/C++, Markdown, YAML, JSON
Zero cloud — embeddings stay on your machine; no data leaves your network
Ollama and LM Studio — works with either local embedding backend
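
The Merkle tree diffing mentioned in the list above can be illustrated with a minimal sketch. This is not Lumen's implementation: the function names `build_tree` and `changed_files`, the use of SHA-256, and the in-memory `{path: bytes}` input are all assumptions made for illustration. The idea it shows is that a directory's hash depends on its children, so an unchanged subtree can be skipped with a single comparison.

```python
import hashlib


def build_tree(files):
    """Return (nodes, children) for a {relative_path: content_bytes} mapping.

    nodes maps every file and directory path to a hex digest ("" is the root).
    children maps each directory to the set of paths directly inside it.
    """
    nodes = {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}
    children = {"": set()}
    for path in files:
        while path:
            parent = path.rpartition("/")[0]
            children.setdefault(parent, set()).add(path)
            path = parent
    # Hash the deepest directories first so child digests exist before parents need them.
    by_depth = sorted(children, key=lambda d: d.count("/") + 1 if d else 0, reverse=True)
    for directory in by_depth:
        h = hashlib.sha256()
        for child in sorted(children[directory]):
            h.update(f"{child}\0{nodes[child]}\n".encode())
        nodes[directory] = h.hexdigest()
    return nodes, children


def changed_files(old_nodes, new_nodes, new_children, path=""):
    """List files that are new or modified, skipping subtrees whose hash is unchanged.

    Deleted files are not reported; a real indexer would also need to drop their chunks.
    """
    if old_nodes.get(path) == new_nodes[path]:
        return []  # identical digest: nothing below this node changed
    if path not in new_children:
        return [path]  # a file that is new or whose content changed
    out = []
    for child in sorted(new_children[path]):
        out.extend(changed_files(old_nodes, new_nodes, new_children, child))
    return out


before = {"main.go": b"package main", "internal/a.go": b"a", "internal/b.go": b"b"}
after = {**before, "internal/b.go": b"b2", "cmd/x.go": b"x"}
old_nodes, _ = build_tree(before)
new_nodes, new_children = build_tree(after)
print(changed_files(old_nodes, new_nodes, new_children))  # ['cmd/x.go', 'internal/b.go']
```

Only the two reported files would need re-chunking and re-embedding; `main.go` and `internal/a.go` are never visited beyond a hash comparison.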

How It Works

Lumen sits between your codebase and Claude as an MCP server. When a session starts, it walks your project and builds a Merkle tree over file hashes, so only changed files get re-chunked and re-embedded. Each file is split into semantic chunks (functions, types, methods) using Go's native AST parser for Go files and tree-sitter grammars for all other languages. Chunks are embedded and stored in SQLite with sqlite-vec, and retrieved with cosine-distance KNN search.

Files → semantic chunks → vector embeddings → SQLite/sqlite-vec → KNN search
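
The last stage of that pipeline, cosine-distance KNN, can be sketched in a few lines. In Lumen this step is handled by sqlite-vec; the brute-force version below does not use the sqlite-vec API and is only meant to show what the ranking computes. The chunk labels and the 2-dimensional toy vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions).

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means same direction, 1.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def knn(query, chunks, k=3):
    """Return the k (label, distance) pairs closest to query.

    chunks is a list of (label, vector) pairs with the same dimension as query.
    """
    scored = sorted((cosine_distance(query, vec), label) for label, vec in chunks)
    return [(label, dist) for dist, label in scored[:k]]


# Hypothetical chunk labels with toy embeddings.
chunks = [
    ("auth.Login", [1.0, 0.0]),
    ("db.Connect", [0.0, 1.0]),
    ("auth.Logout", [0.9, 0.1]),
]
for label, dist in knn([1.0, 0.0], chunks, k=2):
    print(f"{label}: {dist:.4f}")
```

A query vector pointing in the same direction as `auth.Login` ranks that chunk first and the nearby `auth.Logout` second, while the orthogonal `db.Connect` falls outside the top two.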

When Claude needs to understand code, it calls semantic_search instead of reading entire files. The index is stored outside your repo (~/.local/share/lumen/<hash>/index.db), keyed by project path and model name — different models never share an index.

Benchmarks

Key results (Ollama, jina-embeddings-v2-base-code, Go fixture):

| Model | Speedup | Cost Savings | Quality |
| --- | --- | --- | --- |
| Sonnet 4.6 | 2.2x faster | 63% cheaper | 5/5 MCP wins |
| Opus 4.6 | 2.1x faster | 81% cheaper | 5/5 MCP wins |

Results hold across LM Studio (nomic-embed-code) and across Go, Python, and TypeScript in extended multi-model benchmarks.

See docs/BENCHMARKS.md for full speed/cost tables, answer quality breakdowns, per-language results across 4 embedding models, and reproduce instructions.

Supported Languages

Supports 12 language families with semantic chunking:

| Language | Parser | Extensions | Status |
| --- | --- | --- | --- |
| Go | Native AST | `.go` | Optimized: 3.8x faster, 90% cheaper |
| Python | tree-sitter | `.py` | Tested: 1.8x faster, 72% cheaper |
| TypeScript / TSX | tree-sitter | `.ts`, `.tsx` | Tested: 1.4x faster, 48% cheaper |
| JavaScript / JSX | tree-sitter | `.js`, `.jsx`, `.mjs` | Supported |
| Rust | tree-sitter | `.rs` | Supported |
| Ruby | tree-sitter | `.rb` | Supported |
| Java | tree-sitter | `.java` | Supported |
| PHP | tree-sitter | `.php` | Supported |
| C / C++ | tree-sitter | `.c`, `.h`, `.cpp`, `.cc`, `.cxx`, `.hpp` | Supported |
| Markdown / MDX | tree-sitter | `.md`, `.mdx` | Supported |
| YAML | tree-sitter | `.yaml`, `.yml` | Supported |
| JSON | tree-sitter | `.json` | Supported |

Go uses the native Go AST parser for the most precise chunks. All other languages use tree-sitter grammars.

Note: Go is the best-supported language. Other languages work via tree-sitter but may benefit from improved chunking strategies.
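
To make "semantic chunking" concrete, here is a minimal sketch of the idea using Python's standard `ast` module on Python source. It is an analogy only: Lumen uses Go's native AST parser for Go and tree-sitter for everything else, and the function name `semantic_chunks` and the tuple layout are assumptions made for this example. The point is that chunk boundaries follow declarations (functions, types, methods) rather than fixed line counts.

```python
import ast


def semantic_chunks(source):
    """Return (kind, name, start_line, end_line, text) for top-level defs and methods."""
    lines = source.splitlines()
    chunks = []

    def add(node, kind, name):
        # lineno/end_lineno are 1-based and inclusive (end_lineno needs Python 3.8+).
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        chunks.append((kind, name, node.lineno, node.end_lineno, text))

    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.parse(source).body:
        if isinstance(node, funcs):
            add(node, "function", node.name)
        elif isinstance(node, ast.ClassDef):
            add(node, "type", node.name)
            for item in node.body:
                if isinstance(item, funcs):
                    add(item, "method", f"{node.name}.{item.name}")
    return chunks


example = '''import os

def load(path):
    return open(path).read()

class Index:
    def search(self, query):
        return []
'''
for kind, name, start, end, _ in semantic_chunks(example):
    print(kind, name, f"L{start}-L{end}")
```

Each chunk carries its own name and line range, which is what lets a search result point at one function instead of a whole file.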

Configuration

All configuration is via environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `LUMEN_EMBED_MODEL` | see note ¹ | Embedding model (must be in registry) |
| `LUMEN_BACKEND` | `ollama` | Embedding backend (`ollama` or `lmstudio`) |
| `OLLAMA_HOST` | `localhost:11434` | Ollama server URL |
| `LM_STUDIO_HOST` | `localhost:1234` | LM Studio server URL |
| `LUMEN_MAX_CHUNK_TOKENS` | `512` | Max tokens per chunk before splitting |

¹ ordis/jina-embeddings-v2-base-code (Ollama), nomic-ai/nomic-embed-code-GGUF (LM Studio)
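
The defaults above, including the backend-dependent model default from note ¹, can be summarized as a small resolution function. This is a sketch of how the documented variables and defaults fit together, not Lumen's actual code: `resolve_config`, the returned dictionary keys, and the error raised for an unknown backend are assumptions.

```python
import os

# Default embedding model per backend, taken from note ¹ above.
DEFAULT_MODELS = {
    "ollama": "ordis/jina-embeddings-v2-base-code",
    "lmstudio": "nomic-ai/nomic-embed-code-GGUF",
}


def resolve_config(env=None):
    """Resolve settings from environment variables, falling back to documented defaults."""
    env = os.environ if env is None else env
    backend = env.get("LUMEN_BACKEND", "ollama")
    if backend not in DEFAULT_MODELS:
        raise ValueError(f"LUMEN_BACKEND must be 'ollama' or 'lmstudio', got {backend!r}")
    return {
        "backend": backend,
        "model": env.get("LUMEN_EMBED_MODEL", DEFAULT_MODELS[backend]),
        "ollama_host": env.get("OLLAMA_HOST", "localhost:11434"),
        "lm_studio_host": env.get("LM_STUDIO_HOST", "localhost:1234"),
        "max_chunk_tokens": int(env.get("LUMEN_MAX_CHUNK_TOKENS", "512")),
    }


# Passing a dict instead of os.environ makes the behaviour easy to inspect.
print(resolve_config({"LUMEN_BACKEND": "lmstudio"})["model"])
```

Setting only `LUMEN_BACKEND=lmstudio` is enough to switch the default model as well, since the model default follows the backend.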

Supported Embedding Models

Dimensions and context length are configured automatically per model:

| Model | Backend | Dims | Context | Recommended |
| --- | --- | --- | --- | --- |
| `ordis/jina-embeddings-v2-base-code` | Ollama | 768 | 8192 | Best default: lowest cost, no over-retrieval |
| `qwen3-embedding:8b` | Ollama | 4096 | 40960 | Best quality: strongest dominance (7/9 wins), very slow indexing |
| `nomic-ai/nomic-embed-code-GGUF` | LM Studio | 3584 | 8192 | Usable: good quality, but TypeScript over-retrieval raises costs |
| `qwen3-embedding:4b` | Ollama | 2560 | 40960 | Not recommended: highest costs, severe TypeScript over-retrieval |
| `nomic-embed-text` | Ollama | 768 | 8192 | Untested |
| `qwen3-embedding:0.6b` | Ollama | 1024 | 32768 | Untested |
| `all-minilm` | Ollama | 384 | 512 | Untested |

Switching models creates a separate index automatically. The model name is part of the database path hash, so different models never collide.
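
As a rough illustration of what "configured automatically per model" means, the table above can be read as a registry keyed by model name. The sketch below is hypothetical: `ModelSpec`, `model_spec`, and `vector_bytes` are names invented here, and the 4-bytes-per-dimension figure assumes float32 storage, which this README does not state. It shows why an unregistered model name is rejected and why higher-dimensional models produce larger indexes.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    backend: str
    dims: int
    context: int


# Values copied from the table above.
REGISTRY = {
    "ordis/jina-embeddings-v2-base-code": ModelSpec("ollama", 768, 8192),
    "qwen3-embedding:8b": ModelSpec("ollama", 4096, 40960),
    "nomic-ai/nomic-embed-code-GGUF": ModelSpec("lmstudio", 3584, 8192),
    "qwen3-embedding:4b": ModelSpec("ollama", 2560, 40960),
    "nomic-embed-text": ModelSpec("ollama", 768, 8192),
    "qwen3-embedding:0.6b": ModelSpec("ollama", 1024, 32768),
    "all-minilm": ModelSpec("ollama", 384, 512),
}


def model_spec(name):
    """Look up a model; unknown names fail early instead of guessing dimensions."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown embedding model {name!r}") from None


def vector_bytes(name, bytes_per_dim=4):
    """Raw size of one embedding, assuming float32 (4 bytes per dimension)."""
    return model_spec(name).dims * bytes_per_dim


print(vector_bytes("ordis/jina-embeddings-v2-base-code"))  # 3072
print(vector_bytes("qwen3-embedding:8b"))  # 16384
```

Under that float32 assumption, each chunk embedded with `qwen3-embedding:8b` takes a bit more than five times the vector storage of the default model.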

Database Location

Index databases are stored outside your project:

~/.local/share/lumen/<hash>/index.db

Where <hash> is derived from the absolute project path and embedding model name. No files are added to your repo, no .gitignore modifications needed.
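
A minimal sketch of how such a path could be derived is shown below. The exact hash function, separator, and digest length Lumen uses are not documented here, so SHA-256, the NUL separator, the 16-character truncation, and the names `index_db_path` and `data_home` are all assumptions. What the sketch does reflect is the documented behaviour: the same project and model always map to the same database, and changing either one yields a different directory.

```python
import hashlib
import os
from pathlib import Path


def index_db_path(project_dir, model, data_home=None):
    """Build <data_home>/lumen/<hash>/index.db from the project path and model name."""
    base = Path(data_home) if data_home else Path.home() / ".local" / "share"
    project = os.path.abspath(project_dir)
    # Hypothetical derivation: hash the absolute path and model name together.
    digest = hashlib.sha256(f"{project}\0{model}".encode()).hexdigest()[:16]
    return base / "lumen" / digest / "index.db"


jina = index_db_path(".", "ordis/jina-embeddings-v2-base-code")
qwen = index_db_path(".", "qwen3-embedding:8b")
print(jina)
print(qwen)
print(jina != qwen)  # True: each model gets its own index
```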

You can safely delete the entire lumen directory to clear all indexes, or use lumen purge to do it automatically.

CLI Reference

Download the binary from the GitHub releases page or let the plugin install it automatically.

```
lumen help
```

Troubleshooting

Ollama not running / "connection refused"

Start Ollama and verify the model is pulled:

```
ollama serve
ollama pull ordis/jina-embeddings-v2-base-code
```

Run /lumen:doctor inside Claude Code to confirm connectivity.
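
If you want a quick check outside Claude Code, a plain TCP probe against the configured host tells you whether anything is listening. The sketch below is not part of Lumen and `can_connect` is a name invented here; it only tests reachability of the address (default `localhost:11434` from the configuration table) and says nothing about whether the embedding model has been pulled.

```python
import socket


def can_connect(host_port="localhost:11434", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    address = host_port.split("://", 1)[-1]  # tolerate an optional scheme prefix
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError covers "connection refused", timeouts, and DNS failures;
        # ValueError covers a missing or non-numeric port.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", can_connect())
```

A `False` result corresponds to the "connection refused" case above and usually means `ollama serve` is not running or `OLLAMA_HOST` points elsewhere.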

Stale index after large refactor

Run /lumen:reindex inside Claude Code to force a full re-index, or:

```
lumen purge && lumen index .
```

Switching embedding models

Set LUMEN_EMBED_MODEL to a model from the supported table above. Each model gets its own database; the old index is not deleted automatically.

Slow first indexing

The first run embeds every file. Subsequent runs only process changed files (typically a few seconds). For large projects (100k+ lines), first indexing can take several minutes — this is a one-time cost.

Development

```
git clone https://github.com/ory/lumen.git
cd lumen

# Build (CGO required for sqlite-vec)
make build

# Run tests
make test

# Run linter
make lint

# Load as a Claude Code plugin from source
make plugin-dev
```

See CLAUDE.md for architecture details, design decisions, and contribution guidelines.
