CLI RAG tool for your docs. Index 30+ document formats (markdown, PDF, DOCX, EPUB, HTML, ODT, RTF, plain text, email, and more), hybrid search (semantic + keyword), ask questions and get sourced answers. Built on sqlite-vec.
- Hybrid search — vector similarity + FTS5 keyword search, fused with Reciprocal Rank Fusion (with rank bonuses)
- HyDE best-of-two — generates a hypothetical answer passage, embeds both it and the raw query, keeps whichever vec result set is better (local via transformers or LLM API; enabled by default; it can only help, never hurt)
- Keyword-only search — `kb fts` for instant BM25 results with zero API cost (truncated filepath matches weighted 10x, headings 2x)
- Heading-aware chunking — markdown split by heading hierarchy, each chunk carries ancestry
- Incremental indexing — content-hash per chunk, only re-embeds changes
- Query expansion — generates keyword synonyms (for FTS) and semantic rephrasings (for vector search) via local FLAN-T5 or LLM, fuses all result lists with multi-list weighted RRF (`--expand`)
- Reranking — `kb ask` over-fetches candidates, reranks by relevance (local cross-encoder or LLM), keeps the best
- Pre-search filters — file globs, document type, tags, date ranges, keyword inclusion/exclusion
- Document tagging — manual tags via `kb tag`, auto-parsed from markdown frontmatter
- Similar documents — find related docs using stored embeddings (no API call)
- 30+ formats — markdown, PDF, DOCX, PPTX, XLSX, EPUB, HTML, ODT, ODS, ODP, RTF, email (.eml), subtitles (.srt/.vtt), and plain text variants (.txt, .rst, .org, .csv, .json, .yaml, .tex, etc.)
- Optional code indexing — set `index_code = true` to also index source code files (.py, .js, .ts, .go, .rs, etc.)
- Local or API embeddings — local via `ibm-granite/granite-embedding-english-r2` (sentence-transformers, no API cost, fully offline, auto-detected dims) or OpenAI API — config-driven switch
- Pluggable chunking — uses chonkie when available, regex fallback otherwise
- MCP server — expose kb as tools for Claude Desktop, Claude Code, and other MCP clients
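The HyDE best-of-two idea above can be sketched in a few lines. This is an illustrative shape, not kb's actual internals — `generate_passage`, `embed`, `vec_search`, and the quality heuristic are all hypothetical names:

```python
# Sketch of HyDE best-of-two: embed both the raw query and a generated
# hypothetical passage, run vector search with each, keep whichever
# result set scores better. Helper names are hypothetical.

def quality(results, top_n=5):
    """Score a result set by the mean similarity of its top hits."""
    top = [sim for _, sim in results[:top_n]]
    return sum(top) / len(top) if top else 0.0

def hyde_best_of_two(query, generate_passage, embed, vec_search):
    """Return the stronger of the two vector-search result sets."""
    candidates = [embed(query)]
    passage = generate_passage(query)   # hypothetical answer text
    if passage:
        candidates.append(embed(passage))
    best = []
    for vec in candidates:
        results = vec_search(vec)       # [(doc_id, similarity), ...]
        if quality(results) > quality(best):
            best = results
    return best
```

Because the raw-query results are always in the running, the hypothetical passage can only displace them by scoring strictly better — hence "can only help, never hurt".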
# One-liner (installs uv if needed)
curl -LsSf https://github.com/ariel-frischer/kb/raw/main/install.sh | sh
# Or with uv directly (all optional deps: PDF, Office, RTF, chunking)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[all]"
# Minimal (markdown, HTML, plain text, email, EPUB, ODT — no extra deps)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" kb
# Pick extras individually
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[pdf]" # + PDF
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[office]" # + DOCX, PPTX, XLSX
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[rtf]" # + RTF
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[local-embed]" # + local embeddings (Granite R2, no API cost)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[rerank]" # + local cross-encoder reranking
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[expand]" # + local query expansion (FLAN-T5)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[local-llm]" # + local HyDE generation (transformers + torch)

Runs fully local — no API keys required. Set `embed_method = "local"` in config (see Configuration) and use local backends for HyDE (`hyde_method = "local"`), reranking (`rerank_method = "cross-encoder"`), and query expansion (`expand_method = "local"`). Only `kb ask` needs an LLM for the final answer — point it at a local model via Ollama or similar.
For cloud, the defaults work with any OpenAI-compatible API. Set OPENAI_API_KEY in your environment (or in ~/.config/kb/secrets.toml). Recommended cloud models: text-embedding-3-small for embeddings, gpt-4o-mini for chat/ask. Works with any provider that speaks the OpenAI API — set OPENAI_BASE_URL to point at Ollama, LiteLLM, vLLM, etc.
# 1. Initialize (global — indexes across repos/folders)
kb init
# 2. Add source directories
kb add ~/notes ~/docs ~/repos/my-project/docs
# 3. Index
kb index
# 4. Search (hybrid: semantic + keyword)
kb search "deployment patterns"
# 5. Quick keyword search (instant, no API cost)
kb fts "deployment patterns"
# 6. Ask (RAG: search → rerank → answer)
kb ask "what are the recommended deployment patterns?"
# 7. List indexed documents
kb list
# 8. Check what's indexed
kb stats

kb init Create global config (~/.config/kb/)
kb init --project Create project-local .kb.toml in current directory
kb add <dir> [dir...] Add source directories
kb remove <dir> [dir...] Remove source directories
kb sources List configured sources
kb index [DIR...] Index sources from config (or explicit dirs)
kb search "query" [k] [--threshold N] [--expand] [--json|--csv|--md] Hybrid search (default k=5)
kb fts "query" [k] [--json|--csv|--md] Keyword-only search (instant, no API cost)
kb ask "question" [k] [--threshold N] [--expand] [--json|--csv|--md] RAG answer (default k=8, BM25 shortcut when confident)
kb list Summary of indexed documents by type
kb list --full List every indexed document with metadata
kb similar <file> [k] Find similar documents (no API call, default k=10)
kb tag <file> tag1 [tag2...] Add tags to a document
kb untag <file> tag1 [tag2...] Remove tags from a document
kb tags List all tags with document counts
kb stats Show index stats + capabilities
kb reset Drop DB and start fresh
kb version Show version (also: kb v, kb --version)
kb mcp Start MCP server (for Claude Desktop / AI agents)
kb completion <shell> Output shell completions (zsh, bash, fish)
# Zsh (add to ~/.zshrc)
eval "$(kb completion zsh)"
# Bash (add to ~/.bashrc)
eval "$(kb completion bash)"
# Fish (add to ~/.config/fish/config.fish)
kb completion fish | source

kb init creates ~/.config/kb/config.toml. Database lives at ~/.local/share/kb/kb.db. Sources are absolute paths, managed with kb add / kb remove.
kb init --project creates .kb.toml in the current directory (found by walking up from cwd, like .gitignore). Database lives at ~/.local/share/kb/projects/<hash>/kb.db — no database files in the project directory. Sources are relative to the config file. Project config takes precedence over global when both exist.
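The "found by walking up from cwd, like .gitignore" discovery can be sketched as follows — illustrative only, not kb's actual source; `find_config` is a hypothetical name:

```python
# Sketch of project-config discovery: walk up from the current directory
# looking for .kb.toml; the caller falls back to the global config
# (~/.config/kb/config.toml) when nothing is found.
from pathlib import Path

def find_config(start):
    """Return the nearest .kb.toml walking up from `start`, else None."""
    for directory in [start, *start.parents]:
        candidate = directory / ".kb.toml"
        if candidate.is_file():
            return candidate   # project config takes precedence over global
    return None
```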
# Sources (absolute paths in global mode, relative in project mode)
sources = [
"/home/user/notes",
"/home/user/docs",
]
# All optional — defaults shown
# embed_method = "openai" # "openai" (API) or "local" (sentence-transformers, no API cost)
# embed_model = "text-embedding-3-small"
# embed_dims = 1536
# local_embed_model = "ibm-granite/granite-embedding-english-r2" # or "Snowflake/snowflake-arctic-embed-m-v1.5"
# chat_model = "gpt-4o-mini"
# max_chunk_chars = 2000
# search_threshold = 0.001 # min cosine similarity for `kb search` (also --threshold flag)
# ask_threshold = 0.001 # min cosine similarity for `kb ask` (also --threshold flag)
# rerank_fetch_k = 20
# rerank_top_k = 5
# rerank_method = "llm" # "llm" (RankGPT) or "cross-encoder" (local, no API cost)
# cross_encoder_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
# hyde_enabled = true # generate hypothetical passage before vector search
# hyde_model = "" # LLM for HyDE ("" = use chat_model)
# hyde_method = "llm" # "llm" (OpenAI API) or "local" (transformers, no API cost)
# hyde_local_model = "Qwen/Qwen3-0.6B" # HF model for local HyDE method
# query_expand = false # generate keyword + semantic query expansions (also --expand flag)
# expand_method = "local" # "local" (FLAN-T5) or "llm" (OpenAI API)
# expand_model = "google/flan-t5-small" # model for local expand method
# bm25_shortcut_min = 0.85 # min normalized BM25 for ask shortcut
# bm25_shortcut_gap = 0.02 # min gap vs second doc for ask shortcut
# index_code = false # set true to also index source code files

Drop a .kbignore in any source directory to exclude files from indexing. Uses fnmatch glob syntax with # comments.
Lookup order: checks <source-dir>/.kbignore, then <source-dir>/../.kbignore (first found wins).
# Skip directories (trailing slash)
drafts/
.obsidian/
node_modules/
# Skip file patterns
*.draft.md
WIP-*
CHANGELOG.md
See docs/kbignore.md for common patterns by use case.
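A minimal sketch of those .kbignore semantics — fnmatch globs, `#` comments, and trailing-slash directory patterns. The helper names are hypothetical, not kb's actual code:

```python
# Illustrative .kbignore matching: "#" lines are comments, patterns
# ending in "/" skip whole directories, everything else is an fnmatch
# glob tried against both the filename and the relative path.
from fnmatch import fnmatch

def load_patterns(text):
    lines = (ln.strip() for ln in text.splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]

def is_ignored(relpath, patterns):
    parts = relpath.split("/")
    for pat in patterns:
        if pat.endswith("/"):                 # directory pattern
            if pat.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pat) or fnmatch(relpath, pat):
            return True
    return False
```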
Optionally store secrets in ~/.config/kb/secrets.toml instead of environment variables:
openai_api_key = "sk-..."
# For Ollama / other providers:
# openai_base_url = "http://localhost:11434/v1"
# openai_api_key = "unused"

Keys are loaded as uppercase environment variables. Existing env vars take precedence.
Add inline with your query — stripped before embedding:
kb search 'file:articles/*.md cost optimization'
kb search 'type:pdf machine learning'
kb search 'tag:python tutorial basics'
kb search 'dt>"2026-02-01" recent developments'
kb search '+"docker" -"kubernetes" container setup'
kb ask 'file:briefs/*.pdf dt>"2026-02-13" what are the costs?'

| Filter | Syntax | Example |
|---|---|---|
| File glob | `file:<pattern>` | `file:articles/*.md` |
| Document type | `type:<type>` | `type:markdown`, `type:pdf` |
| Tag | `tag:<name>` | `tag:python` |
| After date | `dt>"YYYY-MM-DD"` | `dt>"2026-02-01"` |
| Before date | `dt<"YYYY-MM-DD"` | `dt<"2026-02-14"` |
| Must contain | `+"keyword"` | `+"docker"` |
| Must not contain | `-"keyword"` | `-"kubernetes"` |
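Stripping these filters before embedding can be sketched with a single regex over the syntaxes in the table above. The parser itself is illustrative, not kb's actual code:

```python
# Sketch of inline-filter extraction: pull file:/type:/tag:/date/keyword
# tokens out of the query, returning them plus the cleaned query text
# that actually gets embedded.
import re

FILTER_RE = re.compile(
    r'(file:\S+|type:\S+|tag:\S+|dt[<>]"\d{4}-\d{2}-\d{2}"|[+-]"[^"]+")'
)

def split_filters(query):
    """Return (filters, cleaned_query) with filter tokens stripped."""
    filters = FILTER_RE.findall(query)
    cleaned = FILTER_RE.sub("", query)
    return filters, " ".join(cleaned.split())
```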
Tag documents manually or let kb index auto-parse tags from markdown frontmatter:
---
tags: [python, tutorial]
---

kb tag docs/guide.md python tutorial # add tags manually
kb untag docs/guide.md tutorial # remove a tag
kb tags # list all tags with counts
kb search 'tag:python basics' # filter by tag in search

Always available (no extra deps):
| Category | Extensions |
|---|---|
| Markdown | .md, .markdown |
| Plain text | .txt, .text, .rst, .org, .log, .csv, .tsv, .json, .yaml, .yml, .toml, .xml, .ini, .cfg, .tex, .latex, .bib, .nfo, .adoc, .asciidoc, .properties |
| HTML | .html, .htm, .xhtml |
| Subtitles | .srt, .vtt |
| Email | .eml |
| OpenDocument | .odt, .ods, .odp |
| EPUB | .epub |
Optional (install with extras):
| Category | Extensions | Install |
|---|---|---|
| PDF | .pdf | kb[pdf] or kb[all] |
| Office | .docx, .pptx, .xlsx | kb[office] or kb[all] |
| RTF | .rtf | kb[rtf] or kb[all] |
Code files (opt-in): Set index_code = true in config to also index source code — .py, .js, .ts, .go, .rs, .java, .c, .cpp, and 60+ more extensions.
Run kb stats to see which formats are available in your installation.
kb index
1. Find files matching supported formats (respecting .kbignore)
2. Extract text (format-specific: markdown, PDF, DOCX, HTML, etc.)
3. Content-hash check — skip unchanged files
4. Chunk (chonkie or regex fallback)
5. Diff chunks by hash — only embed new/changed
6. Batch embed (local sentence-transformers or OpenAI API)
7. Store in sqlite-vec (vec0) + FTS5
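The content-hash diff in steps 3 and 5 is the heart of incremental indexing: only chunks whose hash is not already stored get re-embedded. A sketch with hypothetical shapes for the stored index (not kb's actual schema):

```python
# Sketch of hash-based chunk diffing: hash each chunk's text, compare
# against hashes already in the index, and embed only the new ones.
import hashlib

def chunk_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def diff_chunks(chunks, stored_hashes):
    """Return only chunks whose content hash is not already indexed."""
    return [c for c in chunks if chunk_hash(c) not in stored_hashes]
```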
kb search "query"
1. Parse filters, strip from query
2. HyDE best-of-two: embed both raw query + hypothetical passage, keep better vec results
3. [Expand]: generate keyword synonyms + semantic rephrasings (if --expand)
4. Vector search (vec0 cosine MATCH) + FTS5 keyword search (original + expansion queries)
5. Pre-filter by tagged chunk IDs if tag: filter active
6. Fuse with multi-list weighted RRF (primary 2x, expansions 1x)
7. Apply remaining filters, display results
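The fusion in step 6 can be sketched as multi-list weighted Reciprocal Rank Fusion. The 2x primary / 1x expansion weights come from the text above; k=60 is the conventional RRF constant and an assumption here:

```python
# Sketch of weighted RRF: each list contributes weight / (k + rank) per
# document; documents appearing in several lists accumulate score.
def weighted_rrf(result_lists, weights, k=60):
    """Fuse ranked lists of doc ids; highest fused score first."""
    scores = {}
    for docs, w in zip(result_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked in both the primary and an expansion list beats one that only tops a single list, which is the point of fusing rather than concatenating.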
kb fts "query"
1. Parse filters, strip from query
2. FTS5 keyword search (no embedding, weighted BM25: truncated filepath 10x, heading 2x)
3. Normalize BM25 scores
4. Apply filters
5. Display results (instant, zero API cost)
kb ask "question"
1. BM25 probe — dedup by document, if top doc is high-confidence, skip to step 7
2. HyDE best-of-two: embed both raw query + hypothetical passage, keep better vec results
3. [Expand]: generate keyword synonyms + semantic rephrasings (if --expand)
4. Same as search (with expansion), but over-fetch 20
5. Pre-filter by tagged chunk IDs, then apply remaining filters
6. Rerank → top 5 (cross-encoder or LLM)
7. Apply confidence threshold
8. LLM generates answer from context
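The step-1 shortcut decision can be sketched using the config defaults quoted above (bm25_shortcut_min = 0.85, bm25_shortcut_gap = 0.02); the function shape is illustrative, not kb's actual code:

```python
# Sketch of the BM25 shortcut: take the shortcut only when the top
# document's normalized BM25 score is high AND clearly ahead of the
# runner-up, so the full vector pipeline is skipped only on easy wins.
def take_bm25_shortcut(doc_scores, min_score=0.85, min_gap=0.02):
    """doc_scores: normalized BM25 per document, best first."""
    if not doc_scores:
        return False
    top = doc_scores[0]
    runner_up = doc_scores[1] if len(doc_scores) > 1 else 0.0
    return top >= min_score and (top - runner_up) >= min_gap
```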
kb includes an MCP server that exposes search and ask as tools for AI agents.
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, ~/.config/Claude/claude_desktop_config.json on Linux):
{
"mcpServers": {
"kb": {
"command": "kb-mcp"
}
}
}

Or with Claude Code:

claude mcp add kb kb-mcp

| Tool | Description |
|---|---|
| kb_search | Hybrid semantic + keyword search with inline filters |
| kb_ask | Full RAG pipeline: search + rerank + LLM answer |
| kb_fts | Keyword-only search (no API cost) |
| kb_similar | Find similar documents (no API call) |
| kb_status | Index statistics |
| kb_list | List indexed documents |
The MCP server requires the mcp extra: kb[mcp] or kb[all].
| Tool | What it is | Local-only | CLI | Setup |
|---|---|---|---|---|
| kb | CLI RAG tool — hybrid search + Q&A over 30+ document formats | Yes | Yes | uv tool install, single SQLite file |
| Khoj | Self-hosted AI second brain with web UI, mobile, Obsidian/Emacs plugins | Optional | No | Docker or pip, runs a web server |
| Reor | Desktop note-taking app with auto-linking and local LLM | Yes | No | Electron app, uses LanceDB + Ollama |
| LlamaIndex | Framework for building RAG pipelines | Depends | No | Python library, you build the app |
| ChromaDB | Vector database with simple API | Yes | No | Python library, you build the app |
| grepai | Semantic code search + call graphs, 100% local | Yes | Yes | brew install or curl, uses Ollama/OpenAI embeddings |
When to use what:
- kb — you want a CLI RAG tool that indexes docs (markdown, PDFs, DOCX, EPUB, HTML, and more) and answers questions from them
- grepai — you want semantic search over code (find by intent, trace call graphs), no RAG
- Khoj — you want a full-featured app with web UI, phone access, Obsidian integration, and agent capabilities
- Reor — you want an Obsidian-like desktop editor that auto-links notes using local AI
- LlamaIndex / ChromaDB — you're building your own RAG pipeline and need libraries, not a finished tool
Contributions welcome! Please open an issue first to discuss what you'd like to change.
See docs/DEVELOPMENT.md for setup, architecture, and workflow.
This is a personal tool I've open-sourced. I may or may not respond to issues/PRs. Fork freely.