kb (knowledge base)

Python 3.12+ · MIT License

CLI RAG tool for your docs. Index 30+ document formats (markdown, PDF, DOCX, EPUB, HTML, ODT, RTF, plain text, email, and more), run hybrid search (semantic + keyword), and ask questions to get sourced answers. Built on sqlite-vec.

Features

  • Hybrid search — vector similarity + FTS5 keyword search, fused with Reciprocal Rank Fusion (with rank bonuses)
  • HyDE best-of-two — generates a hypothetical answer passage, embeds both it and the raw query, and keeps whichever vector result set scores better (local via transformers or LLM API; enabled by default, since it can only help, never hurt)
  • Keyword-only search — kb fts for instant BM25 results with zero API cost (truncated filepath matches weighted 10x, headings 2x)
  • Heading-aware chunking — markdown split by heading hierarchy, each chunk carries ancestry
  • Incremental indexing — content-hash per chunk, only re-embeds changes
  • Query expansion — generates keyword synonyms (for FTS) and semantic rephrasings (for vector search) via local FLAN-T5 or LLM, fuses all result lists with multi-list weighted RRF (--expand)
  • Reranking — kb ask over-fetches candidates, reranks by relevance (local cross-encoder or LLM), keeps the best
  • Pre-search filters — file globs, document type, tags, date ranges, keyword inclusion/exclusion
  • Document tagging — manual tags via kb tag, auto-parsed from markdown frontmatter
  • Similar documents — find related docs using stored embeddings (no API call)
  • 30+ formats — markdown, PDF, DOCX, PPTX, XLSX, EPUB, HTML, ODT, ODS, ODP, RTF, email (.eml), subtitles (.srt/.vtt), and plain text variants (.txt, .rst, .org, .csv, .json, .yaml, .tex, etc.)
  • Optional code indexing — set index_code = true to also index source code files (.py, .js, .ts, .go, .rs, etc.)
  • Local or API embeddings — local via ibm-granite/granite-embedding-english-r2 (sentence-transformers, no API cost, fully offline, auto-detected dims) or OpenAI API — config-driven switch
  • Pluggable chunking — uses chonkie when available, regex fallback otherwise
  • MCP server — expose kb as tools for Claude Desktop, Claude Code, and other MCP clients

Install

# One-liner (installs uv if needed)
curl -LsSf https://github.com/ariel-frischer/kb/raw/main/install.sh | sh

# Or with uv directly (all optional deps: PDF, Office, RTF, chunking)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[all]"

# Minimal (markdown, HTML, plain text, email, EPUB, ODT — no extra deps)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" kb

# Pick extras individually
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[pdf]"       # + PDF
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[office]"    # + DOCX, PPTX, XLSX
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[rtf]"       # + RTF
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[local-embed]" # + local embeddings (Granite R2, no API cost)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[rerank]"    # + local cross-encoder reranking
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[expand]"    # + local query expansion (FLAN-T5)
uv tool install --from "git+https://github.com/ariel-frischer/kb.git" "kb[local-llm]" # + local HyDE generation (transformers + torch)

Runs fully local — no API keys required. Set embed_method = "local" in config (see Configuration) and use local backends for HyDE (hyde_method = "local"), reranking (rerank_method = "cross-encoder"), and query expansion (expand_method = "local"). Only kb ask needs an LLM for the final answer — point it at a local model via Ollama or similar.

For cloud, the defaults work with any OpenAI-compatible API. Set OPENAI_API_KEY in your environment (or in ~/.config/kb/secrets.toml). Recommended cloud models: text-embedding-3-small for embeddings, gpt-4o-mini for chat/ask. Works with any provider that speaks the OpenAI API — set OPENAI_BASE_URL to point at Ollama, LiteLLM, vLLM, etc.

Quickstart

# 1. Initialize (global — indexes across repos/folders)
kb init

# 2. Add source directories
kb add ~/notes ~/docs ~/repos/my-project/docs

# 3. Index
kb index

# 4. Search (hybrid: semantic + keyword)
kb search "deployment patterns"

# 5. Quick keyword search (instant, no API cost)
kb fts "deployment patterns"

# 6. Ask (RAG: search → rerank → answer)
kb ask "what are the recommended deployment patterns?"

# 7. List indexed documents
kb list

# 8. Check what's indexed
kb stats

Commands

kb init                        Create global config (~/.config/kb/)
kb init --project              Create project-local .kb.toml in current directory
kb add <dir> [dir...]          Add source directories
kb remove <dir> [dir...]       Remove source directories
kb sources                     List configured sources
kb index [DIR...]              Index sources from config (or explicit dirs)
kb search "query" [k] [--threshold N] [--expand] [--json|--csv|--md]  Hybrid search (default k=5)
kb fts "query" [k] [--json|--csv|--md]            Keyword-only search (instant, no API cost)
kb ask "question" [k] [--threshold N] [--expand] [--json|--csv|--md]  RAG answer (default k=8, BM25 shortcut when confident)
kb list                        Summary of indexed documents by type
kb list --full                 List every indexed document with metadata
kb similar <file> [k]          Find similar documents (no API call, default k=10)
kb tag <file> tag1 [tag2...]   Add tags to a document
kb untag <file> tag1 [tag2...]  Remove tags from a document
kb tags                        List all tags with document counts
kb stats                       Show index stats + capabilities
kb reset                       Drop DB and start fresh
kb version                     Show version (also: kb v, kb --version)
kb mcp                         Start MCP server (for Claude Desktop / AI agents)
kb completion <shell>          Output shell completions (zsh, bash, fish)

Shell completions

# Zsh (add to ~/.zshrc)
eval "$(kb completion zsh)"

# Bash (add to ~/.bashrc)
eval "$(kb completion bash)"

# Fish (add to ~/.config/fish/config.fish)
kb completion fish | source

Configuration

Global mode (default)

kb init creates ~/.config/kb/config.toml. Database lives at ~/.local/share/kb/kb.db. Sources are absolute paths, managed with kb add / kb remove.

Project mode

kb init --project creates .kb.toml in the current directory (found by walking up from cwd, like .gitignore). Database lives at ~/.local/share/kb/projects/<hash>/kb.db — no database files in the project directory. Sources are relative to the config file. Project config takes precedence over global when both exist.
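The per-project database location can be pictured as a content-addressed path derived from the config file's location. A minimal sketch — the function name, hash algorithm, and digest length are all assumptions, not kb's documented scheme:

```python
import hashlib
from pathlib import Path

def project_db_path(config_path: Path) -> Path:
    # Hash the resolved config path so each project gets a stable,
    # collision-resistant directory under the shared data dir.
    digest = hashlib.sha256(str(config_path.resolve()).encode()).hexdigest()[:16]
    return Path.home() / ".local/share/kb/projects" / digest / "kb.db"
```

The key property is that no database files land inside the project directory itself.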

Config format

# Sources (absolute paths in global mode, relative in project mode)
sources = [
    "/home/user/notes",
    "/home/user/docs",
]

# All optional — defaults shown
# embed_method = "openai"  # "openai" (API) or "local" (sentence-transformers, no API cost)
# embed_model = "text-embedding-3-small"
# embed_dims = 1536
# local_embed_model = "ibm-granite/granite-embedding-english-r2"  # or "Snowflake/snowflake-arctic-embed-m-v1.5"
# chat_model = "gpt-4o-mini"
# max_chunk_chars = 2000
# search_threshold = 0.001      # min cosine similarity for `kb search` (also --threshold flag)
# ask_threshold = 0.001         # min cosine similarity for `kb ask` (also --threshold flag)
# rerank_fetch_k = 20
# rerank_top_k = 5
# rerank_method = "llm"     # "llm" (RankGPT) or "cross-encoder" (local, no API cost)
# cross_encoder_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
# hyde_enabled = true       # generate hypothetical passage before vector search
# hyde_model = ""           # LLM for HyDE ("" = use chat_model)
# hyde_method = "llm"      # "llm" (OpenAI API) or "local" (transformers, no API cost)
# hyde_local_model = "Qwen/Qwen3-0.6B"  # HF model for local HyDE method
# query_expand = false     # generate keyword + semantic query expansions (also --expand flag)
# expand_method = "local"  # "local" (FLAN-T5) or "llm" (OpenAI API)
# expand_model = "google/flan-t5-small"  # model for local expand method
# bm25_shortcut_min = 0.85 # min normalized BM25 for ask shortcut
# bm25_shortcut_gap = 0.02 # min gap vs second doc for ask shortcut
# index_code = false       # set true to also index source code files

.kbignore

Drop a .kbignore in any source directory to exclude files from indexing. Uses fnmatch glob syntax with # comments.

Lookup order: checks <source-dir>/.kbignore, then <source-dir>/../.kbignore (first found wins).
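The matching logic can be sketched with the stdlib fnmatch module the README names; treating trailing-slash patterns as directory-component matches is an assumption about kb's behavior:

```python
import fnmatch
from pathlib import Path

def load_patterns(text: str) -> list[str]:
    # Keep non-blank lines that are not '#' comments.
    return [ln.strip() for ln in text.splitlines()
            if ln.strip() and not ln.strip().startswith("#")]

def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    parts = Path(rel_path).parts
    for pat in patterns:
        if pat.endswith("/"):
            # Directory pattern: exclude if any path component matches.
            if pat.rstrip("/") in parts:
                return True
        elif fnmatch.fnmatch(Path(rel_path).name, pat):
            # File pattern: glob-match against the filename only.
            return True
    return False
```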

# Skip directories (trailing slash)
drafts/
.obsidian/
node_modules/

# Skip file patterns
*.draft.md
WIP-*
CHANGELOG.md

See docs/kbignore.md for common patterns by use case.

secrets.toml

Optionally store secrets in ~/.config/kb/secrets.toml instead of environment variables:

openai_api_key = "sk-..."
# For Ollama / other providers:
# openai_base_url = "http://localhost:11434/v1"
# openai_api_key = "unused"

Keys are loaded as uppercase environment variables. Existing env vars take precedence.

Search Filters

Add inline with your query — stripped before embedding:

kb search 'file:articles/*.md cost optimization'
kb search 'type:pdf machine learning'
kb search 'tag:python tutorial basics'
kb search 'dt>"2026-02-01" recent developments'
kb search '+"docker" -"kubernetes" container setup'
kb ask 'file:briefs/*.pdf dt>"2026-02-13" what are the costs?'
| Filter | Syntax | Example |
|---|---|---|
| File glob | file:<pattern> | file:articles/*.md |
| Document type | type:<type> | type:markdown, type:pdf |
| Tag | tag:<name> | tag:python |
| After date | dt>"YYYY-MM-DD" | dt>"2026-02-01" |
| Before date | dt<"YYYY-MM-DD" | dt<"2026-02-14" |
| Must contain | +"keyword" | +"docker" |
| Must not contain | -"keyword" | -"kubernetes" |
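Stripping these inline filters before embedding can be sketched with a single regex; the pattern below is illustrative and kb's actual parser may differ:

```python
import re

# Matches file:/type:/tag: tokens, quoted date filters, and +/- keyword filters.
FILTER_RE = re.compile(r'(?:file|type|tag):\S+|dt[<>]"[^"]+"|[+-]"[^"]+"')

def split_filters(query: str) -> tuple[list[str], str]:
    # Pull the filter tokens out; the remainder is what gets embedded.
    filters = FILTER_RE.findall(query)
    clean = FILTER_RE.sub("", query)
    return filters, " ".join(clean.split())
```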

Tags

Tag documents manually or let kb index auto-parse tags from markdown frontmatter:

---
tags: [python, tutorial]
---
kb tag docs/guide.md python tutorial   # add tags manually
kb untag docs/guide.md tutorial        # remove a tag
kb tags                                # list all tags with counts
kb search 'tag:python basics'          # filter by tag in search

Supported Formats

Always available (no extra deps):

| Category | Extensions |
|---|---|
| Markdown | .md, .markdown |
| Plain text | .txt, .text, .rst, .org, .log, .csv, .tsv, .json, .yaml, .yml, .toml, .xml, .ini, .cfg, .tex, .latex, .bib, .nfo, .adoc, .asciidoc, .properties |
| HTML | .html, .htm, .xhtml |
| Subtitles | .srt, .vtt |
| Email | .eml |
| OpenDocument | .odt, .ods, .odp |
| EPUB | .epub |

Optional (install with extras):

| Category | Extensions | Install |
|---|---|---|
| PDF | .pdf | kb[pdf] or kb[all] |
| Office | .docx, .pptx, .xlsx | kb[office] or kb[all] |
| RTF | .rtf | kb[rtf] or kb[all] |

Code files (opt-in): Set index_code = true in config to also index source code — .py, .js, .ts, .go, .rs, .java, .c, .cpp, and 60+ more extensions.

Run kb stats to see which formats are available in your installation.

How It Works

kb index
  1. Find files matching supported formats (respecting .kbignore)
  2. Extract text (format-specific: markdown, PDF, DOCX, HTML, etc.)
  3. Content-hash check — skip unchanged files
  4. Chunk (chonkie or regex fallback)
  5. Diff chunks by hash — only embed new/changed
  6. Batch embed (local sentence-transformers or OpenAI API)
  7. Store in sqlite-vec (vec0) + FTS5

kb search "query"
  1. Parse filters, strip from query
  2. HyDE best-of-two: embed both raw query + hypothetical passage, keep better vec results
  3. [Expand]: generate keyword synonyms + semantic rephrasings (if --expand)
  4. Vector search (vec0 cosine MATCH) + FTS5 keyword search (original + expansion queries)
  5. Pre-filter by tagged chunk IDs if tag: filter active
  6. Fuse with multi-list weighted RRF (primary 2x, expansions 1x)
  7. Apply remaining filters, display results
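The multi-list weighted RRF in step 6 can be sketched as follows, using the conventional k = 60 smoothing constant (an assumption; kb's constant is not documented here):

```python
def weighted_rrf(result_lists: list[tuple[float, list[str]]], k: int = 60) -> list[str]:
    # Each entry is (weight, ranked doc ids); a doc's fused score is the
    # sum of weight / (k + rank) over every list it appears in.
    scores: dict[str, float] = {}
    for weight, ids in result_lists:
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Note how a document ranked in several lists can outscore the top hit of any single list — that is the point of fusion.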

kb fts "query"
  1. Parse filters, strip from query
  2. FTS5 keyword search (no embedding, weighted BM25: truncated filepath 10x, heading 2x)
  3. Normalize BM25 scores
  4. Apply filters
  5. Display results (instant, zero API cost)
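SQLite FTS5's bm25() returns lower-is-better scores, so the normalization in step 3 can be as simple as a min-max flip to a 0..1 higher-is-better scale; the exact scheme kb uses is an assumption:

```python
def normalize_bm25(raw: list[float]) -> list[float]:
    # FTS5 bm25() scores are numerically smaller for better matches,
    # so flip and rescale: best match -> 1.0, worst -> 0.0.
    if not raw:
        return []
    lo, hi = min(raw), max(raw)
    if hi == lo:
        return [1.0 for _ in raw]
    return [(hi - x) / (hi - lo) for x in raw]
```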

kb ask "question"
  1. BM25 probe — dedup by document; if the top doc is high-confidence, skip to step 7
  2. HyDE best-of-two: embed both raw query + hypothetical passage, keep better vec results
  3. [Expand]: generate keyword synonyms + semantic rephrasings (if --expand)
  4. Same as search (with expansion), but over-fetch 20
  5. Pre-filter by tagged chunk IDs, then apply remaining filters
  6. Rerank -> top 5 (cross-encoder or LLM)
  7. Confidence threshold
  8. LLM generates answer from context

MCP Server

kb includes an MCP server that exposes search and ask as tools for AI agents.

Setup with Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, ~/.config/Claude/claude_desktop_config.json on Linux):

{
  "mcpServers": {
    "kb": {
      "command": "kb-mcp"
    }
  }
}

Setup with Claude Code

claude mcp add kb kb-mcp

Available tools

| Tool | Description |
|---|---|
| kb_search | Hybrid semantic + keyword search with inline filters |
| kb_ask | Full RAG pipeline: search + rerank + LLM answer |
| kb_fts | Keyword-only search (no API cost) |
| kb_similar | Find similar documents (no API call) |
| kb_status | Index statistics |
| kb_list | List indexed documents |

The MCP server requires the mcp extra: kb[mcp] or kb[all].

Alternatives

| Tool | What it is | Local-only | CLI | Setup |
|---|---|---|---|---|
| kb | CLI RAG tool — hybrid search + Q&A over 30+ document formats | Yes | Yes | uv tool install, single SQLite file |
| Khoj | Self-hosted AI second brain with web UI, mobile, Obsidian/Emacs plugins | Optional | No | Docker or pip, runs a web server |
| Reor | Desktop note-taking app with auto-linking and local LLM | Yes | No | Electron app, uses LanceDB + Ollama |
| LlamaIndex | Framework for building RAG pipelines | Depends | No | Python library, you build the app |
| ChromaDB | Vector database with simple API | Yes | No | Python library, you build the app |
| grepai | Semantic code search + call graphs, 100% local | Yes | Yes | brew install or curl, uses Ollama/OpenAI embeddings |

When to use what:

  • kb — you want a CLI RAG tool that indexes docs (markdown, PDFs, DOCX, EPUB, HTML, and more) and answers questions from them
  • grepai — you want semantic search over code (find by intent, trace call graphs), no RAG
  • Khoj — you want a full-featured app with web UI, phone access, Obsidian integration, and agent capabilities
  • Reor — you want an Obsidian-like desktop editor that auto-links notes using local AI
  • LlamaIndex / ChromaDB — you're building your own RAG pipeline and need libraries, not a finished tool

Contributing

Contributions welcome! Please open an issue first to discuss what you'd like to change.

See docs/DEVELOPMENT.md for setup, architecture, and workflow.

Maintenance

This is a personal tool I've open-sourced. I may or may not respond to issues/PRs. Fork freely.

License

MIT
