LLM as Knowledge Compiler — Karpathy's LLM Knowledge Base pattern, optimized for PARA Obsidian vaults
Inspired by Karpathy's llm-wiki — the idea that an LLM should act as a knowledge compiler, not a search engine. Rather than re-retrieving and re-summarizing raw notes on every query, knowledge is compiled once into structured wiki-style articles, then maintained incrementally as new information arrives.
This plugin applies that pattern to PARA-structured Obsidian vaults.
See examples/ for a fictional initialized vault snapshot, including indexes, a project hub, a change log, and optional query telemetry.
If you already have a PARA vault with hundreds of notes, Claude Code can read them — but every conversation starts from scratch. Each query re-reads files, re-discovers structure, and burns tokens figuring out what's where.
This plugin fixes that by adding a persistent knowledge layer on top of your existing vault.
The problem without a KB:
User: "What do I know about distributed systems?"
Claude: reads 30+ files → 15K tokens → synthesizes answer → forgotten next session
User: (asks again next week)
Claude: reads the same 30+ files again → 15K tokens again
With this plugin:
/kb-ingest → new notes classified, linked, indexed (once)
/kb-query "distributed systems" → reads _index.md (50 tokens) → targets 2 files (400 tokens) → done
| Already have | This plugin adds |
|---|---|
| Folders organized by PARA | _index.md per category — Claude reads 10 lines instead of scanning 100 files |
| Notes with some tags | Tag convention detection + consistency enforcement across vault |
| Manual wikilinks | Auto-generated wikilinks on ingest — first occurrence of known terms linked automatically |
| CLAUDE.md with basic rules | Full vault schema — structure, KB operations, tag system, navigation strategies |
| Raw notes in Inbox | One-command classification + move + metadata + linking |
| Unmeasured ad-hoc searches | Optional query telemetry — request-level route, document count, tool/time/token signals for later analysis |
| Operation | Without plugin | With plugin |
|---|---|---|
| "What projects am I working on?" | Scan all project folders ~3K tokens | Read _index.md ~50 tokens |
| "Find everything about topic X" | Grep entire vault ~5K tokens | Tag search via index ~200 tokens |
| Ingest a new document | Manual: move, tag, link, update index | /kb-ingest — all automated, ~500 tokens |
| Weekly health check | Not possible | /kb-lint — orphans, broken links, stale content |
The key insight: indexes are cheap to read, and they tell Claude exactly where to look. Instead of scanning your whole vault every time, Claude reads a 10-line index, picks the right folder, and reads only what's needed.
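The index-first lookup described above can be sketched in a few lines. This is only an illustration, not the plugin's implementation: the index entry format (`- [[Title]]: summary`) and the helper name `pick_targets` are assumptions made for the example.

```python
import re

# Hypothetical index entry format: "- [[Note Title]]: one-line summary"
ENTRY = re.compile(r"^- \[\[(?P<title>[^\]]+)\]\]:\s*(?P<summary>.*)$")

def pick_targets(index_text: str, query: str, limit: int = 2) -> list[str]:
    """Return up to `limit` note titles whose index entry mentions a query word."""
    words = {w.lower() for w in query.split()}
    scored = []
    for line in index_text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip headings and blank lines
        haystack = (match["title"] + " " + match["summary"]).lower()
        score = sum(1 for w in words if w in haystack)
        if score:
            scored.append((score, match["title"]))
    # Highest score first; ties keep index order because the sort is stable.
    scored.sort(key=lambda pair: -pair[0])
    return [title for _, title in scored[:limit]]

index_text = """\
# Resources
- [[Raft Consensus]]: leader election and log replication in distributed systems
- [[Sourdough Notes]]: hydration ratios and proofing times
- [[CAP Theorem]]: consistency trade-offs for distributed systems
"""
print(pick_targets(index_text, "distributed systems"))
```

Only the short index is scanned; the note bodies are opened afterwards, and only for the titles returned.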
Inbox/ ← raw capture (fleeting notes, clippings, meeting notes)
PARA/ ← compiled knowledge wiki (Projects / Areas / Resources / Archives)
CLAUDE.md ← vault schema (topics, conventions, wikilink vocabulary)
- Inbox is the staging area. Raw, unprocessed, low-friction.
- PARA is the knowledge base. LLM-compiled, structured, cross-linked.
- CLAUDE.md is the schema. Tells the LLM what topics exist, how notes are organized, what wikilinks are canonical.
| Operation | What it does |
|---|---|
| Ingest | Takes raw Inbox notes, extracts knowledge, merges into PARA wiki pages |
| Query | Answers questions by reading compiled PARA pages (not raw notes) |
| Lint | Audits knowledge base for gaps, broken wikilinks, stale content |
| RAG (Retrieval) | KB (Compilation) |
|---|---|
| Retrieves chunks on each query | Knowledge compiled once, maintained incrementally |
| Quality varies with retrieval precision | Consistent quality — LLM synthesizes on ingest |
| No persistent synthesis | Synthesis is durable; query is fast |
| Good for large document corpuses | Good for personal knowledge that evolves over time |
For a personal PARA vault, the KB pattern is usually the better fit: your notes are small enough to compile, and the value compounds as the KB grows more interconnected.
/kb-init initializes the knowledge base structure in your vault. It creates the CLAUDE.md schema, sets up PARA folder conventions, and generates top-level _index.md files for each PARA category.
/kb-ingest processes notes from your Inbox. The LLM reads each raw note, determines which PARA page it belongs to (or creates a new one), and merges the knowledge: it updates wikilinks, adds cross-references, and moves the source note to Archives when done.
/kb-query answers a question using the compiled knowledge base. It reads relevant PARA pages directly rather than performing fuzzy retrieval over raw notes, and returns a cited answer with links to the source pages. When hooks or wrappers are configured, it can also leave compact query telemetry for later cost and usage analysis.
/kb-lint audits the knowledge base. It checks for broken wikilinks, orphaned notes, pages with no backlinks, topics mentioned in CLAUDE.md that have no corresponding page, and pages that haven't been updated in a configurable period.
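Two of the lint checks, broken wikilinks and pages with no backlinks, are easy to picture in code. The sketch below works on an in-memory mapping of note title to body; the function name `lint` and the result keys are hypothetical, and the real skill may resolve links differently.

```python
import re

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint(notes: dict[str, str]) -> dict[str, list]:
    """notes maps note title -> markdown body."""
    broken = []
    linked_to = set()
    for title, body in notes.items():
        for target in WIKILINK.findall(body):
            target = target.strip()
            if target not in notes:
                broken.append((title, target))
            elif target != title:  # a self-link does not count as a backlink
                linked_to.add(target)
    no_backlinks = sorted(t for t in notes if t not in linked_to)
    return {"broken_links": sorted(broken), "no_backlinks": no_backlinks}

notes = {
    "Raft Consensus": "See [[CAP Theorem]] and [[Paxos]].",
    "CAP Theorem": "Related: [[Raft Consensus|Raft]].",
    "Sourdough Notes": "No links here.",
}
report = lint(notes)
print(report)
```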
/kb-index regenerates _index.md files:
- Top-level (0. Common/index.md): full map of all PARA categories and key topics
- Category-level (1. Projects/_index.md, 2. Areas/_index.md, etc.): topic lists with one-line summaries
- Project hubs, optional (1. Projects/<slug>/_index.md): local entrypoints for larger project folders when the vault uses that convention
Two-tier indexing keeps navigation fast even as the vault grows:
- Top-level 0. Common/index.md gives a full vault map
- Per-category _index.md gives a focused topic list
- Optional project hubs give large projects a local map without forcing every vault into deeper indexing
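As a rough illustration of what a per-category index could contain, the sketch below builds one from a mapping of note title to body. The output layout and the helper names (`one_line_summary`, `build_category_index`) are assumptions for the example; the files the plugin generates may look different.

```python
def one_line_summary(body: str, max_len: int = 80) -> str:
    """Use the first non-empty, non-heading line as the summary."""
    for line in body.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line[:max_len]
    return "(no summary)"

def build_category_index(category: str, notes: dict[str, str]) -> str:
    """Render a category index: one wikilink plus summary per note, sorted by title."""
    lines = [f"# {category}", ""]
    for title in sorted(notes):
        lines.append(f"- [[{title}]]: {one_line_summary(notes[title])}")
    return "\n".join(lines) + "\n"

notes = {
    "Raft Consensus": "# Raft\n\nLeader election and log replication.\nMore detail.",
    "CAP Theorem": "Consistency trade-offs.",
}
index_md = build_category_index("3. Resources", notes)
print(index_md)
```

Keeping each entry to one line is what keeps the index cheap to read as the category grows.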
The KB is designed to be navigable four ways simultaneously:
- Folders: PARA hierarchy provides structure
- Tags: status, type, and topic tags for filtered views
- Wikilinks + Backlinks: every concept links forward and backward
- Indexes: _index.md files for when you want a map, not a search
During ingest, the LLM automatically generates wikilinks between related concepts. New terms are registered in CLAUDE.md so future ingests stay consistent.
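A minimal sketch of first-occurrence linking, assuming the known terms are available as a plain list (in the plugin they are registered in CLAUDE.md; how they are parsed from it is not shown here). The function name `link_first_occurrences` is hypothetical, and terms that end in punctuation would need a different boundary rule than `\b`.

```python
import re

EXISTING_LINK = re.compile(r"(\[\[[^\]]*\]\])")

def link_first_occurrences(text: str, terms: list[str]) -> str:
    """Wrap the first unlinked occurrence of each known term in a wikilink."""
    # Longest terms first, so "distributed systems" is linked before "systems".
    for term in sorted(terms, key=len, reverse=True):
        # Splitting on a capturing group keeps existing links at odd indexes.
        parts = EXISTING_LINK.split(text)
        targets = {p[2:-2].split("|")[0].strip().lower() for p in parts[1::2]}
        if term.lower() in targets:
            continue  # already linked somewhere in the note
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        for i in range(0, len(parts), 2):  # plain-text segments only
            found = pattern.search(parts[i])
            if found:
                shown = found.group(0)
                # Keep the author's casing as an alias when it differs.
                link = f"[[{term}]]" if shown == term else f"[[{term}|{shown}]]"
                parts[i] = parts[i][: found.start()] + link + parts[i][found.end():]
                text = "".join(parts)
                break
    return text

note = "Raft is used in distributed systems. Raft elects a leader."
linked = link_first_occurrences(note, ["Raft", "distributed systems"])
print(linked)
```

Running the function again on its own output leaves the text unchanged, which matters if the same note is ingested twice.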
When the Obsidian CLI (built into Obsidian 1.12+) is available, skills use it for precise vault operations:
| CLI Command | Used by | Purpose |
|---|---|---|
| obsidian search query="term" | kb-query, kb-ingest | Full-text vault search |
| obsidian backlinks file="Note" | kb-query, kb-lint | Find referencing documents |
| obsidian tags | kb-lint, kb-ingest | Tag inventory and consistency |
| obsidian property:set | kb-ingest | Frontmatter updates |
| obsidian read file="Note" | kb-query | Read note content |
Without CLI, all skills fall back to Grep/Glob/Read tools — fully functional but slightly less precise for backlink resolution and search.
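The CLI-or-fallback decision can be pictured as follows. This is a sketch under assumptions: `plan_search` only builds the command line shown in the table above and does not run it, since the CLI's output format is not described here, and `fallback_search` is a plain substring scan standing in for the Grep/Glob/Read tools.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional

def plan_search(term: str, which=shutil.which) -> Optional[list[str]]:
    """Return the CLI command to run, or None when the CLI is not on PATH."""
    if which("obsidian"):
        return ["obsidian", "search", f"query={term}"]
    return None

def fallback_search(vault: Path, term: str) -> list[str]:
    """File-based fallback: case-insensitive substring scan over markdown files."""
    hits = []
    for path in sorted(vault.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if term.lower() in text.lower():
            hits.append(path.relative_to(vault).as_posix())
    return hits

# Demo on a throwaway vault so the example does not touch real notes.
with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    (vault / "3. Resources").mkdir()
    (vault / "3. Resources" / "Raft Consensus.md").write_text(
        "Raft in distributed systems", encoding="utf-8")
    (vault / "Inbox").mkdir()
    (vault / "Inbox" / "groceries.md").write_text("eggs, flour", encoding="utf-8")
    hits = fallback_search(vault, "distributed")
print(hits)
```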
For small vaults, answers may be all you need. For larger or heavily used vaults, it becomes useful to know whether searches are getting slower or more expensive over time.
This repository provides the telemetry schema and skill guidance. It does not install a universal hook, because Claude Code, Codex, local LLM wrappers, and other runtimes expose different event surfaces.
When your Claude/Codex hooks, wrappers, or local automation support it, kb-query can write compact records to:
0. Common/query-telemetry.jsonl
This file is operational telemetry, not knowledge content. It is not added to indexes and should not be read during normal search. It is only used when you want to analyze usage patterns such as:
- average documents inspected per query
- route mix: direct folder vs tag search vs full-text search
- elapsed time and tool calls per query
- token usage when the agent runtime exposes it
- whether a vault is becoming heavy enough to need better indexes, project hubs, or a graph/search backend
The plugin does not require a specific hook implementation. Numeric fields should come from the runtime when available; the skill only adds compact semantic hints such as route, entrypoints, and documents materially inspected.
A sanitized sample is available at examples/0. Common/query-telemetry.jsonl.
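To show the kind of analysis the telemetry file is meant for, here is a small sketch that summarizes JSONL records. The field names (`route`, `docs_inspected`, `tokens`) are hypothetical placeholders, not the repository's schema; check the sample file for the real keys before adapting it.

```python
import json
from collections import Counter
from statistics import mean

def summarize(jsonl_text: str) -> dict:
    """Aggregate query telemetry records given as JSON Lines text."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    if not records:
        return {"queries": 0}
    # Token counts are optional: not every runtime exposes them.
    tokens = [r["tokens"] for r in records if r.get("tokens") is not None]
    return {
        "queries": len(records),
        "avg_docs_inspected": mean(r["docs_inspected"] for r in records),
        "route_mix": dict(Counter(r["route"] for r in records)),
        "avg_tokens": mean(tokens) if tokens else None,
    }

# Made-up records using the hypothetical field names above.
sample = "\n".join(json.dumps(r) for r in [
    {"route": "direct_folder", "docs_inspected": 2, "tokens": 450},
    {"route": "tag_search", "docs_inspected": 4, "tokens": None},
    {"route": "direct_folder", "docs_inspected": 3, "tokens": 650},
])
summary = summarize(sample)
print(summary)
```

A rising average of documents inspected per query is one signal that a vault may need better indexes or project hubs.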
# Add the marketplace
/plugin marketplace add ernestolee13/para-knowledge-base
# Install the plugin
/plugin install para-knowledge-base@para-knowledge-base

# Or install manually
git clone https://github.com/ernestolee13/para-knowledge-base.git
# Then add to your Claude Code plugin settings

Then open Claude Code in your vault directory. Skills become available as /kb-init, /kb-ingest, /kb-query, /kb-lint, /kb-index.
This plugin works best alongside kepano's obsidian-skills, which adds Obsidian CLI commands, markdown syntax, bases, and canvas skills. Together they cover both vault management (this plugin) and content creation (obsidian-skills).
# Install both
/plugin marketplace add kepano/obsidian-skills
/plugin install obsidian@obsidian-skills
/plugin marketplace add ernestolee13/para-knowledge-base
/plugin install para-knowledge-base@para-knowledge-base

- Open Claude Code with your Obsidian vault as the working directory.
- Run /kb-init: creates the CLAUDE.md schema, _index.md files, and log.md.
- Drop notes into your Inbox/ folder.
- Run /kb-ingest: classifies, moves, links, and indexes Inbox documents.
- Ask questions with /kb-query "What do I know about X?".
- Run /kb-lint periodically to check vault health.
- Run /kb-index to rebuild all indexes after major reorganization.
Optional: if your agent runtime supports hooks or wrappers, configure query telemetry so future reports can analyze search cost, document depth, and response time trends.
After /kb-init, expect a small operating layer similar to:
CLAUDE.md
Inbox/
0. Common/index.md
0. Common/log.md
1. Projects/_index.md
2. Areas/_index.md
3. Resources/_index.md
4. Archive/_index.md
Larger or frequently queried projects may also use optional local hubs such as 1. Projects/<slug>/_index.md. Query telemetry, if configured, should live at 0. Common/query-telemetry.jsonl and stay out of indexes.
Next: For daily workflow, automation patterns (morning routine auto-ingest, weekly review auto-lint), Inbox input channels, project move automation, and troubleshooting, see USAGE.md. Korean guide: USAGE.ko.md.
The Obsidian CLI enhances search, backlink traversal, and tag operations. It's optional but recommended.
Requirements: Obsidian Desktop v1.12.0+ (installer version, not just app update).
macOS:
- Download latest installer from https://obsidian.md/download
- Replace /Applications/Obsidian.app (vault data is preserved)
- Open Obsidian → Settings → General → Command line interface → Enable
- Restart terminal, verify:
obsidian help
PATH is auto-added to ~/.zprofile. For other shells:
export PATH="$PATH:/Applications/Obsidian.app/Contents/MacOS"

Windows:
- Download latest installer from https://obsidian.md/download
- Install over existing (v1.12.4+ required for Windows)
- Open Obsidian → Settings → General → CLI → Enable
- Follow shell-specific PATH instructions in the app
All skills work without the CLI using file-based fallback (Grep, Glob, Read). You get full functionality — the CLI just makes search and backlink operations faster and more precise.
This plugin is designed to complement kepano's obsidian-skills plugin. If you have both installed:
- obsidian-skills provides general Obsidian CLI, markdown, bases, and canvas skills
- para-knowledge-base adds PARA-specific knowledge management on top
- Both share the same obsidian CLI binary, so there is no conflict
- llm-wiki-diagnostics — a Markdown-only guide package for diagnosing whether an LLM-managed wiki is well structured and whether queries are becoming expensive over time.
- Use this plugin to build and maintain the PARA knowledge layer; use llm-wiki-diagnostics periodically to audit structure, request-level usage cost, telemetry gaps, and report quality.
Suggested GitHub topics for this ecosystem: llm-wiki, obsidian, knowledge-base, pkm, para, query-telemetry.
- Claude Code
- Obsidian vault with PARA folder structure (numbered: 1. Projects/, 2. Areas/, 3. Resources/, 4. Archive/)
- Optional: Obsidian 1.12+ for CLI integration (see setup above)
- Recommended: kepano/obsidian-skills — provides Obsidian CLI, markdown, bases, and canvas skills that complement this plugin
Issues and pull requests welcome. Please open an issue first for major changes.
MIT — see LICENSE