Second opinions from multiple LLMs—right inside Claude Code
Need a second opinion on your code? Want validation before merging? Looking for domain expertise your model doesn't have? Stuck in a loop and need fresh eyes to break out?
One model's confidence isn't proof. K-LEAN brings in OpenAI, Gemini, DeepSeek, Moonshot, Minimax, and more—when multiple models agree, you ship with confidence.
- 9 slash commands — `/kln:quick`, `/kln:multi`, `/kln:agent`, `/kln:rethink`...
- 8 specialist agents — Security, Rust, embedded C, ARM Cortex, performance
- 5 smart hooks — Service auto-start, keyword handling, git tracking, web capture, session log
- Persistent knowledge — Insights that survive across sessions
Access any model via NanoGPT or OpenRouter, directly from Claude Code.
Works on Windows, Linux, and macOS — native cross-platform support, no shell scripts required.
Choose one provider and get your API key:
- NanoGPT — Subscription access to DeepSeek, Qwen, GLM, Kimi
- OpenRouter — Unified access to GPT, Gemini, Claude
Linux / macOS:

```bash
# Install pipx if you don't have it
python3 -m pip install --user pipx
python3 -m pipx ensurepath

# Install K-LEAN
pipx install kln-ai
```

Windows (PowerShell):

```powershell
# Install pipx if you don't have it
python -m pip install --user pipx
python -m pipx ensurepath

# Restart PowerShell, then install K-LEAN
pipx install kln-ai
```

Then initialize:

```bash
kln init    # Select provider, enter API key
kln start   # Start LiteLLM proxy
kln status  # Verify everything works
```

Or non-interactive:

```bash
kln init --provider nanogpt --api-key $NANOGPT_API_KEY
kln start
```

If you only want the persistent knowledge system without multi-model reviews:

```bash
pipx install kln-ai
kln init --provider skip
```

This installs the knowledge database, session hooks, and slash commands (`/kln:learn`, `/kln:remember`, `/kln:find`). No LiteLLM proxy or API keys required. Add a provider later with `kln init --provider nanogpt --api-key $KEY`.
/kln:quick "security" # Fast review (~30s)
/kln:multi "error handling" # 3-5 model consensus (~60s)
/kln:agent security-auditor # Specialist agent (~2min)kln model add --provider openrouter "anthropic/claude-3.5-sonnet"
kln model remove "claude-3-sonnet"
kln start # Restart to apply changes$ /kln:multi "review authentication flow"
GRADE: B+ | RISK: MEDIUM
HIGH CONFIDENCE (4/5 models agree):
- auth.py:42 - SQL injection risk in user query
- session.py:89 - Missing token expiration check
MEDIUM CONFIDENCE (2/5 models agree):
- login.py:15 - Consider rate limiting
Three ways to get external perspectives—pick based on speed vs depth:
| Command | What Happens | Time |
|---|---|---|
| `/kln:quick` | 1 model reviews code you provide | ~30s |
| `/kln:multi` | 3-5 models vote on the same code | ~60s |
| `/kln:rethink` | Contrarian techniques when you're stuck | ~20s |
`/kln:quick` — You gather the code (git diff, file content); one model reviews it fast.

```bash
/kln:quick "security review"
# Grade: B+ | Risk: MEDIUM | 3 findings
```

`/kln:multi` — The same code goes to 5 models in parallel. When 4/5 agree, it's real.

```bash
/kln:multi "check error handling"
# 4/5 AGREE: Missing null check at line 42
```

`/kln:rethink` — Stuck debugging for 10+ minutes? Get contrarian ideas: inversion, assumption challenge, domain shift.

```bash
/kln:rethink
# "What if the bug isn't in the parser—what if the input is already corrupt?"
```
How: LiteLLM proxy routes to multiple providers (NanoGPT, OpenRouter). Dynamic model discovery, parallel async execution, response aggregation with consensus scoring.
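For a feel of that pattern, here is a minimal sketch of the fan-out: one prompt sent to several models in parallel through an OpenAI-compatible proxy endpoint, with naive agreement counting at the end. The port, API key, model aliases, and the exact-match consensus rule are illustrative assumptions, not K-LEAN's actual internals.

```python
# Sketch: fan one review prompt out to several models via a local
# LiteLLM proxy (OpenAI-compatible), then count agreement per finding.
# ASSUMPTIONS: proxy at localhost:4000, placeholder key, naive consensus.
import asyncio
from collections import Counter
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:4000/v1", api_key="sk-local")
MODELS = ["deepseek-r1", "qwen3-coder", "kimi-k2", "glm-4.7", "deepseek-v3.2"]

async def review(model: str, code: str) -> str:
    resp = await client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": f"Review this code. One finding per line:\n{code}"}],
    )
    return resp.choices[0].message.content or ""

async def consensus(code: str) -> Counter:
    # Same prompt, N models, in parallel; duplicate findings accumulate votes.
    replies = await asyncio.gather(*(review(m, code) for m in MODELS))
    return Counter(line.strip() for r in replies
                   for line in r.splitlines() if line.strip())

if __name__ == "__main__":
    votes = asyncio.run(consensus(open("auth.py").read()))
    for finding, n in votes.most_common(5):
        print(f"{n}/{len(MODELS)} agree: {finding}")
```

Real aggregation has to match findings semantically rather than by exact string; that is where the consensus scoring mentioned above does the heavy lifting.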
Your insights survive sessions. Capture mid-session or end-of-session:

`/kln:learn` — Extract learnings NOW, while context is fresh.

```bash
/kln:learn "JWT issue"
# Found 3 learnings → Saved to Knowledge DB
```

`/kln:remember` — End of session. Reviews git diff, extracts warnings/patterns/decisions, syncs to Serena MCP.

```bash
/kln:remember
# Saved 6 entries (2 warnings, 2 patterns, 1 solution, 1 decision)
# Synced to Serena kln-lessons-learned
```

`/kln:find` — Search anytime. Supports date, branch, and type filters.

```bash
/kln:find "JWT validation"
/kln:find auth since:2026-02-01
/kln:find type:decision since:2026-02-03
/kln:find auth branch:feature/auth
```
How: Per-project knowledge database with hybrid search—dense embeddings (BGE-small via fastembed) + sparse matching (BM42) + RRF fusion + cross-encoder reranking. Runs locally via ONNX, <100ms queries. The V3.1 schema adds temporal filtering by date, branch, and entry type. Learnings are also auto-extracted on `/compact` via the PreCompact hook.
No API key? The Knowledge DB works fully offline. You can still use `/kln:learn`, `/kln:remember`, and `/kln:find` without NanoGPT or OpenRouter—embeddings run locally on your machine.
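To make "embeddings run locally" concrete, here is a minimal fastembed sketch. The two model names match the families cited above; the document text and everything else is illustrative.

```python
# Sketch: local dense + sparse embeddings with fastembed (ONNX, offline).
# BGE-small and BM42 are the models cited above; the rest is illustrative.
from fastembed import TextEmbedding, SparseTextEmbedding

dense_model = TextEmbedding("BAAI/bge-small-en-v1.5")
sparse_model = SparseTextEmbedding("Qdrant/bm42-all-minilm-l6-v2-attentions")

docs = ["JWT validation fails when clock skew exceeds 30s"]
dense_vec = next(dense_model.embed(docs))    # 384-dim dense vector
sparse_vec = next(sparse_model.embed(docs))  # weighted token indices

print(dense_vec.shape, len(sparse_vec.indices))
```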
Unlike models that review what you give them, agents read your codebase themselves.
8 specialists with tools: `read_file`, `grep`, `search_files`, `knowledge_search`, `get_complexity`.
| Agent | Expertise |
|---|---|
| `code-reviewer` | OWASP Top 10, SOLID, code quality |
| `security-auditor` | Vulnerabilities, auth, crypto |
| `debugger` | Root cause analysis |
| `performance-engineer` | Profiling, optimization |
| `rust-expert` | Ownership, lifetimes, unsafe |
| `c-pro` | C99/C11, POSIX, memory |
| `arm-cortex-expert` | Embedded ARM, real-time |
| `orchestrator` | Multi-agent coordination |
```bash
/kln:agent security-auditor "audit payment module"
# Agent greps for payment → reads 3 files → finds 2 vulnerabilities

/kln:agent rust-expert --model qwen3-coder "review unsafe blocks"
# Want a specific LLM? Use --model to pick your expert
```

`--parallel` — Need multiple perspectives? Run 3 specialists at once:

```bash
/kln:agent --parallel "review auth system"
# code-reviewer + security-auditor + performance-engineer → unified report
```
How: Built on smolagents with LiteLLM integration. Multi-step reasoning, tool use, and memory persistence.
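A hedged sketch of that pattern using smolagents' public API; the stub tool body, model alias, and endpoint are illustrative, and K-LEAN's real tools are the five listed above.

```python
# Sketch: a tool-calling agent on smolagents + LiteLLM.
# The tool body, model alias, and endpoint are illustrative.
from smolagents import ToolCallingAgent, LiteLLMModel, tool

@tool
def knowledge_search(query: str) -> str:
    """Search the project knowledge DB.

    Args:
        query: Free-text search query.
    """
    return "stub: would query the local knowledge server"

model = LiteLLMModel(model_id="openai/qwen3-coder",
                     api_base="http://localhost:4000/v1",
                     api_key="sk-local")
agent = ToolCallingAgent(tools=[knowledge_search], model=model)
print(agent.run("Audit the payment module for auth issues."))
```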
5 hooks run automatically—you don't call them:
| Hook | Trigger | What It Does |
|---|---|---|
| `session-start` | Claude Code opens | Starts LiteLLM + Knowledge Server |
| `user-prompt` | You type `InitKB` | Initializes project knowledge DB |
| `post-bash` | After bash commands | Captures git commits, test failures, build errors, package installs |
| `post-web` | After WebFetch | Captures doc URLs as discoveries |
| `pre-compact` | Context compaction | Session log via Haiku + auto-extract learnings to KB |
How: Claude Code hook system with pattern matching. Services auto-start on session begin. Git commits, test failures, build errors, and doc URLs auto-captured to KB with timestamp and branch metadata. On compaction, Haiku extracts 0-5 atomic learnings from the full conversation.
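To illustrate the mechanism, a sketch of a post-bash style handler: Claude Code pipes a JSON event to the hook's stdin, and the script decides what to persist. The `tool_input`/`command` fields follow Claude Code's hook input format; the matching logic and entry fields are illustrative, not K-LEAN's actual hook.

```python
#!/usr/bin/env python3
# Sketch of a PostToolUse hook for Bash: read the event from stdin,
# pattern-match the command, append a KB entry. Everything beyond the
# stdin-JSON contract is illustrative.
import json, pathlib, sys, time

event = json.load(sys.stdin)
cmd = event.get("tool_input", {}).get("command", "")

if cmd.startswith("git commit"):
    entry = {"type": "commit", "text": cmd,
             "ts": time.strftime("%Y-%m-%dT%H:%M:%S")}
    db = pathlib.Path(".knowledge-db/entries.jsonl")
    db.parent.mkdir(exist_ok=True)
    with db.open("a") as f:
        f.write(json.dumps(entry) + "\n")

sys.exit(0)  # 0 = success, nothing to report back to Claude Code
```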
The knowledge system has three layers: capture, storage, and retrieval. Everything runs locally, per-project, with no external services.
Every session automatically captures knowledge without any commands:
| Source | What's Captured | Entry Type |
|---|---|---|
| Git commits | Commit message + SHA + branch | commit |
| Test failures | Failure output + file path | finding |
| Build errors | Error message + context | warning |
| Package installs | Package name + version | discovery |
| Doc URLs | URL content evaluated by LLM | discovery |
| Session compaction | Session changelog via Claude Haiku | session |
When context gets compacted, the system automatically generates a structured session changelog:
```
Transcript JSONL (thousands of lines)
        |
        v
[1] Delta extraction -----> Only lines since last compaction
        |                   (uses compact_boundary markers)
        v
[2] Noise filtering ------> Drop tool-only turns, filler text,
        |                   system tags, slash command defs
        v
[3] Clean dialogue -------> USER: messages + CLAUDE: text responses
        |                   (~20% signal ratio from raw transcript)
        v
[4] Enrich with context --> + git log (18h window, with commit bodies)
        |                   + KB entries captured today
        v
[5] Claude Haiku ---------> Structured markdown:
        |                   Accomplished / Decisions / Discovered / Carry Forward
        v
[6] Persist --------------> .serena/memories/kln-session-YYYY-MM-DD.md
                            + searchable KB entry (type: session)
```
Multiple compactions per day append to the same log file (separated by ---). Each compaction only processes the conversation delta since the last one, so there's no overlap between entries.
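A rough sketch of step [1], the delta extraction. The `compact_boundary` marker name comes from the diagram above; the JSONL field layout is an assumption.

```python
# Sketch: keep only transcript turns after the last compaction boundary.
# The subtype field layout is assumed; only the compact_boundary marker
# name comes from the pipeline description above.
import json

def delta_since_last_compaction(transcript_path: str) -> list[dict]:
    turns, last_boundary = [], -1
    with open(transcript_path) as f:
        for i, line in enumerate(f):
            turn = json.loads(line)
            turns.append(turn)
            if turn.get("subtype") == "compact_boundary":
                last_boundary = i
    return turns[last_boundary + 1:]  # no boundary yet -> whole transcript
```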
Queries go through a 5-stage pipeline for high-quality results:
```
Query --> Dense embeddings (BGE-small) --> Sparse matching (BM42)
              |                                   |
              v                                   v
         Dense scores                       Sparse scores
              |                                   |
              +---------> RRF Fusion <------------+
                              |
                              v
              Post-RRF filtering (date, branch, type)
                              |
                              v
              Cross-encoder reranking (MiniLM)
                              |
                              v
                   Final ranked results
```
All models run locally via ONNX (fastembed). No API calls, no cloud. Queries return in <100ms via a TCP server that stays warm between searches.
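RRF fusion itself is only a few lines: each entry's fused score is the sum of 1/(k + rank) over the ranked lists it appears in, so items ranked well by both dense and sparse search rise to the top. A sketch (k=60 is the conventional constant from the RRF paper, not necessarily K-LEAN's value):

```python
# Sketch: Reciprocal Rank Fusion over dense and sparse result lists.
# score(d) = sum over lists of 1 / (k + rank_in_list(d)); k=60 is conventional.
def rrf(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

dense = ["e42", "e17", "e03"]   # entry ids ranked by dense similarity
sparse = ["e17", "e42", "e99"]  # entry ids ranked by BM42 score
print(rrf([dense, sparse]))     # e42 and e17 dominate: both lists agree
```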
At session start, the system injects context from previous sessions:
```
[SESSION] Last: Fix JWT race condition (abc1234) | Next: Integration tests
[!] WARNINGS (2): "SQL injection in login" | "Deprecated API usage"
[KB] PINNED: <high-priority entries>
[KB] RECENT: <latest findings, solutions, patterns>
```
This means every new session starts with awareness of what happened before -- carry-forward items, active warnings, and recent discoveries. The Knowledge DB acts as long-term memory that persists across sessions, compactions, and context limits.
Storage: Per-project `.knowledge-db/` directory with `entries.jsonl` (append-only), dense/sparse index files, and a TCP server for fast queries. Schema V3.1 supports 9 entry types with date, branch, and type filtering.
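As a sketch of what querying that append-only log with V3.1-style filters might look like (the field names here are illustrative, not the actual schema):

```python
# Sketch: scan entries.jsonl with type/branch/date filters.
# Field names (type, branch, ts) are illustrative, not the real schema.
import json, pathlib
from datetime import date

def find(db=".knowledge-db/entries.jsonl", type_=None, branch=None, since=None):
    for line in pathlib.Path(db).read_text().splitlines():
        e = json.loads(line)
        if type_ and e.get("type") != type_:
            continue
        if branch and e.get("branch") != branch:
            continue
        if since and date.fromisoformat(e.get("ts", "1970-01-01")[:10]) < since:
            continue
        yield e

for e in find(type_="decision", since=date(2026, 2, 3)):
    print(e["ts"], e["text"])
```

In practice the warm TCP server and the hybrid ranking pipeline sit in front of this file; the scan only shows the storage contract.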
```
[opus 4.5] │ claudeAgentic │ git:(main●) +27-23 │ llm:16 kb:42
```
Model. Project. Branch (● = dirty). Lines changed. Models ready. KB entry count.
How: Custom statusline polling LiteLLM and Knowledge DB via TCP on each prompt.
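A sketch of one such poll as a short-lived TCP request; the port and the newline-delimited JSON protocol are hypothetical, and only "TCP query per prompt" comes from the description above.

```python
# Sketch: poll the knowledge server for a statusline field.
# HYPOTHETICAL: port 7777 and the JSON-lines protocol are invented here.
import json, socket

def kb_count(host="127.0.0.1", port=7777, timeout=0.2) -> int:
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b'{"op": "stats"}\n')
            reply = s.makefile().readline()
        return json.loads(reply).get("entries", 0)
    except OSError:
        return 0  # server down: statusline degrades instead of failing

print(f"kb:{kb_count()}")
```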
| Command | Description | Time |
|---|---|---|
| `/kln:quick <focus>` | Single model review | ~30s |
| `/kln:multi <focus>` | 3-5 model consensus | ~60s |
| `/kln:agent <role>` | Specialist agent with tools | ~2min |
| `/kln:rethink` | Contrarian debugging | ~20s |
| `/kln:find <query>` | Search knowledge DB | ~5s |
| `/kln:learn` | Capture insights from context | ~10s |
| `/kln:remember` | End-of-session knowledge capture | ~20s |
| `/kln:doc <title>` | Generate session docs | ~30s |
| `/kln:status` | System health check | ~2s |
| `/kln:help` | Command reference | instant |
Flags: `--async` (background), `--models N` (count), `--output json|text`
```bash
# Setup (unified)
kln init                 # Initialize: install + configure provider (NanoGPT, OpenRouter, skip)

# Installation & Management
kln install              # Install to ~/.claude/
kln uninstall            # Remove components
kln status               # Show component status

# Services
kln start                # Start LiteLLM proxy
kln stop                 # Stop all services

# Diagnostics
kln doctor               # Check configuration
kln doctor -f            # Auto-fix issues

# Model Management (subgroup)
kln model list           # List available models
kln model list --health  # Check model health
kln model add            # Add individual model
kln model remove         # Remove model
kln model test           # Test a specific model

# Provider Management (subgroup)
kln provider list        # Show configured providers
kln provider add         # Add provider with recommended models
kln provider set-key     # Update API key
kln provider remove      # Remove provider

# Review
kln multi                # Run multi-agent orchestrated review
```

| Requirement | Version | Notes |
|---|---|---|
| Python | 3.9+ | `python3 --version` |
| Claude Code | 2.0+ | `claude --version` |
| pipx | any | `pipx --version` |
| API Key | - | NanoGPT or OpenRouter (optional for knowledge-only) |
K-LEAN comes with curated model sets for each provider—no manual configuration needed.
NanoGPT — Subscription access to top-tier models.
10 models pre-configured:
| Model | Provider | Specialty |
|---|---|---|
| `deepseek-r1` | DeepSeek | Reasoning, code review |
| `deepseek-v3.2` | DeepSeek | Fast general purpose |
| `qwen3-coder` | Alibaba | Code-focused |
| `glm-4.7` | Zhipu | Multilingual |
| `kimi-k2` | Moonshot | Long context |
| `llama-4-maverick` | Meta | Creative |
| `llama-4-scout` | Meta | Analytical |
| `mimo-v2-flash` | Xiaomi | Fast inference |
| `gpt-oss-120b` | OpenAI-OSS | Large capacity |
| `devstral-2-123b` | Mistral | Code generation |
+4 thinking models (auto-configured): `deepseek-v3.2-thinking`, `glm-4.7-thinking`, `kimi-k2-thinking`, `deepseek-r1-thinking`
OpenRouter — Unified API for multiple providers.
6 models pre-configured:
| Model | Provider | Specialty |
|---|---|---|
| `gemini-3-flash` | Google | Fast, multimodal |
| `gemini-2.5-flash` | Google | Balanced |
| `gpt-5-mini` | OpenAI | Efficient |
| `gpt-5.1-codex-mini` | OpenAI | Code-focused |
| `qwen3-coder-plus` | Alibaba | Enhanced coding |
| `deepseek-v3.2` | DeepSeek | Reasoning |
For a complete coding experience:
| Tool | Integration |
|---|---|
| SuperClaude | Use /sc:* and /kln:* together |
| Serena MCP | Shared memory, code understanding |
| Context7 MCP | Documentation lookup |
| Tavily MCP | Web search for research |
| Sequential Thinking MCP | Step-by-step reasoning for complex problems |
Telemetry: Install Phoenix to watch agent steps and reviews at localhost:6006.
| Document | Description |
|---|---|
| Installation | Detailed setup guide |
| Usage | Commands, workflows, examples |
| Reference | Complete config reference |
| Architecture | System design |
```bash
git clone https://github.com/calinfaja/K-LEAN.git
cd k-lean
pipx install -e .
kln install --dev
kln admin test
```

See CONTRIBUTING.md for guidelines.
Apache 2.0 — See LICENSE
Get second opinions. Ship with confidence.
