
Cross-platform toolkit to enhance Claude Code with multi-LLM consensus, 8 specialist agents, semantic knowledge search, and one-command install.


K-LEAN

Second opinions from multiple LLMs—right inside Claude Code


Why K-LEAN?

Need a second opinion on your code? Want validation before merging? Looking for domain expertise your model doesn't have? Stuck in a loop and need fresh eyes to break out?

One model's confidence isn't proof. K-LEAN brings in OpenAI, Gemini, DeepSeek, Moonshot, Minimax, and more—when multiple models agree, you ship with confidence.

  • 9 slash commands — /kln:quick, /kln:multi, /kln:agent, /kln:rethink...
  • 8 specialist agents — Security, Rust, embedded C, ARM Cortex, performance
  • 5 smart hooks — Service auto-start, keyword handling, git tracking, web capture, session log
  • Persistent knowledge — Insights that survive across sessions

Access any model via NanoGPT or OpenRouter, directly from Claude Code.

Works on Windows, Linux, and macOS — native cross-platform support, no shell scripts required.


Quick Start

1. Get an API Key (required)

Choose one provider and get your API key:

  • NanoGPT — Subscription access to DeepSeek, Qwen, GLM, Kimi
  • OpenRouter — Unified access to GPT, Gemini, Claude

2. Install

Linux / macOS:

# Install pipx if you don't have it
python3 -m pip install --user pipx
python3 -m pipx ensurepath

# Install K-LEAN
pipx install kln-ai

Windows (PowerShell):

# Install pipx if you don't have it
python -m pip install --user pipx
python -m pipx ensurepath

# Restart PowerShell, then install K-LEAN
pipx install kln-ai

3. Setup

kln init                  # Select provider, enter API key
kln start                 # Start LiteLLM proxy
kln status                # Verify everything works

Or non-interactive:

kln init --provider nanogpt --api-key $NANOGPT_API_KEY
kln start

Knowledge-Only Install (no API key needed)

If you only want the persistent knowledge system without multi-model reviews:

pipx install kln-ai
kln init --provider skip

This installs the knowledge database, session hooks, and slash commands (/kln:learn, /kln:remember, /kln:find). No LiteLLM proxy or API keys required. Add a provider later with kln init --provider nanogpt --api-key $KEY.

4. Use in Claude Code

/kln:quick "security"          # Fast review (~30s)
/kln:multi "error handling"    # 3-5 model consensus (~60s)
/kln:agent security-auditor    # Specialist agent (~2min)

Optional: Add More Models

kln model add --provider openrouter "anthropic/claude-3.5-sonnet"
kln model remove "claude-3-sonnet"
kln start  # Restart to apply changes
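Under the hood, kln start launches a LiteLLM proxy that speaks the OpenAI chat-completions API, so any HTTP client can query the configured models. A minimal sketch, assuming LiteLLM's default port 4000 (the actual address may differ; verify with kln status):

```python
import json
from urllib import request

def review_request(model, prompt, base_url="http://localhost:4000"):
    """Build an OpenAI-style chat-completions request for the local proxy.

    Port 4000 is LiteLLM's default, assumed here rather than confirmed
    from K-LEAN's own config.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = review_request("deepseek-r1", "Review this diff for security issues")
# urllib.request.urlopen(req) would send it; omitted here to stay offline.
```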

See It In Action

$ /kln:multi "review authentication flow"

GRADE: B+ | RISK: MEDIUM

HIGH CONFIDENCE (4/5 models agree):
  - auth.py:42 - SQL injection risk in user query
  - session.py:89 - Missing token expiration check

MEDIUM CONFIDENCE (2/5 models agree):
  - login.py:15 - Consider rate limiting

What You Get

1. Second Opinions on Demand

Three ways to get external perspectives—pick based on speed vs depth:

| Command | What Happens | Time |
|---|---|---|
| /kln:quick | 1 model reviews code you provide | ~30s |
| /kln:multi | 3-5 models vote on same code | ~60s |
| /kln:rethink | Contrarian techniques when you're stuck | ~20s |

/kln:quick — You gather the code (git diff, file content), one model reviews it fast.

/kln:quick "security review"
# Grade: B+ | Risk: MEDIUM | 3 findings

/kln:multi — Same code goes to 5 models in parallel. When 4/5 agree, it's real.

/kln:multi "check error handling"
# 4/5 AGREE: Missing null check at line 42

/kln:rethink — Stuck debugging 10+ minutes? Get contrarian ideas: inversion, assumption challenge, domain shift.

/kln:rethink
# "What if the bug isn't in the parser—what if the input is already corrupt?"

How: LiteLLM proxy routes to multiple providers (NanoGPT, OpenRouter). Dynamic model discovery, parallel async execution, response aggregation with consensus scoring.
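The consensus-scoring step can be sketched roughly like this; the tier thresholds, data shapes, and function name are illustrative assumptions, not K-LEAN's actual implementation:

```python
from collections import Counter

def consensus(findings_per_model, total_models, high=0.8, medium=0.4):
    """Group findings by how many models reported them.

    findings_per_model: one set of finding strings per model.
    A finding's tier is the fraction of models that flagged it.
    """
    votes = Counter(f for findings in findings_per_model for f in findings)
    tiers = {"HIGH": [], "MEDIUM": [], "LOW": []}
    for finding, count in votes.items():
        ratio = count / total_models
        if ratio >= high:
            tiers["HIGH"].append((finding, count))
        elif ratio >= medium:
            tiers["MEDIUM"].append((finding, count))
        else:
            tiers["LOW"].append((finding, count))
    return tiers

reviews = [
    {"SQL injection at auth.py:42", "missing token expiry"},
    {"SQL injection at auth.py:42"},
    {"SQL injection at auth.py:42", "missing token expiry"},
    {"SQL injection at auth.py:42", "rate limiting"},
    {"SQL injection at auth.py:42", "missing token expiry"},
]
tiers = consensus(reviews, total_models=5)
# 5/5 agree on the SQL injection -> HIGH; 3/5 on token expiry -> MEDIUM
```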


2. Knowledge That Sticks

Your insights survive sessions. Capture mid-session or end-of-session:

/kln:learn — Extract learnings NOW, while context is fresh.

/kln:learn "JWT issue"
# Found 3 learnings → Saved to Knowledge DB

/kln:remember — End of session. Reviews git diff, extracts warnings/patterns/decisions, syncs to Serena MCP.

/kln:remember
# Saved 6 entries (2 warnings, 2 patterns, 1 solution, 1 decision)
# Synced to Serena kln-lessons-learned

/kln:find — Search anytime. Supports date, branch, and type filters.

/kln:find "JWT validation"
/kln:find auth since:2026-02-01
/kln:find type:decision since:2026-02-03
/kln:find auth branch:feature/auth

How: Per-project knowledge database with hybrid search—dense embeddings (BGE-small via fastembed) + sparse matching (BM42) + RRF fusion + cross-encoder reranking. Runs locally via ONNX, <100ms queries. V3.1 schema adds temporal filtering by date, branch, and entry type. Learnings are also auto-extracted on /compact via PreCompact hook.

No API key? Knowledge DB works fully offline. You can still use /kln:learn, /kln:remember, and /kln:find without NanoGPT or OpenRouter—embeddings run locally on your machine.


3. Agents That Explore

Unlike models that review what you give them, agents read your codebase themselves.

8 specialists with tools: read_file, grep, search_files, knowledge_search, get_complexity.

| Agent | Expertise |
|---|---|
| code-reviewer | OWASP Top 10, SOLID, code quality |
| security-auditor | Vulnerabilities, auth, crypto |
| debugger | Root cause analysis |
| performance-engineer | Profiling, optimization |
| rust-expert | Ownership, lifetimes, unsafe |
| c-pro | C99/C11, POSIX, memory |
| arm-cortex-expert | Embedded ARM, real-time |
| orchestrator | Multi-agent coordination |

/kln:agent security-auditor "audit payment module"
# Agent greps for payment → reads 3 files → finds 2 vulnerabilities

/kln:agent rust-expert --model qwen3-coder "review unsafe blocks"
# Want a specific LLM? Use --model to pick your expert

--parallel — Need multiple perspectives? Run 3 specialists at once:

/kln:agent --parallel "review auth system"
# code-reviewer + security-auditor + performance-engineer → unified report

How: Built on smolagents with LiteLLM integration. Multi-step reasoning, tool use, and memory persistence.


4. Hooks That Work in Background

5 hooks run automatically—you don't call them:

| Hook | Trigger | What It Does |
|---|---|---|
| session-start | Claude Code opens | Starts LiteLLM + Knowledge Server |
| user-prompt | You type InitKB | Initializes the project knowledge DB |
| post-bash | After bash commands | Captures git commits, test failures, build errors, package installs |
| post-web | After WebFetch | Captures doc URLs as discoveries |
| pre-compact | Context compaction | Session log via Haiku + auto-extracts learnings to KB |
How: Claude Code hook system with pattern matching. Services auto-start on session begin. Git commits, test failures, build errors, and doc URLs auto-captured to KB with timestamp and branch metadata. On compaction, Haiku extracts 0-5 atomic learnings from the full conversation.
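A post-bash hook of this kind boils down to pattern matching over command output. A hypothetical sketch (the regexes and entry-type names below are assumptions for illustration, not K-LEAN's actual patterns):

```python
import re

# Ordered (pattern, entry_type) pairs; first match wins.
PATTERNS = [
    (re.compile(r"^\[.*\] .+", re.M), "commit"),              # git commit summary
    (re.compile(r"FAILED|AssertionError"), "finding"),        # test failure
    (re.compile(r"error:|undefined reference"), "warning"),   # build error
    (re.compile(r"Successfully installed \S+"), "discovery"), # pip install
]

def classify(output):
    """Return a KB entry type for a command's output, or None if no match."""
    for pattern, entry_type in PATTERNS:
        if pattern.search(output):
            return entry_type
    return None
```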


5. Knowledge System Architecture

The knowledge system has three layers: capture, storage, and retrieval. Everything runs locally, per-project, with no external services.

Auto-Capture (always running)

Every session automatically captures knowledge without any commands:

| Source | What's Captured | Entry Type |
|---|---|---|
| Git commits | Commit message + SHA + branch | commit |
| Test failures | Failure output + file path | finding |
| Build errors | Error message + context | warning |
| Package installs | Package name + version | discovery |
| Doc URLs | URL content evaluated by LLM | discovery |
| Session compaction | Session changelog via Claude Haiku | session |

Session Log Pipeline (PreCompact hook)

When context gets compacted, the system automatically generates a structured session changelog:

Transcript JSONL (thousands of lines)
    |
    v
[1] Delta extraction -----> Only lines since last compaction
    |                        (uses compact_boundary markers)
    v
[2] Noise filtering ------> Drop tool-only turns, filler text,
    |                        system tags, slash command defs
    v
[3] Clean dialogue -------> USER: messages + CLAUDE: text responses
    |                        (~20% signal ratio from raw transcript)
    v
[4] Enrich with context --> + git log (18h window, with commit bodies)
    |                       + KB entries captured today
    v
[5] Claude Haiku ---------> Structured markdown:
    |                        Accomplished / Decisions / Discovered / Carry Forward
    v
[6] Persist --------------> .serena/memories/kln-session-YYYY-MM-DD.md
                             + searchable KB entry (type: session)

Multiple compactions per day append to the same log file (separated by ---). Each compaction only processes the conversation delta since the last one, so there's no overlap between entries.
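Step [1], the delta extraction, can be sketched as follows; the exact JSONL record shape and the compact_boundary marker format are assumptions based on the description above:

```python
import json

def delta_since_last_compaction(jsonl_lines):
    """Keep only transcript records written after the last boundary marker.

    Repeated compactions therefore never reprocess the same turns.
    """
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    last_boundary = -1
    for i, rec in enumerate(records):
        if rec.get("type") == "compact_boundary":
            last_boundary = i
    return records[last_boundary + 1:]

transcript = [
    '{"type": "user", "text": "fix the JWT bug"}',
    '{"type": "compact_boundary"}',
    '{"type": "user", "text": "now add tests"}',
    '{"type": "assistant", "text": "Added tests in test_jwt.py"}',
]
delta = delta_since_last_compaction(transcript)
# Only the two records after the boundary survive.
```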

Hybrid Search

Queries go through a 5-stage pipeline for high-quality results:

Query --> Dense embeddings (BGE-small) --> Sparse matching (BM42)
              |                                |
              v                                v
         Dense scores                    Sparse scores
              |                                |
              +--> RRF Fusion <----------------+
                       |
                       v
              Post-RRF filtering (date, branch, type)
                       |
                       v
              Cross-encoder reranking (MiniLM)
                       |
                       v
              Final ranked results

All models run locally via ONNX (fastembed). No API calls, no cloud. Queries return in <100ms via a TCP server that stays warm between searches.
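The RRF fusion stage combines the dense and sparse rankings without needing their scores to be comparable. A minimal sketch, using k=60 from the original RRF formulation (K-LEAN's constant is not documented here):

```python
def rrf_fuse(dense_ranking, sparse_ranking, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1/(k + rank).

    Each ranking is a list of doc ids, best first. Documents ranked
    well in either list float to the top of the fused result.
    """
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["jwt-fix", "sql-warning", "rate-limit"]   # embedding similarity order
sparse = ["jwt-fix", "api-notes", "sql-warning"]   # BM42 keyword order
fused = rrf_fuse(dense, sparse)
```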

Cross-Session Continuity

At session start, the system injects context from previous sessions:

[SESSION] Last: Fix JWT race condition (abc1234) | Next: Integration tests
[!] WARNINGS (2): "SQL injection in login" | "Deprecated API usage"
[KB] PINNED: <high-priority entries>
[KB] RECENT: <latest findings, solutions, patterns>

This means every new session starts with awareness of what happened before -- carry-forward items, active warnings, and recent discoveries. The Knowledge DB acts as long-term memory that persists across sessions, compactions, and context limits.

Storage: Per-project .knowledge-db/ directory with entries.jsonl (append-only), dense/sparse index files, and a TCP server for fast queries. Schema V3.1 supports 9 entry types with date, branch, and type filtering.
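An append-only JSONL store with branch and type filtering can be sketched like this; the field names here are illustrative assumptions, not the actual V3.1 schema:

```python
import json
import os
import tempfile
from datetime import date

def append_entry(path, text, entry_type, branch):
    """Append one entry; the file is never rewritten, only appended to."""
    entry = {"text": text, "type": entry_type,
             "branch": branch, "date": date.today().isoformat()}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

def find(path, entry_type=None, branch=None):
    """Linear scan with optional type/branch filters (no index, unlike K-LEAN)."""
    results = []
    with open(path) as f:
        for line in f:
            e = json.loads(line)
            if entry_type and e["type"] != entry_type:
                continue
            if branch and e["branch"] != branch:
                continue
            results.append(e)
    return results

db = os.path.join(tempfile.mkdtemp(), "entries.jsonl")
append_entry(db, "SQL injection in login", "warning", "main")
append_entry(db, "Use PBKDF2 for hashing", "decision", "feature/auth")
warnings = find(db, entry_type="warning")
```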


6. Status Line

[opus 4.5] │ claudeAgentic │ git:(main●) +27-23 │ llm:16 kb:42

Left to right: model, project, branch (● = dirty), lines changed, models ready, KB entry count.

How: Custom statusline polling LiteLLM and Knowledge DB via TCP on each prompt.


All Commands

| Command | Description | Time |
|---|---|---|
| /kln:quick <focus> | Single model review | ~30s |
| /kln:multi <focus> | 3-5 model consensus | ~60s |
| /kln:agent <role> | Specialist agent with tools | ~2min |
| /kln:rethink | Contrarian debugging | ~20s |
| /kln:find <query> | Search knowledge DB | ~5s |
| /kln:learn | Capture insights from context | ~10s |
| /kln:remember | End-of-session knowledge capture | ~20s |
| /kln:doc <title> | Generate session docs | ~30s |
| /kln:status | System health check | ~2s |
| /kln:help | Command reference | instant |

Flags: --async (background), --models N (count), --output json|text


CLI Reference

# Setup (unified)
kln init             # Initialize: install + configure provider (NanoGPT, OpenRouter, skip)

# Installation & Management
kln install          # Install to ~/.claude/
kln uninstall        # Remove components
kln status           # Show component status

# Services
kln start            # Start LiteLLM proxy
kln stop             # Stop all services

# Diagnostics
kln doctor           # Check configuration
kln doctor -f        # Auto-fix issues

# Model Management (subgroup)
kln model list       # List available models
kln model list --health  # Check model health
kln model add        # Add individual model
kln model remove     # Remove model
kln model test       # Test a specific model

# Provider Management (subgroup)
kln provider list    # Show configured providers
kln provider add     # Add provider with recommended models
kln provider set-key # Update API key
kln provider remove  # Remove provider

# Review
kln multi            # Run multi-agent orchestrated review

Requirements

| Requirement | Version | Notes |
|---|---|---|
| Python | 3.9+ | python3 --version |
| Claude Code | 2.0+ | claude --version |
| pipx | any | pipx --version |
| API Key | - | NanoGPT or OpenRouter (optional for knowledge-only) |

Recommended Providers

K-LEAN comes with curated model sets for each provider—no manual configuration needed.

NanoGPT

NanoGPT — Subscription access to top-tier models.

10 models pre-configured:

| Model | Provider | Specialty |
|---|---|---|
| deepseek-r1 | DeepSeek | Reasoning, code review |
| deepseek-v3.2 | DeepSeek | Fast general purpose |
| qwen3-coder | Alibaba | Code-focused |
| glm-4.7 | Zhipu | Multilingual |
| kimi-k2 | Moonshot | Long context |
| llama-4-maverick | Meta | Creative |
| llama-4-scout | Meta | Analytical |
| mimo-v2-flash | Xiaomi | Fast inference |
| gpt-oss-120b | OpenAI-OSS | Large capacity |
| devstral-2-123b | Mistral | Code generation |

+4 thinking models (auto-configured): deepseek-v3.2-thinking, glm-4.7-thinking, kimi-k2-thinking, deepseek-r1-thinking

OpenRouter

OpenRouter — Unified API for multiple providers.

6 models pre-configured:

| Model | Provider | Specialty |
|---|---|---|
| gemini-3-flash | Google | Fast, multimodal |
| gemini-2.5-flash | Google | Balanced |
| gpt-5-mini | OpenAI | Efficient |
| gpt-5.1-codex-mini | OpenAI | Code-focused |
| qwen3-coder-plus | Alibaba | Enhanced coding |
| deepseek-v3.2 | DeepSeek | Reasoning |

Recommended Add-ons

For a complete coding experience:

| Tool | Integration |
|---|---|
| SuperClaude | Use /sc:* and /kln:* together |
| Serena MCP | Shared memory, code understanding |
| Context7 MCP | Documentation lookup |
| Tavily MCP | Web search for research |
| Sequential Thinking MCP | Step-by-step reasoning for complex problems |

Telemetry: Install Phoenix to watch agent steps and reviews at localhost:6006.


Documentation

| Document | Description |
|---|---|
| Installation | Detailed setup guide |
| Usage | Commands, workflows, examples |
| Reference | Complete config reference |
| Architecture | System design |

Contributing

git clone https://github.com/calinfaja/K-LEAN.git
cd K-LEAN
pipx install -e .
kln install --dev
kln admin test

See CONTRIBUTING.md for guidelines.


License

Apache 2.0 — See LICENSE


Get second opinions. Ship with confidence.
