name: context-engine
description: Context management engine for AI coding agents. Handles context window optimization, persistent memory across sessions, context retrieval strategies, token budget allocation, and knowledge graph construction from codebases. Use when building agent memory systems, optimizing context windows, designing RAG pipelines for code, or managing multi-session agent state.
license: MIT + Commons Clause
metadata:
  version: 1.0.0
  author: borghei
  category: engineering
  domain: ai-agents
  tier: POWERFUL
  updated: 2026-03-09
  frameworks: context-window-optimization, memory-architecture, knowledge-graphs

Context Engine - AI Agent Context Management

Tier: POWERFUL | Category: Engineering | Tags: context management, AI agents, memory systems, RAG, token optimization, knowledge graphs

Overview

Context Engine provides production-grade patterns for managing what AI agents know, remember, and retrieve. It covers the full lifecycle: ingestion of project knowledge, optimal packing of context windows, persistent memory across sessions, and retrieval-augmented generation for large codebases. Context management is often what separates a useful agent from a hallucinating one.

Core Capabilities

1. Context Window Architecture

Every AI agent operates within a finite context window. Mismanaging it is a leading cause of degraded agent performance.

Token Budget Allocation Framework

Segment	Budget %	Purpose	Priority
System Instructions	5-10%	Agent identity, rules, constraints	Fixed (always loaded)
Task Context	20-30%	Current task description, requirements	High (per-request)
Relevant Code	25-40%	Source files, dependencies, types	Dynamic (retrieved)
Conversation History	10-20%	Prior turns, decisions made	Sliding window
Tool Results	5-15%	Command output, search results	Ephemeral
Reserved Buffer	5-10%	Output generation headroom	Protected
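
The table above can be turned into concrete token counts. The following is a minimal sketch, assuming one fixed share per segment; the specific shares in `SEGMENT_SHARES` and the function name `allocate_budget` are illustrative choices picked from inside each range, not values prescribed by this skill.

```python
# Hypothetical shares, one per segment, each inside the range given in the table.
# They are chosen to sum to 1.0; tune them for your own workload.
SEGMENT_SHARES = {
    "system_instructions": 0.08,
    "task_context": 0.25,
    "relevant_code": 0.32,
    "conversation_history": 0.15,
    "tool_results": 0.10,
    "reserved_buffer": 0.10,
}


def allocate_budget(window_tokens: int, shares: dict = SEGMENT_SHARES) -> dict:
    """Split a context window into per-segment token budgets."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("segment shares must sum to 1.0")
    budget = {name: int(window_tokens * share) for name, share in shares.items()}
    # Rounding leftovers go to the protected output buffer.
    budget["reserved_buffer"] += window_tokens - sum(budget.values())
    return budget
```

Calling `allocate_budget(200_000)` returns a dict whose values add up to exactly the window size, so no segment can silently borrow from the output buffer.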

Context Packing Strategies

Greedy Relevance Packing

1. Score all candidate context by relevance to current task
2. Sort by score descending
3. Pack until budget exhausted
4. Always reserve output buffer

Pros: Simple, fast, works well for focused tasks
Cons: Misses cross-cutting context, no diversity
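
The four steps above can be sketched as follows. This assumes relevance scores and token counts are already computed elsewhere; the `Candidate` type and `greedy_pack` name are hypothetical. One design choice to note: this version skips an item that does not fit and keeps trying smaller ones, rather than stopping at the first miss.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    tokens: int
    score: float  # relevance to the current task, higher is better


def greedy_pack(candidates: list, window_tokens: int, output_buffer: int) -> list:
    """Pack the highest-scoring candidates that fit, after reserving the output buffer."""
    remaining = window_tokens - output_buffer  # step 4: reserve output headroom first
    packed = []
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):  # steps 1-2
        if cand.tokens <= remaining:  # step 3: take it only if it still fits
            packed.append(cand)
            remaining -= cand.tokens
    return packed
```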

Tiered Loading

Tier 0 (always loaded): System prompt, project rules, active file
Tier 1 (task-specific):  Related files, type definitions, tests
Tier 2 (on-demand):      Documentation, examples, history
Tier 3 (retrieved):      Search results, RAG chunks

Pros: Predictable, debuggable, respects fixed costs
Cons: Requires upfront tier classification
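
A minimal sketch of tiered loading, assuming each item has already been classified into a tier and has a known token count. The function name `load_tiers` and the rule that an oversized Tier 0 item raises an error are assumptions made for illustration.

```python
def load_tiers(items_by_tier: dict, budget: int) -> list:
    """Load items tier by tier (0 first) until the token budget runs out.

    items_by_tier maps a tier number to a list of (name, tokens) pairs.
    """
    loaded = []
    for tier in sorted(items_by_tier):
        for name, tokens in items_by_tier[tier]:
            if tokens > budget:
                if tier == 0:
                    # Tier 0 is a fixed cost: fail loudly rather than drop it.
                    raise ValueError(f"tier 0 item {name!r} does not fit")
                continue  # higher tiers are optional, so skip what does not fit
            loaded.append(name)
            budget -= tokens
    return loaded
```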

Adaptive Compression

1. Load full context for first pass
2. Identify low-signal sections (boilerplate, repetitive code)
3. Summarize or truncate low-signal sections
4. Re-pack with compressed context
5. Preserve high-signal sections verbatim

Pros: Maximizes information density
Cons: Risk of losing important details in compression
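
As a rough illustration of steps 2 to 5, the sketch below truncates low-signal sections and keeps high-signal ones verbatim. It assumes a signal score between 0 and 1 is supplied per section; how that score is produced, the `threshold` default, and the `compress_sections` name are all hypothetical. Truncation stands in for summarization here to keep the example self-contained.

```python
def compress_sections(sections: list, threshold: float = 0.5, keep_lines: int = 3) -> list:
    """Truncate low-signal sections; keep high-signal sections verbatim.

    sections is a list of (text, signal) pairs, where signal is in [0, 1].
    """
    compressed = []
    for text, signal in sections:
        lines = text.splitlines()
        if signal >= threshold or len(lines) <= keep_lines:
            compressed.append(text)  # high-signal or already short: leave untouched
            continue
        omitted = len(lines) - keep_lines
        marker = f"[... {omitted} low-signal lines omitted ...]"
        compressed.append("\n".join(lines[:keep_lines] + [marker]))
    return compressed
```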

2. Memory Architecture

Three-Layer Memory Model

┌─────────────────────────────────────────────────┐
│  Layer 1: Working Memory (Context Window)        │
│  Scope: Current conversation/task                │
│  Lifetime: Single session                        │
│  Storage: In-context tokens                      │
│  Update: Every turn                              │
├─────────────────────────────────────────────────┤
│  Layer 2: Session Memory (Persistent Store)      │
│  Scope: Project-level learnings                  │
│  Lifetime: Across sessions                       │
│  Storage: MEMORY.md, .claude/rules/, CLAUDE.md   │
│  Update: End of session or on discovery          │
├─────────────────────────────────────────────────┤
│  Layer 3: Knowledge Base (Indexed Corpus)        │
│  Scope: Full codebase + documentation            │
│  Lifetime: Persistent, versioned                 │
│  Storage: Vector store, graph DB, file index     │
│  Update: On commit / scheduled reindex           │
└─────────────────────────────────────────────────┘

Memory Promotion Protocol

Knowledge flows upward through layers based on recurrence and value:

Signal	Action	Example
Pattern seen 1x	Working memory only	"This file uses tabs"
Pattern seen 2-3x	Candidate for session memory	"Project uses pnpm everywhere"
Pattern confirmed across sessions	Promote to CLAUDE.md/rules	"Always use pnpm, never npm"
Pattern is domain knowledge	Add to knowledge base	"Auth flow uses JWT + refresh tokens"
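
The promotion table can be expressed as a small decision function. This is a sketch under stated assumptions: "confirmed across sessions" is interpreted as seen in at least two sessions, and the returned labels and the name `promotion_target` are illustrative.

```python
def promotion_target(times_seen: int, sessions_confirmed: int, is_domain_knowledge: bool) -> str:
    """Decide where a learned pattern should live, following the table above."""
    if is_domain_knowledge:
        return "knowledge_base"
    if sessions_confirmed >= 2:  # assumption: two or more sessions counts as confirmed
        return "rules"  # CLAUDE.md or .claude/rules/
    if times_seen >= 2:
        return "session_memory_candidate"
    return "working_memory"
```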

Staleness Detection

Context has a shelf life. Stale context causes hallucinations.

Freshness Score = f(last_verified, change_frequency, confidence)

Fresh   (< 7 days, file unchanged):  Use directly
Aging   (7-30 days, or file changed): Re-verify before using
Stale   (> 30 days):                 Flag, re-retrieve, or discard
Unknown (never verified):            Treat as low-confidence
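
One possible way to implement the thresholds above is shown below. It covers only the age and file-change inputs, not the `change_frequency` or `confidence` terms, and it assumes that a recently verified item whose file has since changed should be treated as aging. The function name `freshness` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def freshness(last_verified: Optional[datetime], file_changed: bool, now: datetime) -> str:
    """Classify a context item as fresh, aging, stale, or unknown."""
    if last_verified is None:
        return "unknown"  # never verified: treat as low-confidence
    age = now - last_verified
    if age > timedelta(days=30):
        return "stale"
    if age >= timedelta(days=7) or file_changed:
        return "aging"  # re-verify before using
    return "fresh"
```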

3. Retrieval Strategies for Code

File-Level Retrieval

Best for: navigating to the right file when the agent knows what it needs.

Query: "authentication middleware"
Strategy:
  1. Filename pattern match: *auth*, *middleware*
  2. Import graph: files that import auth modules
  3. Symbol search: exported functions matching auth*
  4. Content search: files containing auth-related patterns
  5. Rank by: recency of edit + import centrality + name match
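
Step 5 can be sketched as a weighted score. The weights, the normalized `recency` and `centrality` inputs (both assumed to be in the range 0 to 1), and the name `rank_files` are assumptions for illustration; filename matching uses the standard library's `fnmatch.fnmatchcase`.

```python
from fnmatch import fnmatchcase


def rank_files(files: list, patterns: list, weights: tuple = (0.3, 0.3, 0.4)) -> list:
    """Rank files by edit recency, import centrality, and filename match.

    files is a list of dicts with 'path', 'recency', and 'centrality' keys.
    """
    w_recency, w_centrality, w_name = weights

    def score(f: dict) -> float:
        path = f["path"].lower()
        name_match = 1.0 if any(fnmatchcase(path, p) for p in patterns) else 0.0
        return w_recency * f["recency"] + w_centrality * f["centrality"] + w_name * name_match

    return sorted(files, key=score, reverse=True)
```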

Chunk-Level Retrieval (RAG for Code)

Best for: finding specific implementations within large files.

Chunking Strategy for Source Code:

- Chunk by function/class boundaries (never mid-function)
- Include the function signature + docstring + body as one chunk
- Attach metadata: file path, language, exports, imports
- Overlap: include 2 lines above/below for context
- Max chunk size: 200 lines (larger functions get sub-chunked by logical block)
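
For Python sources, boundary-aware chunking can be sketched with the standard library's `ast` module. This example handles only top-level functions and classes and attaches basic metadata; the overlap and sub-chunking rules are left out for brevity, and the name `chunk_python_source` is hypothetical.

```python
import ast


def chunk_python_source(source: str, path: str) -> list:
    """Split Python source into one chunk per top-level function or class."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        # Start at the first decorator, if any, so the chunk is complete.
        start = min([node.lineno] + [d.lineno for d in node.decorator_list])
        end = node.end_lineno
        chunks.append({
            "path": path,
            "language": "python",
            "name": node.name,
            "start_line": start,  # 1-indexed, inclusive
            "end_line": end,
            "text": "\n".join(lines[start - 1:end]),
        })
    return chunks
```

Because the signature, docstring, and body all fall within one AST node, each chunk keeps them together as the list above requires.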

Embedding Considerations:

- Code-specific embeddings (CodeBERT, StarCoder embeddings) are reported to outperform general text embeddings by 15-30% on code retrieval tasks; treat the figure as a rough guide and benchmark on your own corpus
- Hybrid search (keyword + semantic) generally outperforms either alone
- Index function signatures separately for fast symbol lookup

Dependency-Aware Retrieval

When retrieving a function, also retrieve:

- Its type definitions (interfaces, types it uses)
- Its direct dependencies (imported functions it calls)
- Its tests (to understand expected behavior)
- Its callers (to understand usage context)

This "context neighborhood" approach prevents the agent from seeing a function in isolation.
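
A minimal sketch of the neighborhood lookup, assuming a prebuilt index that maps each symbol to its related symbols. The index layout, its key names, and the function name `context_neighborhood` are hypothetical.

```python
def context_neighborhood(symbol: str, index: dict) -> list:
    """Return the symbol plus its types, dependencies, tests, and callers.

    index maps a symbol name to a dict with optional 'types', 'dependencies',
    'tests', and 'callers' lists.
    """
    entry = index[symbol]
    ordered = [symbol]
    for relation in ("types", "dependencies", "tests", "callers"):
        ordered.extend(entry.get(relation, []))
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(ordered))
```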

4. Knowledge Graph Construction

Codebase Graph Schema

Nodes:
  - File (path, language, size, last_modified)
  - Function (name, signature, docstring, complexity)
  - Class (name, methods, properties, inheritance)
  - Interface (name, methods)
  - Module (name, exports, dependencies)
  - Test (name, covers, assertions)
  - Config (type, values, affects)

Edges:
  - IMPORTS (File → File)
  - CALLS (Function → Function)
  - IMPLEMENTS (Class → Interface)
  - TESTS (Test → Function)
  - CONFIGURES (Config → Module)
  - DEPENDS_ON (Module → Module)
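
As an illustration of how these edges support impact analysis, the sketch below stores CALLS edges as a caller-to-callees mapping and walks them in reverse to find everything that transitively calls a function. The in-memory representation and the name `impacted_by` are assumptions; a real deployment might use a graph database instead.

```python
from collections import deque


def impacted_by(target: str, calls: dict) -> list:
    """Breadth-first reverse traversal over CALLS edges.

    calls maps a caller to the list of functions it calls. The result lists
    every direct or transitive caller of target, nearest callers first.
    """
    reverse = {}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, []).append(caller)

    seen = {target}
    order = []
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for caller in reverse.get(node, []):
            if caller not in seen:  # also guards against call cycles
                seen.add(caller)
                order.append(caller)
                queue.append(caller)
    return order
```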

Graph Queries for Context

Agent Question	Graph Query	Context Retrieved
"How does auth work?"	Subgraph around auth module, 2 hops	Auth files + dependencies + tests
"What breaks if I change X?"	Reverse dependency traversal from X	All callers + their tests
"What's the API surface?"	All exported functions from API modules	Route handlers + types + middleware
"How is this tested?"	Incoming TESTS edges to the target function	Test files + fixtures + mocks

5. Context Window Optimization Patterns

Pattern: Sliding Window with Anchors

For long conversations, maintain fixed "anchor" messages while sliding recent history.

[System Prompt]           ← Fixed anchor (never evicted)
[Task Definition]         ← Fixed anchor
[Key Decision #1]         ← Pinned (user marked as important)
[Key Decision #2]         ← Pinned
...
[Turn N-4]                ← Sliding window starts here
[Turn N-3]
[Turn N-2]
[Turn N-1]
[Current Turn]
[Output Buffer]           ← Reserved
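
The layout above can be sketched as a function that keeps anchors and pinned messages and slides over the rest. It assumes messages are dicts with optional `anchor` and `pinned` flags, that the last message is the current turn, and that the list is non-empty; the name `build_window` is hypothetical. Output buffer reservation is handled by the token budget, not here.

```python
def build_window(messages: list, window_size: int = 4) -> list:
    """Keep anchors and pinned messages, the last window_size turns, and the current turn."""
    *history, current = messages

    def is_fixed(m: dict) -> bool:
        return bool(m.get("anchor") or m.get("pinned"))

    fixed = [m for m in history if is_fixed(m)]          # never evicted
    turns = [m for m in history if not is_fixed(m)]      # eligible for eviction
    recent = turns[-window_size:] if window_size > 0 else []
    return fixed + recent + [current]
```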

Pattern: Progressive Summarization

When conversation exceeds budget:

1. Summarize oldest turns into a "conversation summary" block
2. Keep the summary as a single anchor message
3. Update summary every N turns
4. Always keep: first system message, task definition, last 5 turns
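
A minimal sketch of the compaction step, assuming the system message and task definition are stored separately as anchors. The `summarize` callable is a placeholder for whatever summarizer you use (typically a model call); its signature and the name `compact_history` are assumptions.

```python
from typing import Callable


def compact_history(
    turns: list,
    summary: str,
    summarize: Callable[[str, list], str],
    keep_last: int = 5,
) -> tuple:
    """Fold all but the last keep_last turns into a single running summary.

    Returns (updated_summary, remaining_turns).
    """
    if len(turns) <= keep_last:
        return summary, turns  # nothing old enough to summarize
    old, recent = turns[:-keep_last], turns[-keep_last:]
    return summarize(summary, old), recent
```

Passing the previous summary back into `summarize` lets the summary block be updated in place rather than accumulating one summary per compaction.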

Pattern: Selective Tool Result Caching

Tool outputs (file reads, search results, command output) are often the largest consumers of tokens.

Strategy:
  - Cache tool results keyed by (tool, args, file_hash)
  - On re-request: serve from cache (0 new tokens)
  - On file change: invalidate cache for that file
  - Always truncate: command output > 200 lines → first 50 + last 50
  - Never cache: error output (always show in full)
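
The strategy above can be sketched with a dictionary keyed by `(tool, args, file_hash)`. Because the file hash is part of the key, a changed file simply misses the cache, which acts as invalidation. The class and function names are hypothetical, `args` is assumed to be a hashable tuple, and the hash uses `hashlib.sha256` from the standard library.

```python
import hashlib


def truncate_output(output: str, max_lines: int = 200, head: int = 50, tail: int = 50) -> str:
    """Keep the first and last lines of long output, with a marker in between."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output
    omitted = len(lines) - head - tail
    return "\n".join(lines[:head] + [f"[... {omitted} lines omitted ...]"] + lines[-tail:])


class ToolResultCache:
    def __init__(self) -> None:
        self._store = {}

    @staticmethod
    def file_hash(content: bytes) -> str:
        return hashlib.sha256(content).hexdigest()

    def get(self, tool: str, args: tuple, file_hash: str):
        """Return the cached output, or None on a miss."""
        return self._store.get((tool, args, file_hash))

    def put(self, tool: str, args: tuple, file_hash: str, output: str, is_error: bool = False) -> None:
        if is_error:
            return  # never cache error output
        self._store[(tool, args, file_hash)] = truncate_output(output)
```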

6. Multi-Agent Context Sharing

When multiple agents collaborate, context synchronization becomes critical.

Shared Context Bus

┌──────────┐     ┌──────────────────┐     ┌──────────┐
│ Agent A   │────▶│  Shared Context   │◀────│ Agent B   │
│ (Planner) │     │  - Task state     │     │ (Coder)   │
└──────────┘     │  - Decisions log  │     └──────────┘
                  │  - File changes   │
┌──────────┐     │  - Constraints    │     ┌──────────┐
│ Agent C   │────▶│  - Artifacts      │◀────│ Agent D   │
│ (Reviewer)│     └──────────────────┘     │ (Tester)  │
└──────────┘                               └──────────┘

Context Handoff Protocol

When Agent A passes work to Agent B:

1. State Summary: What was done, decisions made, current state
2. Relevant Artifacts: Files created/modified, with paths
3. Constraints: What must not be changed, invariants
4. Open Questions: Unresolved decisions that need Agent B's input
5. Next Steps: Explicit instructions for what Agent B should do

Anti-pattern: Passing the entire conversation history. Always summarize.
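
The five handoff fields map naturally onto a small record type. The sketch below is one possible shape; treating `state_summary` and `next_steps` as the required fields is an assumption, as are the class and method names.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    state_summary: str
    relevant_artifacts: list = field(default_factory=list)  # file paths
    constraints: list = field(default_factory=list)
    open_questions: list = field(default_factory=list)
    next_steps: list = field(default_factory=list)

    def missing_fields(self) -> list:
        """Names of required fields that are empty (assumed: summary and next steps)."""
        required = {"state_summary": self.state_summary, "next_steps": self.next_steps}
        return [name for name, value in required.items() if not value]
```

A receiving agent can refuse a handoff whose `missing_fields()` is non-empty, which gives a simple check for handoff completeness.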

Workflows

Workflow 1: Bootstrap Agent Context for a New Codebase

Step 1: Index the codebase
  - Build file tree with metadata (language, size, last modified)
  - Extract all exports, imports, and dependency edges
  - Identify entry points (main files, route handlers, CLI commands)

Step 2: Construct initial knowledge graph
  - Map module dependencies
  - Identify architectural layers (API, service, data, config)
  - Detect frameworks and conventions (naming, structure, patterns)

Step 3: Generate project summary
  - One paragraph: what this project does
  - Architecture diagram (text-based)
  - Key directories and their roles
  - Critical files (config, entry points, shared types)

Step 4: Configure context tiers
  - Tier 0: Project summary, CLAUDE.md, active file
  - Tier 1: Related files within same module
  - Tier 2: Cross-module dependencies
  - Tier 3: Documentation and examples

Workflow 2: Optimize Context for a Specific Task

Step 1: Parse task requirements
  - Extract entities (files, functions, features mentioned)
  - Identify task type (bug fix, feature, refactor, review)

Step 2: Retrieve relevant context
  - File-level: files matching entities
  - Dependency-level: imports/exports of matched files
  - Test-level: tests covering matched code
  - History-level: recent changes to matched files

Step 3: Budget allocation
  - Calculate total tokens available
  - Allocate per segment (see Token Budget Allocation Framework)
  - Pack context with greedy relevance

Step 4: Verify coverage
  - Check: all mentioned files included?
  - Check: type definitions for used types included?
  - Check: test examples for expected behavior included?
  - If gaps: retrieve missing context from lower tiers
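
Step 4 can be sketched as a set difference between what each check requires and what was actually loaded. The check names and the function name `coverage_gaps` are illustrative.

```python
def coverage_gaps(required: dict, loaded: set) -> dict:
    """Return the items each coverage check still needs.

    required maps a check name (for example 'mentioned_files') to the set of
    items that check expects to find in the loaded context.
    """
    gaps = {}
    for check, items in required.items():
        missing = items - loaded
        if missing:
            gaps[check] = sorted(missing)
    return gaps
```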

Workflow 3: Session Memory Management

Step 1: During session - capture learnings
  - New patterns discovered: log to working memory
  - Corrections received: mark as high-confidence learning
  - Errors encountered: log with resolution

Step 2: End of session - evaluate learnings
  - Which learnings are project-specific vs session-specific?
  - Which patterns recurred during this session?
  - Which corrections should become rules?

Step 3: Promote valuable learnings
  - Recurring patterns → CLAUDE.md or .claude/rules/
  - Project conventions → project documentation
  - Error resolutions → knowledge base

Step 4: Prune stale memory
  - Remove learnings about deleted files
  - Update learnings contradicted by new information
  - Archive session-specific context

Anti-Patterns

Anti-Pattern	Problem	Better Approach
Dumping entire files into context	Wastes tokens on irrelevant code	Retrieve specific functions/sections
No output buffer reservation	Agent output gets truncated	Always reserve an output buffer (5-10% in the token budget table)
Static context loading	Same context regardless of task	Dynamic retrieval based on task type
No staleness tracking	Using outdated information	Timestamp and verify before using
Full conversation replay	Older turns crowd out relevant code	Sliding window with summarization
Ignoring import graph	Missing type definitions, broken understanding	Always include direct dependencies

Evaluation Metrics

Metric	Description	Target
Context Relevance	% of loaded context actually used in response	> 70%
Retrieval Precision	% of retrieved items that are relevant	> 80%
Token Utilization	% of context budget used productively	> 85%
Staleness Rate	% of context items that are outdated	< 5%
Cache Hit Rate	% of tool results served from cache	> 40%
Handoff Completeness	% of required context passed between agents	100%
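
Two of these metrics are simple ratios and can be computed as shown below. This is a sketch that assumes relevance labels and cache counters are collected elsewhere; the function names are hypothetical.

```python
def retrieval_precision(retrieved: list, relevant: set) -> float:
    """Fraction of retrieved items that are relevant (0.0 when nothing was retrieved)."""
    if not retrieved:
        return 0.0
    return sum(1 for item in retrieved if item in relevant) / len(retrieved)


def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of tool requests served from cache (0.0 when there were no requests)."""
    total = hits + misses
    return hits / total if total else 0.0
```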

Integration Points

Skill	Integration
rag-architect	Use RAG Architect for vector store design; Context Engine for retrieval strategy
agent-designer	Agent Designer defines agent roles; Context Engine manages what each agent knows
self-improving-agent	Self-Improving Agent promotes learnings; Context Engine decides when/how to load them
observability-designer	Monitor context utilization metrics alongside agent performance

References

references/context-window-strategies.md - Detailed packing algorithms and benchmarks
references/code-retrieval-patterns.md - RAG for code: chunking, embedding, and ranking strategies
references/memory-architecture-guide.md - Multi-layer memory system design patterns
