AI Agents v0.3.0: Agents that remember

@rjmurillo-bot rjmurillo-bot released this 09 Feb 10:50
db95c9a

AI Agents v0.3.0 brings three new agents, a rebuilt memory system, and major prompt rewrites across the board. Your agents now retrieve project context before making decisions, cite their sources with confidence scores, and traverse a knowledge graph to find connections that keyword search misses. We also added 10+ new skills and 27 shared memories you can reference in your own agent configurations.

This release includes 77 commits and closes 139 issues. Let's walk through what changed and why it matters.

Released: February 2026



Three new agents

We added three agents to the catalog, bringing the total to 21. All three are available on every platform (VS Code, Copilot CLI, Claude Code).

Debug is a diagnostic specialist for root cause analysis. Point it at a failing test, a stack trace, or a misbehaving endpoint and it traces the problem systematically instead of guessing.

Backlog generator proactively discovers work when agent slots are idle. It analyzes open issues, PR status, and code health to produce 3-5 sized, actionable tasks. Unlike the task-decomposer (which breaks down existing work items), the backlog generator finds new work that needs doing.

Janitor handles codebase hygiene: dead code detection, unused import cleanup, and consistency enforcement. It takes care of the tedious maintenance that accumulates between features.

See PR #1105 for all three agents.


Smarter orchestrator and implementer

The two agents you use most got significant prompt rewrites.

The orchestrator gained retrieval-led reasoning, parallel execution support, and tighter handoff protocols. It auto-invokes context retrieval at the start of each task so downstream agents work with current project state. The parallel execution harness lets multiple agents work on independent tasks simultaneously with proper coordination (PR #1006, PR #1090).

The implementer now runs pre-push quality checks inline instead of waiting for CI. It validates code quality, runs tests, and checks linting before you push. The prompt was expanded with explicit coding standards, testability requirements, and atomic commit guidance (PR #1102).

We also renamed two agents for clarity: planner became milestone-planner and task-generator became task-decomposer. The old names still work but will be removed in a future release.


Agents that read before they think

This is the biggest behavioral change in v0.3.0. Every agent phase, from orchestration through implementation, now retrieves current information before making decisions. Pre-training knowledge is the last resort, never the default.

In practice, this means your agents make decisions based on what your project actually looks like today, not what the model learned during training. The orchestrator auto-invokes context retrieval. Agent prompts migrated from cloudmcp-manager to the unified Memory Router (ADR-037), which centralizes prompt management and enables memory-aware prompt construction.

We also made the MemoryRouter 26x faster. Search dropped from ~260ms to ~10ms, so agents now retrieve context in roughly 10 milliseconds, which means less waiting and more doing (PR #1044).

See PR #1110 and PR #1090 for retrieval-led reasoning. Prompt migration in PR #1046.


Memories that cite their sources

We rebuilt the memory system from the ground up. Every memory now carries structured citations that track where the knowledge came from, when it was captured, and how reliable it is. A verification pipeline validates citations against their original sources.
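To make the idea of a structured citation concrete, here is a minimal sketch. The `Citation` class, its field names, the example path, and the substring-based `verify` check are all assumptions for illustration; the actual schema and verification pipeline live in the PRs referenced in this section.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record; field names are illustrative only."""
    source_path: str       # where the knowledge came from
    captured_at: datetime  # when it was captured
    confidence: float      # how reliable it is, from 0.0 to 1.0
    quoted_text: str       # snippet expected to appear in the source


def verify(citation: Citation, source_text: str) -> bool:
    """Return True if the quoted snippet still appears in the source text."""
    return citation.quoted_text in source_text


# Hypothetical example values.
citation = Citation(
    source_path="docs/retry.md",
    captured_at=datetime(2026, 2, 1, tzinfo=timezone.utc),
    confidence=0.9,
    quoted_text="retry with exponential backoff",
)
```

A citation whose snippet no longer appears in its source would fail this check, which is the kind of drift a verification step is meant to catch.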

A new graph traversal module connects related memories so agents can walk the knowledge graph to find relevant context. Each edge carries a confidence score based on citation quality and usage frequency. An agent researching "retry patterns" can discover related memories about "resilience" and "circuit breakers" automatically.
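The traversal described above can be sketched as a breadth-first walk that only follows edges above a confidence threshold. The graph contents, the scores, and the `related_memories` function are hypothetical; this is not the module's actual API.

```python
from collections import deque

# Hypothetical graph: memory name -> list of (related memory, edge confidence).
GRAPH = {
    "retry patterns": [("resilience", 0.9), ("logging", 0.3)],
    "resilience": [("circuit breakers", 0.8)],
    "circuit breakers": [],
    "logging": [],
}


def related_memories(graph, start, min_confidence=0.5, max_depth=2):
    """Breadth-first walk that only follows edges at or above min_confidence."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes at the depth limit
        for neighbor, confidence in graph.get(node, []):
            if confidence >= min_confidence and neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


print(related_memories(GRAPH, "retry patterns"))
# prints ['resilience', 'circuit breakers']
```

The low-confidence "logging" edge is skipped, while "circuit breakers" is reached through "resilience" even though it shares no keywords with the starting memory.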

We unified four separate memory interfaces (Serena, Forgetful, file-based, and in-context) into a single router. You no longer need to think about which backend to use. The MemoryRouter picks the right one based on the query.
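One way to picture query-based routing is a list of predicates tried in order, with the first match choosing the backend. Everything here is an assumption for illustration: the `MemoryRouterSketch` class, the `file:` prefix rule, and the stand-in backends do not reflect how the MemoryRouter actually selects a backend.

```python
from typing import Callable

# Hypothetical backend type: takes a query, returns matching memory names.
Backend = Callable[[str], list[str]]


class MemoryRouterSketch:
    """Illustrative router; not the project's MemoryRouter implementation."""

    def __init__(self) -> None:
        self._routes: list[tuple[Callable[[str], bool], Backend]] = []

    def register(self, matches: Callable[[str], bool], backend: Backend) -> None:
        self._routes.append((matches, backend))

    def search(self, query: str) -> list[str]:
        # Use the first backend whose predicate accepts the query.
        for matches, backend in self._routes:
            if matches(query):
                return backend(query)
        return []


router = MemoryRouterSketch()
router.register(lambda q: q.startswith("file:"), lambda q: ["file-based hit"])
router.register(lambda q: True, lambda q: ["semantic hit"])  # fallback route
```

Callers only ever see `search`, which is the point of the unification: the choice of backend stays behind a single interface.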

Bidirectional sync between Serena and Forgetful keeps memories consistent across both backends. Changes in one system propagate to the other automatically.

See PR #1045, PR #1009, and PR #1103 for citations. Graph traversal in PR #1013, PR #1019, and PR #1104. Interface unification in PR #1007.


New skills for your workflows

We added 10+ new skills in .claude/skills/ that you can use in your own agent configurations:

  • Buy vs. Build Framework evaluates build, buy, partner, or defer decisions with TCO analysis and vendor scoring
  • Code Qualities Assessment scores cohesion, coupling, encapsulation, testability, and non-redundancy with calibrated rubrics
  • CVA Analysis (Commonality/Variability Analysis) discovers abstractions systematically before picking patterns
  • Context Optimizer analyzes skill content for optimal placement and compresses markdown by 60-80%
  • Security Scan detects CWE-22 (path traversal) and CWE-78 (command injection) patterns before PR submission
  • Doc Coverage finds missing documentation in code and project files
  • Git Advanced Workflows handles rebasing, cherry-picking, bisect, worktrees, and reflog
  • Memory Enhancement manages citations, verifies code references, and tracks confidence scores
  • Session End validates and completes session logs with pre-commit checks
  • Style Enforcement validates code against .editorconfig, StyleCop, and project conventions

We also updated 34 existing skills to the v2.0 compliance standard with validated YAML frontmatter and consistent structure (PR #1084).


Shared memories for your agents

We added 27 new memories in .serena/memories/ that your agents can reference for project-specific knowledge. Highlights include:

  • retrieval-led-reasoning documents when and how to retrieve context before reasoning
  • passive-context-vs-skills-vercel-research captures Vercel's findings on passive context achieving 100% pass rates vs 53-79% for skills
  • claude-code-agent-teams covers parallel multi-agent execution patterns
  • buy-vs-build-framework-skill captures the decision framework for build/buy/partner/defer
  • quality-gates-bypass-enforcement documents how to prevent agents from bypassing quality checks
  • rootcause-escape-hatch-misuse documents patterns of agents misusing escape hatches

These memories are available to any agent using Serena. Point your agents at the memory-index to discover what is available.


Template system improvements

The template system in templates/ gained several improvements:

Toolset abstraction (templates/toolsets.yaml) centralizes tool declarations. Instead of duplicating tool lists across every agent template, agents now reference named toolsets. This makes it easier to add or modify tools across all agents at once (PR #1036).
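As a rough sketch of how named toolsets remove duplication, the snippet below expands toolset references into a flat tool list. The toolset names, tool names, and agent entries are hypothetical and do not mirror the real contents of templates/toolsets.yaml.

```python
# Hypothetical toolset definitions, as they might look once loaded from YAML.
TOOLSETS = {
    "read-only": ["read_file", "search"],
    "editing": ["read_file", "search", "write_file"],
}

# Hypothetical agent templates that reference toolsets by name.
AGENTS = {
    "debug": {"toolsets": ["read-only"]},
    "janitor": {"toolsets": ["read-only", "editing"]},
}


def resolve_tools(agent: dict, toolsets: dict) -> list:
    """Expand toolset names into a de-duplicated, order-preserving tool list."""
    tools = []
    for name in agent["toolsets"]:
        for tool in toolsets[name]:  # a KeyError flags an unknown toolset name
            if tool not in tools:
                tools.append(tool)
    return tools


print(resolve_tools(AGENTS["janitor"], TOOLSETS))
# prints ['read_file', 'search', 'write_file']
```

Adding a tool to the "editing" entry would update every agent that references it, with no per-agent edits.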

Consensus mechanisms give agents a structured way to resolve disagreements during multi-agent workflows like ADR debates (PR #1035).

Agent template generation (build/Generate-Agents.ps1) now produces agents for all three platforms from shared templates. Edit the template once, regenerate, and all platforms update.


Breaking changes

Agent renames

What changed: planner renamed to milestone-planner. task-generator renamed to task-decomposer.

Impact: References to the old agent names will stop working once you update.

Migration: Update any custom configurations or scripts that reference planner or task-generator to use the new names. See PR #1105.
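If you have many configuration files to update, a small script can apply the renames. This is a minimal sketch, not a tool shipped with the release: the `migrate` helper is hypothetical, and it assumes agent names appear as standalone tokens. The lookarounds keep it from rewriting "planner" inside an already migrated "milestone-planner", so running it twice is safe.

```python
import re

RENAMES = {"planner": "milestone-planner", "task-generator": "task-decomposer"}

# Match an old name only when it is not part of a longer hyphenated or
# alphanumeric identifier, so "milestone-planner" is left untouched.
PATTERN = re.compile(
    r"(?<![\w-])(" + "|".join(map(re.escape, RENAMES)) + r")(?![\w-])"
)


def migrate(text: str) -> str:
    """Replace old agent names with their v0.3.0 equivalents."""
    return PATTERN.sub(lambda m: RENAMES[m.group(1)], text)


print(migrate("agents: [planner, task-generator]"))
# prints agents: [milestone-planner, task-decomposer]
```

Review the resulting diff before committing, since the word "planner" may also appear in prose that should not change.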

Retrieval-led reasoning is now mandatory

What changed: All agent phases must retrieve context before reasoning.

Impact: Agents that skip retrieval will produce lower-quality results because they rely on pre-training instead of current project state.

Migration: Update custom agent configurations to include retrieval directives. The default agent templates already include this. See PR #1110.

Skill v2.0 frontmatter required

What changed: Skills must include validated YAML frontmatter.

Impact: Skills without frontmatter will fail validation.

Migration: Add frontmatter following the v2.0 template. See PR #1034 and PR #1084.
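For a quick local check before the real validation runs, something like the following can flag skills with missing frontmatter. The required keys (`name`, `description`) are an assumption, as is the `missing_frontmatter_keys` helper; consult the v2.0 template for the authoritative field list. The check only looks at top-level `key: value` lines and is not a YAML parser.

```python
REQUIRED_KEYS = ("name", "description")  # assumed; check the v2.0 template


def missing_frontmatter_keys(skill_text: str) -> list:
    """Return required keys absent from the leading '---' frontmatter block.

    Returns every required key when no complete frontmatter block is found.
    """
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_KEYS)
    try:
        # Index of the closing '---' delimiter.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return list(REQUIRED_KEYS)
    present = {
        line.split(":", 1)[0].strip()
        for line in lines[1:end]
        if ":" in line and not line.startswith((" ", "\t", "#"))
    }
    return [key for key in REQUIRED_KEYS if key not in present]


# Hypothetical skill file content.
skill = "---\nname: session-end\ndescription: Validates session logs\n---\n# Body\n"
print(missing_frontmatter_keys(skill))
# prints []
```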


Under the hood

These changes improve the development experience but do not directly affect agent behavior:

  • Pre-PR validation blocks PRs that fail commit count limits, test coverage, and linting checks (PR #1093)
  • Pre-push hooks validate branch naming, run tests, and use a fail-closed posture (PR #1094, PR #1047)
  • Hook output compression reduced terminal noise to single-line confirmations (PR #1122)
  • Context optimization trimmed AGENTS.md and added passive context compliance CI (PR #1120, PR #1124)
  • Security hardening resolved CWE vulnerabilities in session logs and hardened memory sync against injection (PR #1085, PR #1070)
  • PowerShell syntax validation in CI with PSScriptAnalyzer (PR #1040)
  • Session-end skill for automated pre-commit validation (PR #1086)

All changes

Features

  • Add context optimizer tooling suite with passive context analysis (#1116)
  • Add context optimizer tooling suite v0.4.0 (#1111)
  • Inject retrieval-led reasoning directives across all phases (#1110)
  • Add CI health reporting for memory enhancement (#1107)
  • Add graph traversal module for memory enhancement (#1104)
  • Add six engineering decision-making skills with infrastructure improvements (#1097)
  • Add pre-push hook and shift-left code quality into implementer (#1102)
  • Add citation schema tests, docs, and CI workflow (#1103)
  • Replace hardcoded skill patterns with runtime SKILL.md scanning (#1096)
  • Add pre-push git hook for comprehensive branch validation (#1094)
  • Add quality gate enforcement for PR readiness (#1093)
  • Context-retrieval auto-invocation in orchestrator (#1090)
  • Add 6 issues to v0.3.0 milestone from triage analysis (#1091)
  • Create session-end skill for pre-commit validation (#1086)
  • Update 34 skills to v2.0 compliance standard (#1084)
  • Consolidate allowlists, extract shared types, add security tests (#1080)
  • Implement Serena-Forgetful memory synchronization (#1048)
  • Implement citation schema and verification (Phase 1) (#1045)
  • Add consensus mechanisms for multi-agent decision resolution (#1035)
  • Add Velocity Accelerator workflow for development acceleration (#1037)
  • Implement toolset abstraction to reduce agent template duplication (#1036)
  • Add pre-commit validation for skill YAML frontmatter (#1034)
  • Add PowerShell syntax validation workflow to CI (#1040)
  • Extract Get-AllPRsWithComments to GitHubCore module (#1023)
  • Phase 2 Graph Traversal (#1019)
  • Implement passive context strategy for skill utilization (#1022)
  • Chain 3 complete, graph implementation and spec tooling (#1011)
  • Add graph traversal and confidence scoring (#1013)
  • Phase 1 Citation Schema and Verification (#1009)
  • Add v0.3.0 parallel execution harness (#1006)
  • Add SKILL.md for advanced Git workflows (#996)
  • Add stale comment detection to Get-PRReviewComments.ps1 (#987)
  • Add domain classification to PR review comments (#988)
  • ADR-043 scope session protocol tools to changed files (#986)
  • v0.3.1 P0 PowerShell-to-Python cleanup (#1113)
  • Chain 3 graph implementation, optimization, and tooling (#1012)

Fixes

  • Use portable Python for skill learning hook (#1095)
  • Use hardcoded main in pr-quality backtick commands (#1089)
  • Resolve CWE vulnerabilities in session log creation (#1085)
  • Harden memory sync against security findings (#1070)
  • Velocity accelerator returns 0 when no opportunities found (#1069)
  • Remove exit 2 from SessionStart hooks (cannot block) (#1067)
  • Harden hooks with fail-closed security posture (#1047)
  • Resolve pr-comment-responder skill validation failures (#1059)
  • Pre-execute tests in workflow and pass results to QA agent (#1038)
  • Repair broken USING-AGENTS and copilot-instructions links in README (#1042)
  • Pin Copilot CLI to 0.0.397 for frontmatter regression (#1024)

Refactoring

  • Context optimization items 4-10 (#1123)
  • Compress hook output to single-line confirmations (#1122, #1121)
  • Slim AGENTS.md and consolidate auto-loaded context (#1120)
  • Rename planning agents and add backlog-generator (#1105)
  • Migrate agent prompts from cloudmcp-manager to Memory Router (ADR-037) (#1046)

Performance

  • Optimize MemoryRouter search from ~260ms to ~10ms (#1044)

Docs

  • Create v0.3.1 PowerShell migration plan (#1114)
  • Add review artifacts for skill-learning-patterns feature (#1100)
  • Enhance engineering knowledge index with design patterns and modernization (#1092)
  • Document prompt vs agent file pattern (#1087)
  • v0.4.0 framework extraction plan, ADR-045, and agent documentation improvements (#1081)
  • Research Claude Code Agent Teams feature (#1082)
  • v0.4.0 framework extraction plan, ADR-045, and plugin marketplace research (#1079)
  • Apply evidence-based testing philosophy (#1071)
  • Add PowerShell-to-Python migration plan (v0.3.1) (#1068)
  • Add mandatory local workflow testing with gh act (#1043)
  • Document PSScriptAnalyzer in CONTRIBUTING.md pre-commit hooks section (#1039)
  • Close P3 issue #167 as superseded by Forgetful MCP (#1010)
  • Add Memory Interface Decision Matrix (#1007)
  • Update existing memories with PR #908 learnings (#991, #985)

CI / Dependencies

  • Add skill/passive context compliance workflow (#1124)
  • Add .coverage and .cache/ to .gitignore (#1099)
  • Bump pip in the uv group across 1 directory (#1033)
  • Bump anthropic in the python-dependencies group (#1032)
  • Bump the github-actions group with 2 updates (#1031)
  • Bump the python-dependencies group with 2 updates (#1020)
  • Bump the github-actions group with 6 updates (#1021)

Testing

  • Add comprehensive test infrastructure for ADR enforcement (#989)

Contributors

Thank you to everyone who made v0.3.0 happen.


What's next

With v0.3.0's memory and agent foundations in place, here is what we are focusing on next:

  • v0.3.1: PowerShell-to-Python migration cleanup for remaining scripts
  • v0.4.0: Framework extraction so the agent system can run as a standalone plugin ecosystem
  • Agent Teams: Parallel multi-agent execution using Claude Code Agent Teams
  • Context optimization: Continued reduction of passive context overhead

We are building in the open. Follow along on GitHub, file issues, and let us know what you are building with AI agents.


Full Changelog: v0.2.0...v0.3.0