09 Feb 10:50

Immutable

v0.3.0

db95c9a

AI Agents v0.3.0: Agents that remember

AI Agents v0.3.0 brings three new agents, a rebuilt memory system, and major prompt rewrites across the board. Your agents now retrieve project context before making decisions, cite their sources with confidence scores, and traverse a knowledge graph to find connections that keyword search misses. We also added 10+ new skills and 27 shared memories you can reference in your own agent configurations.

This release includes 77 commits and closes 139 issues. Let's walk through what changed and why it matters.

Released: February 2026

Three new agents
Smarter orchestrator and implementer
Agents that read before they think
Memories that cite their sources
New skills for your workflows
Shared memories for your agents
Template system improvements
Breaking changes
Under the hood
All changes
Contributors
What's next

Three new agents

We added three agents to the catalog, bringing the total to 21. All three are available on every platform (VS Code, Copilot CLI, Claude Code).

Debug is a diagnostic specialist for root cause analysis. Point it at a failing test, a stack trace, or a misbehaving endpoint and it traces the problem systematically instead of guessing.

Backlog generator proactively discovers work when agent slots are idle. It analyzes open issues, PR status, and code health to produce 3-5 sized, actionable tasks. Unlike the task-decomposer (which breaks down existing work items), the backlog generator finds new work that needs doing.

Janitor handles codebase hygiene: dead code detection, unused import cleanup, and consistency enforcement. It takes care of the tedious maintenance that accumulates between features.

See PR #1105 for all three agents.

Smarter orchestrator and implementer

The two agents you use most got significant prompt rewrites.

The orchestrator gained retrieval-led reasoning, parallel execution support, and tighter handoff protocols. It auto-invokes context retrieval at the start of each task so downstream agents work with current project state. The parallel execution harness lets multiple agents work on independent tasks simultaneously with proper coordination (PR #1006, PR #1090).
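To illustrate the idea of running independent tasks side by side, here is a minimal sketch using Python's standard concurrent.futures module. The run_agent function and the task names are hypothetical placeholders, and this is not the project's actual harness.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(task: str) -> str:
    # Hypothetical stand-in for handing a task to an agent.
    return f"{task}: done"


def run_parallel(tasks: list[str], max_workers: int = 4) -> list[str]:
    # Independent tasks run concurrently; map() yields results in input order,
    # so callers can match each result back to its task.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_agent, tasks))


if __name__ == "__main__":
    print(run_parallel(["lint", "unit-tests", "docs"]))
```

This only works cleanly when tasks do not depend on each other; tasks that touch the same files would need the coordination layer described above.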

The implementer now runs pre-push quality checks inline instead of waiting for CI. It validates code quality, runs tests, and checks linting before you push. The prompt was expanded with explicit coding standards, testability requirements, and atomic commit guidance (PR #1102).

We also renamed two agents for clarity: planner became milestone-planner and task-generator became task-decomposer. The old names still work but will be removed in a future release.

Agents that read before they think

This is the biggest behavioral change in v0.3.0. Every agent phase, from orchestration through implementation, now retrieves current information before making decisions. Pre-training knowledge is the last resort, never the default.

In practice, this means your agents make decisions based on what your project actually looks like today, not what the model learned during training. The orchestrator auto-invokes context retrieval. Agent prompts migrated from cloudmcp-manager to the unified MemoryRouter (ADR-037), which centralizes prompt management and enables memory-aware prompt construction.
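The retrieve-then-reason ordering can be sketched in a few lines of Python. The function names (answer_with_retrieval, demo_retrieve, demo_reason) are illustrative assumptions and do not come from the project's prompts or code.

```python
from typing import Callable


def answer_with_retrieval(
    question: str,
    retrieve: Callable[[str], list[str]],
    reason: Callable[[str, list[str]], str],
) -> str:
    # Step 1: always look up current project context first.
    context = retrieve(question)
    # Step 2: reason over what was retrieved. An empty list means the
    # reasoning step has only pre-training knowledge to fall back on.
    return reason(question, context)


def demo_retrieve(question: str) -> list[str]:
    # Hypothetical retrieval: a real system would query a memory store.
    return ["README: tests run with pytest"] if "test" in question else []


def demo_reason(question: str, context: list[str]) -> str:
    if context:
        return f"grounded in {len(context)} retrieved source(s)"
    return "no project context found; falling back to pre-training"


if __name__ == "__main__":
    print(answer_with_retrieval("how do I run the tests?", demo_retrieve, demo_reason))
```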

We also made the MemoryRouter 26x faster. Search latency dropped from ~260ms to ~10ms, so agents spend less time waiting for context and more time doing the work (PR #1044).

See PR #1110 and PR #1090 for retrieval-led reasoning. Prompt migration in PR #1046.

Memories that cite their sources

We rebuilt the memory system from the ground up. Every memory now carries structured citations that track where the knowledge came from, when it was captured, and how reliable it is. A verification pipeline validates citations against their original sources.
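As a rough picture of what a structured citation and a verification step could look like, consider the sketch below. The field names and the substring check are assumptions made for illustration; the real schema and pipeline live in the PRs referenced in this section.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class Citation:
    source_path: str       # where the knowledge came from
    captured_at: datetime  # when it was captured
    confidence: float      # how reliable it is, from 0.0 to 1.0
    quoted_text: str       # excerpt expected to appear in the source


def verify(citation: Citation, root: Path) -> bool:
    """Return True if the cited excerpt still appears in the source file."""
    source = root / citation.source_path
    if not source.is_file():
        return False  # source was moved or deleted
    return citation.quoted_text in source.read_text(encoding="utf-8")
```

A citation that fails this kind of check is a signal that the memory may be stale, since the source changed after the knowledge was captured.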

A new graph traversal module connects related memories so agents can walk the knowledge graph to find relevant context. Each edge carries a confidence score based on citation quality and usage frequency. An agent researching "retry patterns" can discover related memories about "resilience" and "circuit breakers" automatically.
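The "retry patterns" example can be pictured as a breadth-first walk that only follows edges above a confidence threshold. The graph contents, the scores, and the function name below are made up for illustration and are not the project's traversal module.

```python
from collections import deque

# Hypothetical graph: memory name -> list of (related memory, edge confidence).
GRAPH = {
    "retry patterns": [("resilience", 0.9), ("logging", 0.3)],
    "resilience": [("circuit breakers", 0.8)],
    "circuit breakers": [],
    "logging": [],
}


def related_memories(graph, start, min_confidence=0.5, max_depth=2):
    """Breadth-first walk that follows only edges at or above min_confidence."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor, confidence in graph.get(node, []):
            if confidence >= min_confidence and neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


if __name__ == "__main__":
    print(related_memories(GRAPH, "retry patterns"))
```

With these sample scores, the low-confidence "logging" edge is skipped while "resilience" and "circuit breakers" are reached.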

We unified four separate memory interfaces (Serena, Forgetful, file-based, and in-context) into a single router. You no longer need to think about which backend to use. The MemoryRouter picks the right one based on the query.

Bidirectional sync between Serena and Forgetful keeps memories consistent across both backends. Changes in one system propagate to the other automatically.

See PR #1045, PR #1009, and PR #1103 for citations. Graph traversal in PR #1013, PR #1019, and PR #1104. Interface unification in PR #1007.

New skills for your workflows

We added 10+ new skills in .claude/skills/ that you can use in your own agent configurations:

Buy vs. Build Framework evaluates build, buy, partner, or defer decisions with TCO analysis and vendor scoring
Code Qualities Assessment scores cohesion, coupling, encapsulation, testability, and non-redundancy with calibrated rubrics
CVA Analysis (Commonality/Variability Analysis) discovers abstractions systematically before picking patterns
Context Optimizer analyzes skill content for optimal placement and compresses markdown by 60-80%
Security Scan detects CWE-22 (path traversal) and CWE-78 (command injection) patterns before PR submission
Doc Coverage finds missing documentation in code and project files
Git Advanced Workflows handles rebasing, cherry-picking, bisect, worktrees, and reflog
Memory Enhancement manages citations, verifies code references, and tracks confidence scores
Session End validates and completes session logs with pre-commit checks
Style Enforcement validates code against .editorconfig, StyleCop, and project conventions
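To give a feel for pattern-based detection of the kind Security Scan performs, here is a deliberately naive sketch. The two regular expressions are simplified assumptions, not the skill's actual rules, and would produce false positives and negatives on real code.

```python
import re

# Simplified, illustrative patterns only.
PATTERNS = {
    "CWE-22": re.compile(r"\.\./|\.\.\\"),                   # literal parent-directory traversal
    "CWE-78": re.compile(r"os\.system\(|shell\s*=\s*True"),  # common shell execution sinks
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line number, CWE id) pairs for lines matching a pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for cwe, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, cwe))
    return findings
```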

We also updated 34 existing skills to the v2.0 compliance standard with validated YAML frontmatter and consistent structure (PR #1084).
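If you want a quick local check that a skill file has frontmatter at all, something like the following sketch may help. The required field names (name, description) are an assumption for illustration; the v2.0 standard in PR #1084 is the authority on what is actually required. The parser handles only flat key: value pairs, not full YAML.

```python
REQUIRED_FIELDS = ("name", "description")  # assumed fields, for illustration only


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing delimiter was never found


def missing_fields(text: str) -> list[str]:
    """List required fields that are absent or empty."""
    fields = parse_frontmatter(text)
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]
```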

Shared memories for your agents

We added 27 new memories in .serena/memories/ that your agents can reference for project-specific knowledge. Highlights include:

retrieval-led-reasoning documents when and how to retrieve context before reasoning
passive-context-vs-skills-vercel-research captures Vercel's findings on passive context achieving 100% pass rates vs 53-79% for skills
claude-code-agent-teams covers parallel multi-agent execution patterns
buy-vs-build-framework-skill captures the decision framework for build/buy/partner/defer
quality-gates-bypass-enforcement documents how to prevent agents from bypassing quality checks
rootcause-escape-hatch-misuse documents patterns for when agents misuse escape hatches

These memories are available to any agent using Serena. Point your agents at the memory-index to discover what is available.

Template system improvements

The template system in templates/ gained several improvements:

Toolset abstraction (templates/toolsets.yaml) centralizes tool declarations. Instead of duplicating tool lists across every agent template, agents now reference named toolsets. This makes it easier to add or modify tools across all agents at once (PR #1036).
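The idea of referencing named toolsets instead of repeating tool lists can be sketched as follows. The toolset names, tool names, and the dictionary layout are hypothetical and do not reflect the actual contents of templates/toolsets.yaml.

```python
# Hypothetical data, mirroring what a toolsets file might declare.
TOOLSETS = {
    "read-only": ["read", "search"],
    "editing": ["read", "search", "edit"],
}


def resolve_tools(agent: dict) -> list[str]:
    """Expand an agent's named toolsets into an ordered, de-duplicated tool list."""
    tools = []
    for name in agent.get("toolsets", []):
        for tool in TOOLSETS[name]:  # a KeyError flags an undeclared toolset
            if tool not in tools:
                tools.append(tool)
    return tools


if __name__ == "__main__":
    print(resolve_tools({"name": "implementer", "toolsets": ["read-only", "editing"]}))
```

Adding a tool to a toolset then changes every agent that references it, which is the maintenance benefit described above.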

Consensus mechanisms give agents a structured way to resolve disagreements during multi-agent workflows like ADR debates (PR #1035).

Agent template generation (build/Generate-Agents.ps1) now produces agents for all three platforms from shared templates. Edit the template once, regenerate, and all platforms update.
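The "edit once, regenerate everywhere" flow can be pictured with Python's standard string.Template. This is an illustrative sketch only: the real generator is the PowerShell script named above, and the template text and platform identifiers here are assumptions.

```python
from string import Template

# Hypothetical shared template and platform identifiers.
SHARED = Template("# $name\nPlatform: $platform\n$body\n")
PLATFORMS = ["vscode", "copilot-cli", "claude-code"]


def generate(name: str, body: str) -> dict[str, str]:
    """Render one agent file per platform from the single shared template."""
    return {
        platform: SHARED.substitute(name=name, platform=platform, body=body)
        for platform in PLATFORMS
    }


if __name__ == "__main__":
    for platform, text in generate("debug", "Trace root causes.").items():
        print(f"--- {platform} ---\n{text}")
```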

Breaking changes

Agent renames

What changed: planner renamed to milestone-planner. task-generator renamed to task-decomposer.

Impact: References to the old agent names will stop working once you update.

Migration: Update any custom configurations or scripts that reference planner or task-generator to use the new names. See PR #1105.
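For configurations with many references, a small helper along these lines could do the renaming. It is a sketch under the assumption that agent names appear as standalone tokens; the function name is hypothetical. Because it also rewrites the plain word "planner" in prose, review the diff before committing.

```python
import re

# Old name -> new name, as listed in this release.
RENAMES = {
    "planner": "milestone-planner",
    "task-generator": "task-decomposer",
}


def migrate_agent_names(text: str) -> str:
    """Replace old agent names that appear as standalone tokens."""
    for old, new in RENAMES.items():
        # The lookarounds reject word characters and hyphens on either side,
        # so an existing "milestone-planner" is left untouched.
        pattern = rf"(?<![\w-]){re.escape(old)}(?![\w-])"
        text = re.sub(pattern, new, text)
    return text
```

Running the helper twice gives the same result as running it once, so it is safe to re-apply.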

Retrieval-led reasoning is now mandatory

What changed: All agent phases must retrieve context before reasoning.

Impact: Agents that skip retrieval will produce lower-quality results because they rely on pre-training instead of current project state.

Migration: Update custom agent configurations to include retrieval directives. The default agent templates already include this. See PR #1110.

Skill v2.0 frontmatter required

What changed: Skills must include validated YAML frontmatter.

Impact: Skills without frontmatter ...


20 Jan 03:40

rjmurillo

Immutable

v0.2.0

0094394

AI Agents v0.2.0: Python-First, Security-Strong 🐍🔒

Happy 2026! We're excited to bring you AI Agents v0.2.0, our biggest release yet with 56 pull requests delivering foundational improvements that set the stage for scalable, secure multi-agent development.

This release marks a strategic shift in our technical direction while maintaining our commitment to quality and developer experience. From migrating to Python-first architecture to expanding security coverage to 45+ vulnerability patterns, v0.2.0 brings substantial improvements across the board.

Released: January 20, 2026
Highlights: Python Migration • Security Expansion • Memory System Bootstrap • ADR Enforcement

Python-First Architecture Migration
Security Enhancements
Memory System Bootstrap
Infrastructure & CI/CD
Session Protocol Improvements
Engineering Knowledge System
Breaking Changes
All Changes
Contributors

Python-First Architecture Migration

The big news: We've officially migrated from PowerShell-first to Python-first architecture (ADR-042). This strategic decision aligns with the AI/ML ecosystem where Python dominates, giving us better library access, broader community support, and improved cross-platform compatibility.

What This Means

New scripts default to Python (.py) for better ecosystem integration
Existing PowerShell scripts grandfathered - no forced rewrites
Gradual migration path as we touch existing code
Better AI agent integration with native Python libraries

The migration happened in phases, starting with security tooling and expanding to installation scripts. All functionality preserved, all tests passing. 🎉

See PR #967 for the complete implementation.

Security Enhancements

Security got a major boost in v0.2.0 with 45 CWE patterns and full OWASP Agentic AI Top 10 coverage. Our security agent now catches:

New Detection Capabilities

Prompt injection attacks (CWE-1236)
Training data poisoning (CWE-345)
Model theft via API (CWE-212)
Sensitive data exposure (CWE-200, CWE-532)
Insecure output handling (CWE-79, CWE-89)

Plus improvements to existing checks:

Environment variable security now fails fast like other checks
CodeQL integration with multi-tier architecture
Pre-commit hooks detect Bash in security-critical paths

The security agent provides detailed remediation guidance for every finding, including:

Severity ratings based on context
Specific code locations with line numbers
Remediation steps with code examples
Related CWE/OWASP references

See PR #978 and PR #980 for details.

Memory System Bootstrap

We completed a systematic knowledge migration moving 462 learnings from Serena to Forgetful, our semantic memory system. This gives agents:

Knowledge Graph Benefits

Cross-session context - Agents remember patterns across sessions
Semantic search - Find relevant learnings by concept, not keywords
Relationship tracking - See how decisions connect
Provenance tracking - Know where knowledge came from

The memory system now includes:

Memory utility scripts for graph connectivity analysis
Improved markdown navigation with automatic linking
Merge mode for idempotent database imports
Corruption recovery tooling

This foundation enables agents to build on past learnings instead of starting fresh every session.

See PR #968 and PR #888 for implementation.

Bootstrapping Your Database

Want to give your agents the same knowledge base that powers this project? You can import all 462 learnings into your local Forgetful installation:

```powershell
# Import the complete knowledge base
pwsh scripts/forgetful/Import-ForgetfulMemories.ps1
```

The import script offers three merge modes:

Replace (default): Merges with existing data, updating any conflicting records
Skip: Only adds new records, preserving your existing data
Fail: Aborts if any duplicates are found (strict mode)
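The semantics of the three modes can be modeled with plain dictionaries keyed by record id. This is a sketch of the behavior as described above, not the import script's code; the function name and data shapes are assumptions.

```python
def import_records(existing: dict, incoming: dict, mode: str = "replace") -> dict:
    """Merge incoming records into existing ones according to the merge mode."""
    if mode not in ("replace", "skip", "fail"):
        raise ValueError(f"unknown merge mode: {mode}")
    duplicates = existing.keys() & incoming.keys()
    if mode == "fail" and duplicates:
        raise ValueError(f"duplicate ids: {sorted(duplicates)}")
    merged = dict(existing)
    for key, value in incoming.items():
        if key in duplicates and mode == "skip":
            continue  # keep the record that is already there
        merged[key] = value  # new record, or overwrite in replace mode
    return merged
```

In replace and skip modes, importing the same data a second time leaves the result unchanged, which is what makes repeated runs safe.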

What you get:

463 memories covering architecture decisions, security patterns, and agent workflows
1,385 memory links showing relationships between concepts
26 projects organizing knowledge by domain
24 entities (people, tools, frameworks)
Complete provenance tracking for all learnings

Safe and idempotent: You can run the import multiple times without data loss. The script validates all imports and provides detailed statistics on what was added or updated.

```powershell
# Import with custom merge behavior
pwsh scripts/forgetful/Import-ForgetfulMemories.ps1 -MergeMode Skip

# Import specific files only
pwsh scripts/forgetful/Import-ForgetfulMemories.ps1 -InputFiles @('.forgetful/exports/2026-01-19.json')
```

After import, your agents will have instant access to the same patterns and learnings that went into building the v0.2.0 release.

Infrastructure & CI/CD

CodeQL Security Analysis

Integrated CodeQL with multi-tier architecture supporting Python and GitHub Actions. The system:

Detects 45+ security vulnerabilities automatically
Provides actionable SARIF output
Caches databases for faster subsequent runs
Handles language detection automatically

See PR #954.

GitHub Actions Testing

Added local validation infrastructure so you can test workflows before pushing:

Eliminates trial-and-error CI cycles
Catches YAML errors locally
Validates action syntax and parameters
Supports Docker-based execution

See PR #925.

Large PR Handling

Our AI review action now gracefully handles PRs with 300+ files through:

Intelligent chunking strategies
Graceful degradation for massive changes
Resource management improvements
Clear feedback when limits are exceeded
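The simplest form of chunking is fixed-size batching of the changed-file list, sketched below. The batch size of 100 and the function name are assumptions; the review action's actual strategy is described in PR #891 and may be more selective.

```python
def chunk_files(files: list[str], max_per_chunk: int = 100) -> list[list[str]]:
    """Split a list of changed files into batches of at most max_per_chunk."""
    if max_per_chunk < 1:
        raise ValueError("max_per_chunk must be at least 1")
    return [
        files[start:start + max_per_chunk]
        for start in range(0, len(files), max_per_chunk)
    ]


if __name__ == "__main__":
    changed = [f"src/file_{i}.py" for i in range(350)]
    print([len(batch) for batch in chunk_files(changed)])
```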

See PR #891.

Session Protocol Improvements

Session management got more robust with verification-based enforcement:

What's New

Investigation-only validator (ADR-034 Phase 1) allows analysis sessions to skip QA
Session-start gates verify protocol compliance before work begins
Automatic session log creation with schema validation
650 orphaned files recovered and properly cataloged

The session protocol now catches compliance issues before they become merge conflicts, saving hours of rework.

See PR #931 and PR #924.

Engineering Knowledge System

Added 5 new skills capturing engineering best practices:

Three Horizons Framework - Strategic planning model
Wardley Mapping - Value chain visualization
Systems Archetypes - Recurring system patterns
Team Topologies - Organizational design patterns
Second-Order Thinking - Consequence analysis

Each skill includes:

Comprehensive documentation
Real-world examples
Integration tests
Usage guidelines

These skills help agents make better architectural decisions by applying proven frameworks.

See PR #977.

Breaking Changes

ADR-042: Python-First Policy

Impact: New scripts must be written in Python (.py) unless there's a compelling reason for PowerShell.

Migration: Existing PowerShell scripts are grandfathered. You only need to switch if you're creating new scripts or substantially rewriting existing ones.

Rationale: Python's dominance in the AI/ML ecosystem, better library support, and improved cross-platform compatibility.

See ADR-042 for complete details.

Skill-Installer Replaces Installation Scripts

Impact: Direct invocation of Install-*.ps1 scripts is deprecated in favor of the skill-installer pattern.

Migration:

```powershell
# Old way
./scripts/forgetful/Install-ForgetfulLinux.ps1

# New way
.claude/skills/skill-installer/Invoke-SkillInstaller.ps1 -SkillName forgetful
```

Benefit: Version pinning, dependency management, consistent installation experience.

See PR #962 and documentation.

All Changes

Features ✨

Python-first architecture migration (ADR-042) #967
Security: 45 CWE patterns + OWASP Agentic Top 10 #978
Memory: 462 learnings migrated to Forgetful #968
CodeQL security analysis integration #954
Engineering knowledge skills (5 frameworks) #977
Reflect skill with auto-learning hook #908
Investigation-only session validator #931
Local GitHub Actions validation [#925](https://github.c...


18 Dec 08:09

rjmurillo-bot

Immutable

v0.0.1

53d3bc4

AI Agents v0.0.1: The Beginning 🚀

Welcome to the very first release of AI Agents! We've been building this system to solve a problem we kept running into: managing AI coding assistants across multiple platforms is a mess. Different configs, different agents, different capabilities—it was chaos.

So we fixed it.

Highlights

🤖 One Agent System, Three Platforms

The heart of this release is the unified multi-agent system. Whether you're using VS Code, GitHub Copilot CLI, or Claude Code, you now have access to the same 18 specialized agents with consistent behavior.

No more context-switching between tools. No more "wait, does Copilot have that agent?" moments. Just pick your platform and get to work.

🛡️ Security-First Design

We take security seriously. This release includes comprehensive CWE-78 remediation across all agent infrastructure. Every shell interaction, every hook, every script has been reviewed and hardened.

The result? An agent system you can actually trust with your codebase.

⚡ One-Line Installation

Getting started should be easy:

```powershell
irm https://raw.githubusercontent.com/rjmurillo/ai-agents/main/install.ps1 | iex
```

That's it. Select your platform, and you're ready to go.

The Agent Catalog

We've built 18 agents, each with a specific job. Here's what's in the box:

Agent	What It Does
orchestrator	Routes tasks to the right specialist
implementer	Writes production-quality code
analyst	Investigates problems and gathers evidence
architect	Guards architectural decisions
planner	Breaks epics into milestones
critic	Validates plans before you waste time
qa	Verifies things actually work
explainer	Creates documentation humans can read
task-generator	Turns PRDs into atomic tasks
high-level-advisor	Cuts through decision paralysis
independent-thinker	Challenges assumptions (respectfully)
memory	Maintains cross-session context
skillbook	Manages learned patterns
retrospective	Extracts learnings after the fact
devops	Handles CI/CD and deployment
roadmap	Defines strategic direction
security	Threat modeling and vulnerability assessment
pr-comment-responder	Handles bot review feedback

What's Under the Hood

Multi-Agent Impact Analysis Framework

When you're planning something big, the new impact analysis framework coordinates reviews across domains. Architect reviews the design. Security checks for vulnerabilities. DevOps validates the pipeline impact. All before you write a single line of code.

Shared Template System

Behind the scenes, we've built a template system that generates platform-specific agent files from shared sources. This means when we improve an agent, the improvement lands everywhere.

MCP Config Sync

For Claude Code users, there's a new utility for keeping your MCP server configurations in sync. It's one of those "why didn't this exist before" tools.

Pre-Commit Architecture

Git hooks that actually help. Markdown linting, security scanning, the works—all running automatically before your commits go through.

Getting Started

Install: Run the one-liner above
Pick your platform: Claude Code, VS Code, or Copilot CLI
Start with orchestrator: It'll route you to the right agent

For detailed installation options, see the installation guide.

What's Next

This is v0.0.1, which means we're just getting started. We're staying in v0.x.x until the agents stabilize—expect things to evolve. The roadmap has the full picture, but here's what's coming:

2-variant consolidation: Reduce maintenance burden across platforms
Pre-PR Security Gate: Auto-route infrastructure changes to security review
Skill Management: Agents that actually learn from your codebase

Contributors

This release exists because of:

@rjmurillo — Primary author
@Copilot — Assisted with Phase 1 & 2 fixes

Full Changelog

See the complete commit history for all the details.

Key PRs:

#52 - MCP config sync utility
#50 - Phase 3 process improvements
#49 - Cross-document validation
#47 - Security and documentation fixes
#46 - ROOT delegation model
#43 - Shared template system
#41 - Unified install script
#40 - Multi-Agent Impact Analysis
#32 - Ideation workflow & CodeRabbit optimization

Happy automating! 🎉

Releases: rjmurillo/ai-agents