Releases: rjmurillo/ai-agents
AI Agents v0.3.0: Agents that remember
AI Agents v0.3.0 brings three new agents, a rebuilt memory system, and major prompt rewrites across the board. Your agents now retrieve project context before making decisions, cite their sources with confidence scores, and traverse a knowledge graph to find connections that keyword search misses. We also added 10+ new skills and 27 shared memories you can reference in your own agent configurations.
This release includes 77 commits and 139 closed issues. Let's walk through what changed and why it matters.
Released: February 2026
Table of Contents
- Three new agents
- Smarter orchestrator and implementer
- Agents that read before they think
- Memories that cite their sources
- New skills for your workflows
- Shared memories for your agents
- Template system improvements
- Breaking changes
- Under the hood
- All changes
- Contributors
- What's next
Three new agents
We added three agents to the catalog, bringing the total to 21. All three are available on every platform (VS Code, Copilot CLI, Claude Code).
Debug is a diagnostic specialist for root cause analysis. Point it at a failing test, a stack trace, or a misbehaving endpoint and it traces the problem systematically instead of guessing.
Backlog generator proactively discovers work when agent slots are idle. It analyzes open issues, PR status, and code health to produce 3-5 sized, actionable tasks. Unlike the task-decomposer (which breaks down existing work items), the backlog generator finds new work that needs doing.
Janitor handles codebase hygiene: dead code detection, unused import cleanup, and consistency enforcement. It takes care of the tedious maintenance that accumulates between features.
See PR #1105 for all three agents.
Smarter orchestrator and implementer
The two agents you use most got significant prompt rewrites.
The orchestrator gained retrieval-led reasoning, parallel execution support, and tighter handoff protocols. It auto-invokes context retrieval at the start of each task so downstream agents work with current project state. The parallel execution harness lets multiple agents work on independent tasks simultaneously with proper coordination (PR #1006, PR #1090).
The implementer now runs pre-push quality checks inline instead of waiting for CI. It validates code quality, runs tests, and checks linting before you push. The prompt was expanded with explicit coding standards, testability requirements, and atomic commit guidance (PR #1102).
We also renamed two agents for clarity: planner became milestone-planner and task-generator became task-decomposer. The old names still work but will be removed in a future release.
Agents that read before they think
This is the biggest behavioral change in v0.3.0. Every agent phase, from orchestration through implementation, now retrieves current information before making decisions. Pre-training knowledge is the last resort, never the default.
In practice, this means your agents make decisions based on what your project actually looks like today, not what the model learned during training. The orchestrator auto-invokes context retrieval. Agent prompts migrated from cloudmcp-manager to the unified MemoryRouter (ADR-037), which centralizes prompt management and enables memory-aware prompt construction.
We also made the MemoryRouter 26x faster. Search dropped from ~260ms to ~10ms, so agents spend less time waiting on context retrieval (PR #1044).
See PR #1110 and PR #1090 for retrieval-led reasoning. Prompt migration in PR #1046.
Memories that cite their sources
We rebuilt the memory system from the ground up. Every memory now carries structured citations that track where the knowledge came from, when it was captured, and how reliable it is. A verification pipeline validates citations against their original sources.
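As a rough illustration of what a structured citation might carry, the sketch below models the source, capture time, and reliability described above, and checks a citation against the file it points to. The field names (`source_path`, `line`, `snippet`, `captured_at`, `confidence`) and the `verify` helper are hypothetical and are not taken from the project's actual schema or verification pipeline.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record; field names are illustrative only."""
    source_path: str       # where the knowledge came from
    line: int              # 1-based line number in the source file
    snippet: str           # text expected on that line
    captured_at: datetime  # when the memory was captured
    confidence: float      # 0.0 (unreliable) to 1.0 (reliable)


def verify(citation: Citation, root: Path) -> bool:
    """Return True if the cited line still contains the recorded snippet."""
    path = root / citation.source_path
    if not path.is_file():
        return False
    lines = path.read_text(encoding="utf-8").splitlines()
    if not 1 <= citation.line <= len(lines):
        return False
    return citation.snippet in lines[citation.line - 1]
```

A citation whose file has moved or whose line has changed fails this check, which is the kind of staleness a verification step is meant to surface.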
A new graph traversal module connects related memories so agents can walk the knowledge graph to find relevant context. Each edge carries a confidence score based on citation quality and usage frequency. An agent researching "retry patterns" can discover related memories about "resilience" and "circuit breakers" automatically.
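To make the "retry patterns" example concrete, here is a minimal sketch of a confidence-weighted walk over a memory graph. The graph contents, the `related` function, and the choice to multiply edge confidences along a path are all assumptions made for illustration; the release notes do not say how the real module combines scores.

```python
from collections import deque

# Hypothetical graph: memory name -> list of (neighbor, edge confidence).
GRAPH = {
    "retry patterns": [("resilience", 0.9), ("logging", 0.3)],
    "resilience": [("circuit breakers", 0.8)],
    "circuit breakers": [],
    "logging": [],
}


def related(graph, start, min_confidence=0.5):
    """Walk outward from `start`, keeping nodes reachable by a path whose
    multiplied edge confidences stay at or above `min_confidence`."""
    best = {start: 1.0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor, edge_conf in graph.get(node, []):
            score = best[node] * edge_conf
            # Only revisit a node when a strictly better path is found.
            if score >= min_confidence and score > best.get(neighbor, 0.0):
                best[neighbor] = score
                queue.append(neighbor)
    best.pop(start)
    return best
```

With this sample graph, `related(GRAPH, "retry patterns")` reaches "resilience" and "circuit breakers" but drops "logging", because its edge confidence falls below the threshold.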
We unified four separate memory interfaces (Serena, Forgetful, file-based, and in-context) into a single router. You no longer need to think about which backend to use. The MemoryRouter picks the right one based on the query.
Bidirectional sync between Serena and Forgetful keeps memories consistent across both backends. Changes in one system propagate to the other automatically.
See PR #1045, PR #1009, and PR #1103 for citations. Graph traversal in PR #1013, PR #1019, and PR #1104. Interface unification in PR #1007.
New skills for your workflows
We added 10+ new skills in .claude/skills/ that you can use in your own agent configurations:
- Buy vs. Build Framework evaluates build, buy, partner, or defer decisions with TCO analysis and vendor scoring
- Code Qualities Assessment scores cohesion, coupling, encapsulation, testability, and non-redundancy with calibrated rubrics
- CVA Analysis (Commonality/Variability Analysis) discovers abstractions systematically before picking patterns
- Context Optimizer analyzes skill content for optimal placement and compresses markdown by 60-80%
- Security Scan detects CWE-22 (path traversal) and CWE-78 (command injection) patterns before PR submission
- Doc Coverage finds missing documentation in code and project files
- Git Advanced Workflows handles rebasing, cherry-picking, bisect, worktrees, and reflog
- Memory Enhancement manages citations, verifies code references, and tracks confidence scores
- Session End validates and completes session logs with pre-commit checks
- Style Enforcement validates code against .editorconfig, StyleCop, and project conventions
We also updated 34 existing skills to the v2.0 compliance standard with validated YAML frontmatter and consistent structure (PR #1084).
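If you want a quick local check that a skill file has frontmatter at all, something like the following sketch may help. It uses only the standard library and handles flat `key: value` pairs, so it is not a real YAML parser, and the required keys (`name`, `description`) are an assumption rather than the actual v2.0 schema.

```python
REQUIRED_KEYS = {"name", "description"}  # assumption, not the real v2.0 schema


def parse_frontmatter(text: str) -> dict:
    """Parse simple `key: value` pairs between leading `---` delimiters.

    Handles flat scalars only; a real validator would use a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    raise ValueError("missing closing '---'")


def missing_keys(text: str) -> set:
    """Return the required keys that are absent from the frontmatter."""
    return REQUIRED_KEYS - set(parse_frontmatter(text))
```

An empty set from `missing_keys` means every key assumed here is present; a `ValueError` means the file has no well-formed frontmatter block.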
Shared memories for your agents
We added 27 new memories in .serena/memories/ that your agents can reference for project-specific knowledge. Highlights include:
- retrieval-led-reasoning documents when and how to retrieve context before reasoning
- passive-context-vs-skills-vercel-research captures Vercel's findings on passive context achieving 100% pass rates vs 53-79% for skills
- claude-code-agent-teams covers parallel multi-agent execution patterns
- buy-vs-build-framework-skill captures the decision framework for build/buy/partner/defer
- quality-gates-bypass-enforcement documents how to prevent agents from bypassing quality checks
- rootcause-escape-hatch-misuse documents patterns for when agents misuse escape hatches
These memories are available to any agent using Serena. Point your agents at the memory-index to discover what is available.
Template system improvements
The template system in templates/ gained several improvements:
Toolset abstraction (templates/toolsets.yaml) centralizes tool declarations. Instead of duplicating tool lists across every agent template, agents now reference named toolsets. This makes it easier to add or modify tools across all agents at once (PR #1036).
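The idea of referencing named toolsets can be sketched in a few lines. The toolset names, tool names, and agent entries below are hypothetical, and the real `templates/toolsets.yaml` may be structured differently. The sketch only shows how a named reference might expand into a concrete tool list.

```python
# Hypothetical toolsets and agents; the real templates/toolsets.yaml may differ.
TOOLSETS = {
    "read-only": ["read", "search"],
    "editing": ["read", "search", "edit"],
}

AGENTS = {
    "analyst": {"toolsets": ["read-only"]},
    "implementer": {"toolsets": ["editing"], "tools": ["run-tests"]},
}


def resolve_tools(agent: dict, toolsets: dict) -> list:
    """Expand named toolsets into a de-duplicated, order-preserving tool list."""
    tools = []
    for name in agent.get("toolsets", []):
        tools.extend(toolsets[name])  # KeyError flags an unknown toolset name
    tools.extend(agent.get("tools", []))  # agent-specific extras come last
    return list(dict.fromkeys(tools))
```

Adding a tool to the "editing" entry changes the resolved list for every agent that references it, which is the maintenance benefit described above.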
Consensus mechanisms give agents a structured way to resolve disagreements during multi-agent workflows like ADR debates (PR #1035).
Agent template generation (build/Generate-Agents.ps1) now produces agents for all three platforms from shared templates. Edit the template once, regenerate, and all platforms update.
Breaking changes
Agent renames
What changed: planner renamed to milestone-planner. task-generator renamed to task-decomposer.
Impact: References to the old agent names will stop working once you update.
Migration: Update any custom configurations or scripts that reference planner or task-generator to use the new names. See PR #1105.
Retrieval-led reasoning is now mandatory
What changed: All agent phases must retrieve context before reasoning.
Impact: Agents that skip retrieval will produce lower-quality results because they rely on pre-training instead of current project state.
Migration: Update custom agent configurations to include retrieval directives. The default agent templates already include this. See PR #1110.
Skill v2.0 frontmatter required
What changed: Skills must include validated YAML frontmatter.
Impact: Skills without frontmatter ...
v0.2.0
AI Agents v0.2.0: Python-First, Security-Strong 🐍🔒
Happy 2026! We're excited to bring you AI Agents v0.2.0, our biggest release yet with 56 pull requests delivering foundational improvements that set the stage for scalable, secure multi-agent development.
This release marks a strategic shift in our technical direction while maintaining our commitment to quality and developer experience. From migrating to Python-first architecture to expanding security coverage to 45+ vulnerability patterns, v0.2.0 brings substantial improvements across the board.
Released: January 20, 2026
Highlights: Python Migration • Security Expansion • Memory System Bootstrap • ADR Enforcement
Table of Contents
- Python-First Architecture Migration
- Security Enhancements
- Memory System Bootstrap
- Infrastructure & CI/CD
- Session Protocol Improvements
- Engineering Knowledge System
- Breaking Changes
- All Changes
- Contributors
Python-First Architecture Migration
The big news: We've officially migrated from PowerShell-first to Python-first architecture (ADR-042). This strategic decision aligns with the AI/ML ecosystem where Python dominates, giving us better library access, broader community support, and improved cross-platform compatibility.
What This Means
- New scripts default to Python (.py) for better ecosystem integration
- Existing PowerShell scripts are grandfathered, with no forced rewrites
- Gradual migration path as we touch existing code
- Better AI agent integration with native Python libraries
The migration happened in phases, starting with security tooling and expanding to installation scripts. All functionality preserved, all tests passing. 🎉
See PR #967 for the complete implementation.
Security Enhancements
Security got a major boost in v0.2.0 with 45 CWE patterns and full OWASP Agentic AI Top 10 coverage. Our security agent now catches:
New Detection Capabilities
- Prompt injection attacks (CWE-1236)
- Training data poisoning (CWE-345)
- Model theft via API (CWE-212)
- Sensitive data exposure (CWE-200, CWE-532)
- Insecure output handling (CWE-79, CWE-89)
Plus improvements to existing checks:
- Environment variable security now fails fast like other checks
- CodeQL integration with multi-tier architecture
- Pre-commit hooks detect Bash in security-critical paths
The security agent provides detailed remediation guidance for every finding, including:
- Severity ratings based on context
- Specific code locations with line numbers
- Remediation steps with code examples
- Related CWE/OWASP references
See PR #978 and PR #980 for details.
Memory System Bootstrap
We completed a systematic knowledge migration moving 462 learnings from Serena to Forgetful, our semantic memory system. This gives agents:
Knowledge Graph Benefits
- Cross-session context - Agents remember patterns across sessions
- Semantic search - Find relevant learnings by concept, not keywords
- Relationship tracking - See how decisions connect
- Provenance tracking - Know where knowledge came from
The memory system now includes:
- Memory utility scripts for graph connectivity analysis
- Improved markdown navigation with automatic linking
- Merge mode for idempotent database imports
- Corruption recovery tooling
This foundation enables agents to build on past learnings instead of starting fresh every session.
See PR #968 and PR #888 for implementation.
Bootstrapping Your Database
Want to give your agents the same knowledge base that powers this project? You can import all 462 learnings into your local Forgetful installation:
# Import the complete knowledge base
pwsh scripts/forgetful/Import-ForgetfulMemories.ps1
The import script offers three merge modes:
- Replace (default): Merges with existing data, updating any conflicts
- Skip: Only adds new records, preserving your existing data
- Fail: Aborts if any duplicates are found (strict mode)
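The three modes can be modeled as a small function over records keyed by ID. This is a sketch of the semantics described in the list above, not the import script's implementation. The `merge` function and its lowercase mode strings are hypothetical.

```python
def merge(existing: dict, incoming: dict, mode: str = "replace") -> dict:
    """Merge records keyed by ID according to a merge mode."""
    if mode not in {"replace", "skip", "fail"}:
        raise ValueError(f"unknown merge mode: {mode!r}")
    duplicates = existing.keys() & incoming.keys()
    if mode == "fail" and duplicates:
        raise ValueError(f"duplicate IDs: {sorted(duplicates)}")
    merged = dict(existing)
    for key, record in incoming.items():
        if key in duplicates and mode == "skip":
            continue  # keep the existing record untouched
        merged[key] = record  # new record, or "replace" overwrites a conflict
    return merged
```

Under these semantics, repeating a "replace" or "skip" merge with the same input gives the same result, which matches the idempotent behavior the release notes describe.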
What you get:
- 463 memories covering architecture decisions, security patterns, and agent workflows
- 1,385 memory links showing relationships between concepts
- 26 projects organizing knowledge by domain
- 24 entities (people, tools, frameworks)
- Complete provenance tracking for all learnings
Safe and idempotent: You can run the import multiple times without data loss. The script validates all imports and provides detailed statistics on what was added or updated.
# Import with custom merge behavior
pwsh scripts/forgetful/Import-ForgetfulMemories.ps1 -MergeMode Skip
# Import specific files only
pwsh scripts/forgetful/Import-ForgetfulMemories.ps1 -InputFiles @('.forgetful/exports/2026-01-19.json')
After import, your agents will have instant access to the same patterns and learnings that went into building the v0.2.0 release.
Infrastructure & CI/CD
CodeQL Security Analysis
Integrated CodeQL with multi-tier architecture supporting Python and GitHub Actions. The system:
- Detects 45+ security vulnerabilities automatically
- Provides actionable SARIF output
- Caches databases for faster subsequent runs
- Handles language detection automatically
See PR #954.
GitHub Actions Testing
Added local validation infrastructure so you can test workflows before pushing:
- Eliminates trial-and-error CI cycles
- Catches YAML errors locally
- Validates action syntax and parameters
- Supports Docker-based execution
See PR #925.
Large PR Handling
Our AI review action now gracefully handles PRs with 300+ files through:
- Intelligent chunking strategies
- Graceful degradation for massive changes
- Resource management improvements
- Clear feedback when limits are exceeded
See PR #891.
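For readers curious what chunking a large PR can look like, the simplest possible strategy is fixed-size batches of changed files, sketched below. The `chunk_files` helper and its default batch size are hypothetical, and the review action's actual strategies may be more sophisticated.

```python
from typing import Iterator, List


def chunk_files(paths: List[str], max_per_chunk: int = 50) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each at most `max_per_chunk` long."""
    if max_per_chunk < 1:
        raise ValueError("max_per_chunk must be positive")
    for start in range(0, len(paths), max_per_chunk):
        yield paths[start:start + max_per_chunk]
```

Each batch can then be reviewed independently, so a very large PR no longer has to fit in a single request.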
Session Protocol Improvements
Session management got more robust with verification-based enforcement:
What's New
- Investigation-only validator (ADR-034 Phase 1) allows analysis sessions to skip QA
- Session-start gates verify protocol compliance before work begins
- Automatic session log creation with schema validation
- 650 orphaned files recovered and properly cataloged
The session protocol now catches compliance issues before they become merge conflicts, saving hours of rework.
Engineering Knowledge System
Added 5 new skills capturing engineering best practices:
- Three Horizons Framework - Strategic planning model
- Wardley Mapping - Value chain visualization
- Systems Archetypes - Recurring system patterns
- Team Topologies - Organizational design patterns
- Second-Order Thinking - Consequence analysis
Each skill includes:
- Comprehensive documentation
- Real-world examples
- Integration tests
- Usage guidelines
These skills help agents make better architectural decisions by applying proven frameworks.
See PR #977.
Breaking Changes
ADR-042: Python-First Policy
Impact: New scripts must be written in Python (.py) unless there's a compelling reason for PowerShell.
Migration: Existing PowerShell scripts are grandfathered. You only need to switch if you're creating new scripts or substantially rewriting existing ones.
Rationale: Python's dominance in AI/ML ecosystem, better library support, improved cross-platform compatibility.
See ADR-042 for complete details.
Skill-Installer Replaces Installation Scripts
Impact: Direct invocation of Install-*.ps1 scripts is deprecated in favor of the skill-installer pattern.
Migration:
# Old way
./scripts/forgetful/Install-ForgetfulLinux.ps1
# New way
.claude/skills/skill-installer/Invoke-SkillInstaller.ps1 -SkillName forgetful
Benefit: Version pinning, dependency management, consistent installation experience.
See PR #962 and documentation.
All Changes
Features ✨
- Python-first architecture migration (ADR-042) #967
- Security: 45 CWE patterns + OWASP Agentic Top 10 #978
- Memory: 462 learnings migrated to Forgetful #968
- CodeQL security analysis integration #954
- Engineering knowledge skills (5 frameworks) #977
- Reflect skill with auto-learning hook #908
- Investigation-only session validator #931
- Local GitHub Actions validation [#925](https://github.c...
AI Agents v0.0.1: The Beginning 🚀
Welcome to the very first release of AI Agents! We've been building this system to solve a problem we kept running into: managing AI coding assistants across multiple platforms is a mess. Different configs, different agents, different capabilities—it was chaos.
So we fixed it.
Highlights
🤖 One Agent System, Three Platforms
The heart of this release is the unified multi-agent system. Whether you're using VS Code, GitHub Copilot CLI, or Claude Code, you now have access to the same 18 specialized agents with consistent behavior.
No more context-switching between tools. No more "wait, does Copilot have that agent?" moments. Just pick your platform and get to work.
🛡️ Security-First Design
We take security seriously. This release includes comprehensive CWE-78 remediation across all agent infrastructure. Every shell interaction, every hook, every script has been reviewed and hardened.
The result? An agent system you can actually trust with your codebase.
⚡ One-Line Installation
Getting started should be easy:
irm https://raw.githubusercontent.com/rjmurillo/ai-agents/main/install.ps1 | iex
That's it. Select your platform, and you're ready to go.
The Agent Catalog
We've built 18 agents, each with a specific job. Here's what's in the box:
| Agent | What It Does |
|---|---|
| orchestrator | Routes tasks to the right specialist |
| implementer | Writes production-quality code |
| analyst | Investigates problems and gathers evidence |
| architect | Guards architectural decisions |
| planner | Breaks epics into milestones |
| critic | Validates plans before you waste time |
| qa | Verifies things actually work |
| explainer | Creates documentation humans can read |
| task-generator | Turns PRDs into atomic tasks |
| high-level-advisor | Cuts through decision paralysis |
| independent-thinker | Challenges assumptions (respectfully) |
| memory | Maintains cross-session context |
| skillbook | Manages learned patterns |
| retrospective | Extracts learnings after the fact |
| devops | Handles CI/CD and deployment |
| roadmap | Defines strategic direction |
| security | Threat modeling and vulnerability assessment |
| pr-comment-responder | Handles bot review feedback |
What's Under the Hood
Multi-Agent Impact Analysis Framework
When you're planning something big, the new impact analysis framework coordinates reviews across domains. Architect reviews the design. Security checks for vulnerabilities. DevOps validates the pipeline impact. All before you write a single line of code.
Shared Template System
Behind the scenes, we've built a template system that generates platform-specific agent files from shared sources. This means when we improve an agent, the improvement lands everywhere.
MCP Config Sync
For Claude Code users, there's a new utility for keeping your MCP server configurations in sync. It's one of those "why didn't this exist before" tools.
Pre-Commit Architecture
Git hooks that actually help. Markdown linting, security scanning, the works—all running automatically before your commits go through.
Getting Started
- Install: Run the one-liner above
- Pick your platform: Claude Code, VS Code, or Copilot CLI
- Start with orchestrator: It'll route you to the right agent
For detailed installation options, see the installation guide.
What's Next
This is v0.0.1, which means we're just getting started. We're staying in v0.x.x until the agents stabilize—expect things to evolve. The roadmap has the full picture, but here's what's coming:
- 2-variant consolidation: Reduce maintenance burden across platforms
- Pre-PR Security Gate: Auto-route infrastructure changes to security review
- Skill Management: Agents that actually learn from your codebase
Contributors
This release exists because of:
- @rjmurillo — Primary author
- @Copilot — Assisted with Phase 1 & 2 fixes
Full Changelog
See the complete commit history for all the details.
Key PRs:
- #52 - MCP config sync utility
- #50 - Phase 3 process improvements
- #49 - Cross-document validation
- #47 - Security and documentation fixes
- #46 - ROOT delegation model
- #43 - Shared template system
- #41 - Unified install script
- #40 - Multi-Agent Impact Analysis
- #32 - Ideation workflow & CodeRabbit optimization
Happy automating! 🎉