All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.5.8 - 2026-02-20
- `ResolvedModel` type and provider gateway support in `resolveModel()`: resolves `ModelRef` strings (e.g., `openrouter/openai/gpt-5.3`) through configured provider gateways with `baseUrl` and `authTokenEnv`
- Provider and model validation in `validateConfig()`: validates provider types (`native`/`gateway`), required gateway fields (`baseUrl`), and model reference format at config load time
- Provider environment variables now threaded through all agent spawn commands (`sling`, `coordinator`, `supervisor`, `monitor`): gateway `authTokenEnv` values are passed to spawned agent processes
- Auto-infer mulch domains from file scope in `overstory sling`: `inferDomainsFromFiles()` maps file paths to domains (e.g., `src/commands/*.ts` → `cli`, `src/agents/*.ts` → `agents`) instead of always using configured defaults
- Outcome flags for `MulchClient.record()`: `--outcome-status`, `--outcome-duration`, `--outcome-test-results`, `--outcome-agent` for structured outcome tracking
- File-scoped search in `MulchClient.search()`: `--file` and `--sort-by-score` options for targeted expertise queries
- PostToolUse Bash hook in hooks template and init: runs `mulch diff` after git commits to auto-detect expertise changes
- Builder completion protocol includes outcome data flags (`--outcome-status success --outcome-agent $OVERSTORY_AGENT_NAME`)
- Lead and supervisor agents get file-scoped mulch search capability (`mulch search <query> --file <path>`)
- Overlay quality gates include outcome flags for mulch recording
- `limit` option added to `MailStore.getAll()`: dashboard now fetches only the most recent messages instead of the full mailbox
- Persistent DB connections across dashboard poll ticks: `SessionStore`, `EventStore`, `MailStore`, and `MetricsStore` connections are now opened once and reused, eliminating per-tick open/close overhead
- Test suite grew from 1916 to 1996 tests across 73 files (4960 expect() calls)
- Zombie agent recovery: `updateLastActivity` now recovers agents from "zombie" state when hooks prove they're alive (previously only recovered from "booting")
- Dashboard `.repeat()` crash when negative values were passed: now clamps repeat count to minimum of 0
- Set-based tmux session lookup in `status.ts` replacing O(n) array scans with O(1) Set membership checks
- Subprocess cache in `status.ts` preventing redundant `tmux list-sessions` calls during a single status gather
- Null-runId sessions (coordinator) now included in run-scoped status and dashboard views (previously filtered out when `--all` was not specified)
- Sparse file used in logs doctor test to prevent timeout on large log directory scans
- Beacon submission reliability: replaced fixed sleep with poll-based TUI readiness check (PR #19, thanks @dmfaux!)
- Biome formatting in hooks-deployer test and sling
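
The `.repeat()` fix in this release matters because `String.prototype.repeat` throws a `RangeError` when given a negative count. The following is a minimal sketch of the clamping idea only; `renderBar` and its parameters are hypothetical names chosen for illustration and are not taken from the dashboard code.

```typescript
// Hypothetical helper: draws a fixed-width text bar.
// It only illustrates clamping; the real dashboard code may differ.
function renderBar(filled: number, width: number): string {
  // Clamp both counts to >= 0 so repeat() never receives a negative value.
  const done = Math.max(0, Math.min(filled, width));
  const rest = Math.max(0, width - done);
  return "#".repeat(done) + "-".repeat(rest);
}

console.log(renderBar(3, 10)); // ###-------
console.log(renderBar(-2, 10)); // ----------
console.log(renderBar(15, 10)); // ##########
```

Clamping at the call site keeps the rendering code free of try/catch and makes out-of-range inputs degrade to an empty or full bar instead of crashing.
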
0.5.7 - 2026-02-19
- `ModelAlias`, `ModelRef`, and `ProviderConfig` types in `types.ts`: foundation for multi-provider model routing (`native` and `gateway` provider types with `baseUrl` and `authTokenEnv` configuration)
- `providers` field in `OverstoryConfig`: `Record<string, ProviderConfig>` for configuring model providers per project
- `resolveModel()` signature updated to accept `ModelRef` (provider-qualified strings like `openrouter/openai/gpt-5.3`) alongside simple `ModelAlias` values
- `--self` flag for `overstory costs`: parse the current orchestrator session's Claude Code transcript directly, bypassing metrics.db, useful for real-time cost visibility without agent infrastructure
- `run_id` column added to `metrics.db` sessions table: enables `overstory costs --run <id>` filtering to work correctly; includes automatic migration for existing databases
- Phase-aware `buildCompletionMessage()` in watchdog daemon: generates targeted completion nudge messages based on worker capability composition (single-capability batches get phase-specific messages like "Ready for next phase", mixed batches get a summary with breakdown)
- Test suite grew from 1892 to 1916 tests across 73 files (4866 expect() calls)
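
To make the provider-qualified `ModelRef` format more concrete, here is a rough sketch of how a string such as `openrouter/openai/gpt-5.3` could be split into a provider name and a model id. The type shapes, the `parseModelRef` helper, the example URL, and the rule "provider is everything before the first slash" are all assumptions for illustration; the project's actual `types.ts` and `resolveModel()` may differ.

```typescript
// Assumed shapes, loosely based on the field names mentioned in the entries above.
type ProviderConfig =
  | { type: "native" }
  | { type: "gateway"; baseUrl: string; authTokenEnv?: string };

interface ParsedModelRef {
  provider: string;
  model: string;
}

// Assumption: the provider is the segment before the first "/",
// and the remainder (which may contain more slashes) is the model id.
function parseModelRef(ref: string): ParsedModelRef | null {
  const slash = ref.indexOf("/");
  // No slash means a plain alias; a leading or trailing slash is malformed.
  if (slash <= 0 || slash === ref.length - 1) return null;
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

// Hypothetical provider table; the URL and env var name are placeholders.
const providers: Record<string, ProviderConfig> = {
  openrouter: {
    type: "gateway",
    baseUrl: "https://gateway.example/api",
    authTokenEnv: "EXAMPLE_GATEWAY_TOKEN",
  },
};

const parsed = parseModelRef("openrouter/openai/gpt-5.3");
console.log(parsed, parsed ? providers[parsed.provider] : undefined);
console.log(parseModelRef("sonnet")); // null: treated as a simple alias
```
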
0.5.6 - 2026-02-18
- Root-user pre-flight guard on all agent spawn commands (`sling`, `coordinator start`, `supervisor start`, `monitor start`): blocks spawning when running as UID 0, since the `claude` CLI rejects `--dangerously-skip-permissions` as root causing tmux sessions to die immediately
- Unmerged branch safety check in `overstory worktree clean`: skips worktrees with unmerged branches by default, warns about skipped branches, and requires `--force` to delete them
- `.overstory/README.md` generation during `overstory init`: explains the directory to contributors who encounter `.overstory/` in a project, whitelisted in `.gitignore`
- `overstory monitor start` now gates on `watchdog.tier2Enabled` config flag: throws a clear error when Tier 2 is disabled instead of silently proceeding
- `overstory coordinator start --monitor` respects `tier2Enabled`: skips monitor auto-start with a message when disabled
- `sendKeys` now distinguishes "tmux server not running" from "session not found": provides actionable error messages for each case (e.g., root-user hint for server-not-running)
- Lead agent definition (`agents/lead.md`) reframed as coordinator-not-doer: emphasizes the lead's role as a delegation specialist rather than an implementer
- Test suite grew from 1868 to 1892 tests across 73 files (4807 expect() calls)
- Biome formatting in merged builder code
0.5.5 - 2026-02-18
- `overstory status` now scopes to the current run by default with `--all` flag to show all runs: `gatherStatus()` filters sessions by `runId` when present
- `overstory dashboard` now scopes all panels to the current run by default with `--all` flag to show data across all runs
- `config.local.yaml` support for machine-specific configuration overrides: values in `config.local.yaml` are deep-merged over `config.yaml`, allowing per-machine settings (model overrides, paths, watchdog intervals) without modifying the tracked config file (PR #9)
- PreToolUse hooks template now includes a universal `git push` guard: blocks all `git push` commands for all agents (previously only blocked push to canonical branches)
- Watchdog daemon tick now detects when all agents in the current run have completed and auto-reports run completion
- Lead agents now stream `merge_ready` messages per-builder as each completes, instead of batching all merge signals: enables earlier merge pipeline starts
- Added `issue-reviews` and `pr-reviews` skills for reviewing GitHub issues and pull requests from within Claude Code
- Test suite grew from 1848 to 1868 tests across 73 files (4771 expect() calls)
- `overstory sling` now uses `resolveModel()` for config-level model overrides: previously ignored `models:` config section when spawning agents
- `overstory doctor` dependency check now detects `bd` CGO/Dolt backend failures: catches cases where `bd` binary exists but crashes due to missing CGO dependencies (PR #11)
- Biome line width formatting in `src/doctor/consistency.ts`
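
The `config.local.yaml` entry in this release describes local values being deep-merged over the tracked config. A minimal sketch of that kind of merge is shown below, operating on already-parsed objects. The `deepMerge` helper, the config keys, and the choice that arrays and scalars replace the base value outright are assumptions for illustration, not the project's confirmed behavior.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Assumed semantics: nested objects merge key by key;
// scalars and arrays in the override replace the base value outright.
function deepMerge(base: Plain, override: Plain): Plain {
  const result: Plain = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return result;
}

// Hypothetical config keys, for illustration only.
const tracked: Plain = {
  models: { builder: "sonnet", lead: "opus" },
  watchdog: { intervalMs: 30000 },
};
const local: Plain = { models: { builder: "haiku" } };

console.log(JSON.stringify(deepMerge(tracked, local)));
// {"models":{"builder":"haiku","lead":"opus"},"watchdog":{"intervalMs":30000}}
```

Because the merge returns a new object, the tracked config stays untouched in memory, which mirrors the stated goal of not modifying the tracked file.
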
0.5.4 - 2026-02-17
- Reviewer-coverage doctor check in `overstory doctor`: warns when leads spawn builders without corresponding reviewers, reports partial coverage ratios per lead
- `merge_ready` reviewer validation in `overstory mail send`: advisory warning when sending `merge_ready` without reviewer sessions for the sender's builders
- Scout-before-builder warning in `overstory sling`: warns when a lead spawns a builder without having spawned any scouts first
- `parentHasScouts()` helper exported from sling for testability
- `overstory coordinator stop` now auto-completes the active run (reads `current-run.txt`, marks run completed, cleans up)
- `overstory log session-end` auto-completes the run when the coordinator exits (handles tmux window close without explicit stop)
- `.overstory/.gitignore` flipped from explicit blocklist to wildcard `*` + whitelist pattern: ignore everything, whitelist only tracked files (`config.yaml`, `agent-manifest.json`, `hooks.json`, `groups.json`, `agent-defs/`)
- `overstory prime` auto-heals `.overstory/.gitignore` on each session start: ensures existing projects get the updated gitignore
- `OVERSTORY_GITIGNORE` constant and `writeOverstoryGitignore()` exported from init.ts for reuse
- Test suite grew from 1812 to 1848 tests across 73 files (4726 expect() calls)
- Lead agent definition (`agents/lead.md`): scouts made mandatory (not optional), Phase 3 review made MANDATORY with stronger language, added `SCOUT_SKIP` failure mode, expanded cost awareness section explaining why scouts and reviewers are investments not overhead
- `overstory init` .gitignore now always overwrites (supports `--force` reinit and auto-healing)
- Hooks template (`templates/hooks.json.tmpl`): removed fragile `read -r INPUT; echo "$INPUT" |` stdin relay pattern; `overstory log` now reads stdin directly via `--stdin` flag
- `readStdinJson()` in log command: reads all stdin chunks for large payloads instead of only the first line
- Doctor gitignore structure check updated for wildcard+whitelist model
0.5.3 - 2026-02-17
- `models:` section in `config.yaml`: override the default model (`sonnet`, `opus`, `haiku`) for any agent role (coordinator, supervisor, monitor, etc.)
- `resolveModel()` helper in agent manifest; resolution chain: config override > manifest default > fallback
- Supervisor and monitor entries added to `agent-manifest.json` with model and capability metadata
- `overstory init` now seeds the default `models:` section in generated `config.yaml`
- Test suite grew from 1805 to 1812 tests across 73 files (4638 expect() calls)
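
The resolution chain listed for `resolveModel()` (config override > manifest default > fallback) can be sketched as a "first defined value wins" lookup. The function name `resolveModelSketch`, its parameters, and the choice of `sonnet` as the final fallback are assumptions for illustration; the changelog does not state the real signature or fallback.

```typescript
type ModelName = "sonnet" | "opus" | "haiku";

// Assumed fallback; the changelog does not say which model is the last resort.
const FALLBACK_MODEL: ModelName = "sonnet";

// First defined value wins: config override > manifest default > fallback.
function resolveModelSketch(
  role: string,
  configModels: Partial<Record<string, ModelName>>,
  manifestDefaults: Partial<Record<string, ModelName>>,
): ModelName {
  return configModels[role] ?? manifestDefaults[role] ?? FALLBACK_MODEL;
}

// Hypothetical manifest defaults, for illustration only.
const manifest: Partial<Record<string, ModelName>> = {
  coordinator: "opus",
  monitor: "sonnet",
};

console.log(resolveModelSketch("coordinator", { coordinator: "haiku" }, manifest)); // haiku
console.log(resolveModelSketch("coordinator", {}, manifest)); // opus
console.log(resolveModelSketch("scout", {}, manifest)); // sonnet
```
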
0.5.2 - 2026-02-17
- `--into <branch>` flag for `overstory merge`: target a specific branch instead of always merging to canonicalBranch
- `overstory prime` now records the orchestrator's starting branch to `.overstory/session-branch.txt` at session start
- `overstory merge` reads `session-branch.txt` as the default merge target when `--into` is not specified; resolution chain: `--into` flag > `session-branch.txt` > config `canonicalBranch`
- Test suite grew from 1793 to 1805 tests across 73 files (4615 expect() calls)
- Git push blocking for agents now blocks ALL `git push` commands (previously only blocked push to canonical branches); agents should use `overstory merge` instead
- Init-deployed hooks now include a PreToolUse Bash guard that blocks `git push` for the orchestrator's project
- Test cwd pollution in agents test afterEach: restored cwd to prevent cross-file pollution
0.5.1 - 2026-02-16
- `overstory agents discover`: discover and query agents by capability, state, file scope, and parent with `--capability`, `--state`, `--parent` filters and `--json` output
- Session insight analyzer (`src/insights/analyzer.ts`): analyzes EventStore data from completed sessions to extract structured patterns about tool usage, file edits, and errors for automatic mulch expertise recording
- Conflict history intelligence in merge resolver: tracks past conflict resolution patterns per file to skip historically-failing tiers and enrich AI resolution prompts with successful strategies
- INSIGHT recording protocol for agent definitions: read-only agents (scout, reviewer) use INSIGHT prefix for structured expertise observations; parent agents (lead, supervisor) record insights to mulch automatically
- Test suite grew from 1749 to 1793 tests across 73 files (4587 expect() calls)
- `session-end` hook now calls `mulch record` directly instead of sending `mulch_learn` mail messages: removes mail indirection for expertise recording
- Coordinator tests now always inject fake monitor/watchdog for proper isolation
0.5.0 - 2026-02-16
- `overstory feed`: unified real-time event stream across all agents with `--follow` mode for continuous polling, agent/run filtering, and JSON output
- `overstory logs`: query NDJSON log files across agents with level filtering (`--level`), time range queries (`--since`/`--until`), and `--follow` tail mode
- `overstory costs --live`: real-time token usage display for active agents
- `--monitor` flag for `coordinator start/stop/status`: manage the Tier 2 monitor agent alongside the coordinator
- Mulch recording as required completion gate for all agent types: agents must record learnings before session close
- Mulch learn extraction added to Stop hooks for orchestrator and all agents
- Scout-spawning made default in lead.md Phase 1 with parallel support
- Reviewer spawning made mandatory in lead.md
- Real-time token tracking infrastructure (`src/metrics/store.ts`, `src/commands/costs.ts`): live session cost monitoring via transcript JSONL parsing
- Test suite grew from 1673 to 1749 tests across 71 files (4460 expect() calls)
- Duplicate `feed` entry in CLI command router and help text
0.4.1 - 2026-02-16
- `overstory --completions <shell>`: shell completion generation for bash, zsh, and fish
- `--quiet` / `-q` global flag: suppress non-error output across all commands
- `overstory mail send --to @all`: broadcast messaging with group addresses (`@all`, `@builders`, `@scouts`, `@reviewers`, `@leads`, `@mergers`, etc.)
- Central `NO_COLOR` convention support (`src/logging/color.ts`): respects `NO_COLOR`, `FORCE_COLOR`, and `TERM=dumb` environment variables per https://no-color.org
- All ANSI color output now goes through centralized color module instead of inline escape codes
- Merge queue migrated from JSON file to SQLite (`merge-queue.db`) for durability and concurrent access
- Test suite grew from 1612 to 1673 tests across 69 files (4267 expect() calls)
- Freeze duration counter for completed/zombie agents in status and dashboard displays
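
As a rough illustration of the centralized color decision described in this release, the sketch below checks the three environment variables named above. The function name `shouldUseColor`, the precedence order (NO_COLOR first, then FORCE_COLOR, then `TERM=dumb`), and the treatment of `FORCE_COLOR=0` as "not forced" are assumptions; the actual `src/logging/color.ts` may order or interpret them differently.

```typescript
// Assumed precedence: NO_COLOR beats everything, then FORCE_COLOR, then TERM=dumb.
// The project's color module may order these differently.
function shouldUseColor(
  env: Record<string, string | undefined>,
  isTTY: boolean,
): boolean {
  const noColor = env["NO_COLOR"];
  // Per no-color.org, NO_COLOR counts only when present and not an empty string.
  if (noColor !== undefined && noColor !== "") return false;

  const forceColor = env["FORCE_COLOR"];
  // Assumption: any value other than "0" forces color on.
  if (forceColor !== undefined && forceColor !== "0") return true;

  if (env["TERM"] === "dumb") return false;
  return isTTY;
}

console.log(shouldUseColor({ NO_COLOR: "1" }, true)); // false
console.log(shouldUseColor({ FORCE_COLOR: "1", TERM: "dumb" }, false)); // true
console.log(shouldUseColor({ TERM: "dumb" }, true)); // false
```

Taking the environment as a parameter rather than reading a global keeps the decision easy to unit test.
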
0.4.0 - 2026-02-15
- `overstory doctor`: comprehensive health check system with 9 check modules (dependencies, config, structure, databases, consistency, agents, merge-queue, version, logs) and formatted output with pass/warn/fail status
- `overstory inspect <agent>`: deep per-agent inspection aggregating session data, metrics, events, and live tmux capture with `--follow` polling mode
- `--watchdog` flag for `coordinator start`: auto-starts the watchdog daemon alongside the coordinator
- `--debounce <ms>` flag for `mail check`: prevents excessive mail checking by skipping if called within the debounce window
- PostToolUse hook entry for debounced mail checking
- Automated failure recording in watchdog via mulch: records failure patterns for future reference
- Mulch learn extraction in `log session-end`: captures session insights automatically
- Mulch health checks in `overstory clean`: validates mulch installation and domain health during cleanup
- Test suite grew from 1435 to 1612 tests across 66 files (3958 expect() calls)
- Wire doctor command into CLI router and update command groups
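
The `--debounce <ms>` behavior added in this release ("skip if called within the debounce window") reduces to a timestamp comparison. The sketch below shows only that decision; `shouldCheckMail` is a hypothetical name, and how the last-check timestamp is persisted between separate CLI invocations is not described in the changelog, so it is left to the caller here.

```typescript
// Pure decision function. The caller is assumed to persist `lastCheckMs`
// somewhere that survives between CLI invocations (hypothetical).
function shouldCheckMail(
  lastCheckMs: number | null,
  nowMs: number,
  debounceMs: number,
): boolean {
  if (lastCheckMs === null) return true; // never checked before
  return nowMs - lastCheckMs >= debounceMs;
}

const now = Date.now();
console.log(shouldCheckMail(null, now, 2000)); // true
console.log(shouldCheckMail(now - 500, now, 2000)); // false: still inside the window
console.log(shouldCheckMail(now - 5000, now, 2000)); // true: window has elapsed
```
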
0.3.0 - 2026-02-13
- `overstory run` command: orchestration run lifecycle management (`list`, `show`, `complete` subcommands) with RunStore backed by sessions.db
- `overstory trace` command: agent/bead timeline viewing for debugging and post-mortem observability
- `overstory clean` command: cleanup worktrees, sessions, and artifacts with auto-cleanup on agent teardown
- Run tracking via `run_id` integrated into sling and clean commands
- `RunStore` in sessions.db for durable run state
- `SessionStore` (SQLite): migrated from sessions.json for concurrent access and crash safety
- Phase 2 CLI query commands and Phase 3 event persistence for the observability pipeline
- Project-scoped tmux naming (`overstory-{projectName}-{agentName}`) to prevent cross-project session collisions
- `ENV_GUARD` on all hooks: prevents hooks from firing outside overstory-managed worktrees
- Mulch-informed lead decomposition: leader agents use mulch expertise when breaking down tasks
- Mulch conflict pattern recording: merge resolver records conflict patterns to mulch for future reference
- New commands and flags for the mulch CLI wrapper
- `--json` parsing support with corrected types and flag spread
- `STEELMAN.md`: comprehensive risk analysis for agent swarm deployments
- Community files: CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md
- Package metadata (keywords, repository, homepage) for npm/GitHub presence
- Test suite grew from 912 to 1435 tests across 55 files (3416 expect() calls)
- Fix `isCanonicalRoot` guard blocking all worktree overlays when dogfooding overstory on itself
- Fix auto-nudge tmux corruption and deploy coordinator hooks correctly
- Fix 4 P1 issues: orchestrator nudge routing, bash guard bypass, hook capture isolation, overlay guard
- Fix 4 P1/P2 issues: ENV_GUARD enforcement, persistent agent state, project-scoped tmux kills, auto-nudge coordinator
- Strengthen agent orchestration with additional P1 bug fixes
- CLI commands grew from 17 to 20 (added run, trace, clean)
0.2.0 - 2026-02-13
- `overstory coordinator` command: persistent orchestrator that runs at project root, decomposes objectives into subtasks, dispatches agents via sling, and tracks batches via task groups
  - `start` / `stop` / `status` subcommands
  - `--attach` / `--no-attach` with TTY-aware auto-detection for tmux sessions
  - Scout-delegated spec generation for complex tasks
- Supervisor agent definition: per-project team lead (depth 1) that receives dispatch mail from coordinator, decomposes into worker-sized subtasks, manages worker lifecycle, and escalates unresolvable issues
- 7 base agent types (added coordinator + supervisor to existing scout, builder, reviewer, lead, merger)
- `overstory group` command: batch coordination (create/status/add/remove/list) with auto-close when all member beads issues complete, mail notification to coordinator on auto-close
- Session checkpoint save/restore for compaction survivability (`prime --compact` restores from checkpoint)
- Handoff orchestration (initiate/resume/complete) for crash recovery
- 8 protocol message types: `worker_done`, `merge_ready`, `merged`, `merge_failed`, `escalation`, `health_check`, `dispatch`, `assign`
- Type-safe `sendProtocol<T>()` and `parsePayload<T>()` for structured agent coordination
- JSON payload column with schema migration handling 3 upgrade paths
- `overstory nudge` command with retry (3x), debounce (500ms), and `--force` to skip debounce
- Auto-nudge on urgent/high priority mail send
- PreToolUse hooks mechanically block file-modifying tools (Write/Edit/NotebookEdit) for non-implementation agents (scout, reviewer, coordinator, supervisor)
- PreToolUse Bash guards block dangerous git operations (`push`, `reset --hard`, `clean -f`, etc.) for all agents
- Whitelist git add/commit for coordinator/supervisor capabilities while keeping git push blocked
- Block Claude Code native team/task tools (Task, TeamCreate, etc.) for all overstory agents; enforces overstory sling delegation
- ZFC principle: tmux liveness as primary signal, pid check as secondary, sessions.json as tertiary
- Descendant tree walking for process cleanup: `getPanePid()`, `getDescendantPids()`, `killProcessTree()` with SIGTERM → grace → SIGKILL
- Re-check zombies on every tick, handle investigate action
- Stalled state added to zombie reconciliation
- Builder agents send `worker_done` mail on task completion
- Overlay quality gates include worker_done signal step
- Prime activation context injection for bound tasks
- `MISSING_WORKER_DONE` failure mode in builder definition
- Switch sling from headless (`claude -p`) to interactive mode with tmux sendKeys beacon: hooks now fire, enabling mail, metrics, logs, and lastActivity updates
- Structured `buildBeacon()` with identity context and startup protocol
- Fix beacon sendKeys multiline bug (increase initial sleep, follow-up Enter after 500ms)
- `--verbose` flag for `overstory status`
- `--json` flag for `overstory sling`
- `--background` flag for `overstory watch`
- Help text for unknown subcommands
- `SUPPORTED_CAPABILITIES` constant and `Capability` type
- `overstory init` now deploys agent definitions (copies `agents/*.md` to `.overstory/agent-defs/`) via `import.meta.dir` resolution
- E2E lifecycle test validates full init → config → manifest → overlay pipeline on throwaway external projects
- Colocated tests with source files (moved from `__tests__/` to `src/`)
- Shared test harness: `createTempGitRepo()`, `cleanupTempDir()`, `commitFile()` in `src/test-helpers.ts`
- Replaced `Bun.spawn` mocks with real implementations in 3 test files
- Optimized test harness: 38.1s → 11.7s (-69%)
- Comprehensive metrics command test coverage
- E2E init-sling lifecycle test
- Test suite grew from initial release to 515 tests across 24 files (1286 expect() calls)
- 60+ bugs resolved across 8 dedicated fix sessions, covering P1 criticals through P4 backlog items:
  - Hooks enforcement: tool guard sed patterns now handle optional space after JSON colons
  - Status display: filter completed sessions from active agent count
  - Session lifecycle: move session recording before beacon send to fix booting → working race condition
  - Stagger delay (`staggerDelayMs`) now actually enforced between agent spawns
  - Hardcoded `main` branch replaced with dynamic branch detection in worktree/manager and merge/resolver
  - Sling headless mode fixes for E2E validation
  - Input validation, environment variable handling, init improvements, cleanup lifecycle
  - `.gitignore` patterns for `.overstory/` artifacts
  - Mail, merge, and worktree subsystem edge cases
- Agent propulsion principle: failure modes, cost awareness, and completion protocol added to all agent definitions
- Agent quality gates updated across all base definitions
- Test file paths updated from `__tests__/` convention to colocated `src/**/*.test.ts`
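
The type-safe protocol helpers listed in this release (`sendProtocol<T>()`, `parsePayload<T>()`) suggest a mapping from message type to payload shape. The sketch below shows one way such a mapping can be typed. Only three of the eight message type names are included, and every payload field, along with the `encodeProtocol` and `parsePayloadSketch` helpers, is invented for illustration rather than taken from the project.

```typescript
// Message type names come from the list above; payload fields are hypothetical.
interface PayloadMap {
  worker_done: { taskId: string; branch: string };
  merge_ready: { branch: string; filesModified: string[] };
  escalation: { reason: string };
}
type ProtocolType = keyof PayloadMap;

interface ProtocolMessage {
  type: ProtocolType;
  payload: string; // JSON-encoded, in the spirit of the "JSON payload column" entry
}

// The compiler rejects a payload that does not match the chosen message type.
function encodeProtocol<T extends ProtocolType>(
  type: T,
  payload: PayloadMap[T],
): ProtocolMessage {
  return { type, payload: JSON.stringify(payload) };
}

// Returns null when the message is not of the expected type.
// Note: the cast trusts the stored JSON; real code would validate the shape.
function parsePayloadSketch<T extends ProtocolType>(
  msg: ProtocolMessage,
  expected: T,
): PayloadMap[T] | null {
  if (msg.type !== expected) return null;
  return JSON.parse(msg.payload) as PayloadMap[T];
}

const msg = encodeProtocol("worker_done", { taskId: "task-123", branch: "agent/builder-1" });
console.log(parsePayloadSketch(msg, "worker_done")?.branch); // agent/builder-1
console.log(parsePayloadSketch(msg, "merge_ready")); // null
```
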
0.1.0 - 2026-02-12
- CLI entry point with command router (`overstory <command>`)
- `overstory init`: initialize `.overstory/` in a target project
- `overstory sling`: spawn worker agents in git worktrees via tmux
- `overstory prime`: load context for orchestrator or agent sessions
- `overstory status`: show active agents, worktrees, and project state
- `overstory mail`: SQLite-based inter-agent messaging (send/check/list/read/reply)
- `overstory merge`: merge agent branches with 4-tier conflict resolution
- `overstory worktree`: manage git worktrees (list/clean)
- `overstory log`: hook event logging (NDJSON + human-readable)
- `overstory watch`: watchdog daemon with health monitoring and AI-assisted triage
- `overstory metrics`: session metrics storage and reporting
- Agent manifest system with 5 base agent types (scout, builder, reviewer, lead, merger)
- Two-layer agent definition: base `.md` files (HOW) + dynamic overlays (WHAT)
- Persistent agent identity and CV system
- Hooks deployer for automatic worktree configuration
- beads (`bd`) CLI wrapper for issue tracking integration
- mulch CLI wrapper for structured expertise management
- Multi-format logging with secret redaction
- SQLite metrics storage for session analytics
- Full test suite using `bun test`
- Biome configuration for formatting and linting
- TypeScript strict mode with `noUncheckedIndexedAccess`