feat: subagents — on-demand + scheduled spawning with lifecycle management (#15) by mattmezza · Pull Request #44 · mattmezza/mpa

mattmezza · 2026-06-27T23:00:47Z

Summary

Implements issue #15 — Subagents: on-demand spawning, scheduling, and lifecycle management.

One execution primitive (AgentCore.run_subagent) reached by two trigger paths:

On demand — the new spawn_subagent tool, called by the agent mid-turn.
Scheduled — a subagent job fired by the scheduler.

A subagent runs the existing agent loop with system semantics (no goal
decomposition, memory extraction, reflection, or per-action approval prompts) —
the same path scheduled jobs already use — under a chosen persona, and returns a
structured result.

Execution model

Sync (default): the tool blocks, runs the sub-loop to completion, returns { summary, result }; the parent continues in the same turn.
Background (background: true): returns a run id immediately; the run executes off-turn and posts its result back to the originating chat.

Guardrails (config `subagents`)

Inherit-never-widen scope: a child's tools/skills/secrets are the intersection of the caller's scope and the requested persona's (narrow_scope). NEVER permission rules still apply at any depth.
Recursion depth cap (default 3).
Per-run step + token budgets (default 12 steps / 100k tokens) — a hard stop on the otherwise-unbounded loop.
Max concurrent background runs (default 3).

Lifecycle (list / status / cancel)

Telegram: /jobs lists active runs with inline Cancel buttons — monitor/cancel a long task from the phone, no web UI needed.
Admin UI: the Jobs tab gains a responsive subagent-runs card grid (persona, status, elapsed, current step, cancel), polled live; collapses to one column at phone width.
Runs are tracked in an in-memory SubagentRegistry (ephemeral by design — a restart clears them).

Acceptance criteria

The agent can spawn a subagent with a chosen persona and receive a structured result.
Background spawns report back asynchronously to the correct chat.
A background subagent can be monitored and cancelled from Telegram on mobile; the jobs view is usable at phone width.
Recursion depth and budgets are enforced; scope cannot be widened by a child.

Changes

core/subagents.py — SubagentRun, SubagentRegistry, narrow_scope.
core/agent.py — spawn_subagent tool, run_subagent primitive, budgeted/depth-capped sub-loop, background delivery, request-state plumbing (depth/origin/persona).
core/config.py — SubagentsConfig; SchedulerJob.persona.
core/job_store.py — subagent type + persona column (additive migration).
core/scheduler.py — run_subagent_task handler + routing.
core/permissions.py — spawn_subagent is an ASK write-action.
channels/telegram.py — /jobs command + cancel callback.
api/admin.py + templates — subagent runs view, cancel route, subagent job type in the Jobs form, gateable per persona.
Docs: subagents.mdx (+ nav), scheduler table row, README feature, config.yml.example.

Tests

tests/test_subagents.py (20 new; 470 total green): scope narrowing, registry lifecycle/trim, run_subagent (sync, disabled, depth cap, unknown persona, step-budget stop, background→origin delivery, concurrency cap), persona narrowing, JobStore persona persistence + column migration, scheduled-run delivery.

Notes / deliberate simplifications

Subagents use channel="system", so writes inside a run are auto-approved — the trust boundary is the single ASK on spawn_subagent (same model as scheduled jobs). Scope is narrowed and NEVER rules still block.
{ summary } is a truncated preview of the result, not a separate LLM call.
The runs registry is in-memory (ephemeral); background results post back on completion (start/result messages rather than an in-place edited "working…" message).

Closes #15

Add the subagent core (issue #15): a scoped sub-loop the agent delegates to, reached by one primitive (AgentCore.run_subagent) from two paths — the spawn_subagent tool and scheduled 'subagent' jobs. - core/subagents.py: SubagentRun + in-memory SubagentRegistry (list/status/ cancel) and narrow_scope() for inherit-never-widen tool/skill/secret scope. - agent.py: spawn_subagent tool, run_subagent primitive, budgeted/depth-capped sub-loop (system semantics: no decomposition/memory/reflection/prompts), background runs that post results back to the originating chat. - config: SubagentsConfig (enabled, recursion_depth, max_steps, token_budget, max_concurrent). - job_store: 'subagent' type + persona column (additive migration). - scheduler: run_subagent_task handler + routing for scheduled subagent jobs. - permissions: spawn_subagent is an ASK write-action. - admin: spawn_subagent is persona-gateable.

…job form - Telegram: /jobs lists active subagent runs with inline Cancel buttons; the command is registered ahead of the text handler so it isn't sent to the agent. - Admin: Jobs tab gains a responsive subagent-runs card grid (status, persona, elapsed, progress, cancel) polled every 3s; /partials/subagent-runs + /subagents/cancel routes. - Admin Jobs form: 'subagent' job type + persona field, validated and persisted.

…jobs 20 tests: narrow_scope inherit-never-widen, SubagentRegistry lifecycle/trim, run_subagent (sync result, disabled, depth cap, unknown persona, step-budget stop, background→origin delivery, concurrency cap), persona narrowing, JobStore persona persistence + column migration, and run_subagent_task delivery.

… table - docs/content/docs/subagents.mdx + nav entry; scheduler.mdx gains the 'subagent' job type row. - README: Subagents feature bullet. - config.yml.example: subagents{} block + commented scheduled subagent job.

- Idempotent SubagentRegistry.finish(): terminal states are sticky and finish() returns whether it transitioned, so a last-moment normal completion can't un-cancel a run or double-deliver its result. - spawn_subagent exempt from the same-turn write dedup guard (like manage_jobs): each spawn is a distinct run, so fan-out of an identical task is allowed. - Admin 'Run now' handles 'subagent' jobs (was: 'Unknown job type'); persona is only persisted for subagent jobs. - Scheduler warns instead of silently dropping a subagent result with no owner. - subagent_runs.html: 'done' shows a green badge; Jobs form has a subagent task hint. - Drop unused SubagentRun.parent_id / to_dict() (YAGNI). - Tests: token-budget stop, dedup exemption, finish() idempotence (473 total).

Mirrors the Web artifacts card: the admin Tools tab gets a 'Subagents' card — enable toggle + recursion depth / max steps / token budget / max concurrent, saved via PATCH /config and applied live (no restart). Disabling removes the capability everywhere, not just at call time: - apply_feature_gates already drops spawn_subagent from the advertised tools. - gateable_tools_for() now hides the spawn_subagent checkbox in the Personae editor when subagents are off (persisted persona scope is preserved). Tests mirror the artifacts gating tests; docs name the Tools tab as the home.

A background subagent's result is delivered to the chat out-of-band (a direct ch.send), so it never entered the spawning agent's context — the agent couldn't tell a finished run from a pending one and would confidently claim runs were 'still running' (it even reached for manage_jobs, which only knows scheduled jobs, not the in-memory subagent registry). Fix: each turn, inject a <background_subagents> block into the user-message preamble listing this chat's runs — running ones every turn, a finished one once with its result summary (SubagentRegistry.updates_for, gated to the chat's channel+chat_id). No new tool, no history-alternation hazard; the preamble is sent to the model but not persisted. The background-spawn return note now tells the agent results auto-post and it needn't relay them. Tests: updates_for chat-scoping + report-finish-once; _subagent_status_note.

…st out-of-band A background subagent's result was only ch.send to the chat — the spawning agent never saw it, so it couldn't reason about or recall it. Now the result is ALSO recorded into the originating chat's history as an assistant turn, so it becomes a first-class part of the conversation the agent reads on every later turn (and the agent's memory matches what the user saw). - history: append_to_last_turn / append_to_last_session_message merge the result into the trailing assistant turn, preserving strict user/assistant alternation for providers that require it (both injection and session modes); a fresh assistant turn is added only when the last turn isn't the assistant's. - _deliver_subagent_result now delivers AND records. - The turn preamble now lists only *running* runs (status awareness while pending); finished runs live in history instead of an ephemeral once-only note. Tests cover the merge (alternation kept), history persistence, and running-only preamble. 481 passing.

The spawning agent now sizes each subagent run to the job and gets its files back: - spawn_subagent gains max_steps, token_budget, and thinking_effort. max_steps/token_budget default to the configured value and are clamped to it as a ceiling (resolve_cap) — the agent may dial a run *down* but never past the guardrail; token_budget has a 1000 floor. thinking_effort (off|low|medium|high) maps to a thinking level, omit to inherit the caller's (normalize_effort); an effort-scoped _background_llm clone runs the loop. - Subagents share the agent's cwd/filesystem, so any file they write is already on disk where the parent can reach it. FILE_HANDOFF_INSTRUCTION makes the subagent report absolute paths in its result, which the existing result-folding carries into the parent's history. - Persona stays inherited by default (run as the caller itself); made explicit in the tool schema and docs. resolve_cap degrades non-numeric / infinite input to the ceiling rather than raising. Tests cover clamping, effort mapping/scoping, the file- handoff system suffix, and the caller-as-limiter path.

…ormed Selection stays user-led — omitting `persona` runs the subagent as the caller itself (the chat's bound persona). For the specialist case the agent no longer guesses a name blindly: - _personae_roster_block injects a compact `name — role` roster of available personae into the main turn preamble, gated to when spawn_subagent is actually in scope (subagents enabled + the persona's tool allowlist permits it). Off for subagent sub-loops (offer_personae defaults False), so it never leaks where it can't be used. - The "Persona not found" error now lists the valid names, so a wrong guess self-corrects. Tests cover the name/role rendering (first role line only), the current- persona "(you)" tag, the disabled / out-of-scope gates, main-turn-only injection, and the name-listing error.

mattmezza · 2026-06-28T22:19:59Z

Rebased onto latest main (per-turn skills, #46/#48). Three additions since the last review:

Caller-sized runs — spawn_subagent gains max_steps, token_budget, and thinking_effort. The two numeric caps default to the configured value and are clamped to it as a ceiling (the agent can dial a run down but never past the guardrail; token_budget floored at 1000); thinking_effort (off|low|medium|high) maps to a thinking level, omit to inherit the caller's.
File handoff — subagents share the agent's cwd/filesystem, so a subagent now reports the absolute paths of any files it writes in its result, which the existing result-folding carries into the parent's history.
Persona roster — the main turn shows the agent a compact name — role roster so delegating to a specialist is informed, not a guess; selection stays user-led (omit persona = run as yourself). Gated to when spawn_subagent is in scope. The "Persona not found" error now lists valid names.

512 tests pass, ruff clean. Each change was adversarially reviewed.

… to the user Real-world feedback exposed three problems with background subagents: they picked a random persona, dumped long raw output straight to the user, and the agent never processed the results into a real answer. Fixes: - Synthesis flow: a background subagent now works for the AGENT, not the user. Its raw result is no longer sent to the chat. When the whole batch of a chat's background runs has finished, the agent runs ONE synthesis turn (process(decompose=False)) that ingests the findings and replies in its own voice — the user sees only that. A barrier collapses parallel spawns into a single reply. Replaces the old _deliver_subagent_result / direct-to-user path. - Lost-reply fix: cancellation is the one terminal path that never re-checked the barrier, so a done sibling that deferred to a still-running run got orphaned when the user cancelled that run. The cancel path now releases the deferred batch. Regression test added. - Persona default: spawn_subagent + the roster now state firmly that omitting 'persona' (run as yourself) is the default; a persona is named only on an explicit request. Stops the agent assigning an unrelated specialist. - Conciseness: subagents are told their result is read by the agent, not a human — return dense facts, no prose/tables. Adds SubagentRun.synthesized; process() gains decompose. Docs updated.

The synthesis turn runs the full agent loop, which still offered spawn_subagent — so a misbehaving model could chain new background spawns during synthesis, each triggering another synthesis turn (unbounded, behind only a prompt instruction). process() gains allow_subagents (default True); the synthesis turn passes False, which structurally drops spawn_subagent from its tools and omits the persona roster. Test asserts the tool is withheld.

Replace the per-result raw delivery + ad-hoc synthesis turn with a dedicated summary inference, and stop polluting the user chat and the agent context with raw subagent output. - New subagent_summary inference (enabled/provider/model/thinking_level, mirroring memory/compaction/reflection). When a chat's background batch finishes, it distils the results into a one-line chat *notification* and a concise *digest* kept in the agent's context for follow-ups. Disabled or on failure → crude first-line truncation (short_summary). Default model is the fast/cheap deepseek-v4-flash. - Drops the full process()-based synthesis turn (and the now-dead decompose / allow_subagents params); the notification is what the user sees, the raw result stays only in the ephemeral run registry. - Admin: Result-summary controls in the Tools-tab Subagents card. - Docs/README/config example updated. Also switch the text background inferences (memory extraction/consolidation, goal decomposition, task reflection, compaction) from claude-haiku-4-5 to deepseek-v4-flash by default — better and cheaper — including the admin-UI fallbacks and llm.html placeholders. Vision stays on a multimodal model.

mattmezza · 2026-06-29T09:17:57Z

Pushed three more commits acting on real-world feedback:

Hide raw from chat + synthesise → then refined: a finished background batch is now distilled by a dedicated subagent_summary inference into a one-line chat notification + a concise digest kept in the agent context. Raw subagent output reaches neither the user nor the context. Disabled/error → first-line truncation fallback. A barrier collapses parallel spawns into one delivery; cancelling a sibling no longer orphans a deferred reply.
Persona default hardened (omit = run as yourself; only name a specialist on explicit request).
Inference defaults (memory extraction/consolidation, goal decomposition, task reflection, compaction) switched from claude-haiku-4-5 to deepseek-v4-flash — better/cheaper and consistent with the main provider. Vision left on a multimodal model.
Admin UI (Result-summary controls in the Subagents card), docs, README, and config example updated.

515 tests pass, ruff clean.

Rebased onto main after #44 landed. apply_feature_gates now also gates the skill-discovery tools (skills_on_demand, default False alongside subagents_enabled), and the subagent loop builds its tool set through the shared _tools_for_turn so a subagent in on-demand mode gets search_skills/list_skills to match the pointer its preamble already carries. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mattmezza force-pushed the feat/subagents branch 2 times, most recently from eec78b6 to c8ab029 Compare June 28, 2026 21:36

mattmezza added 10 commits June 29, 2026 00:05

mattmezza force-pushed the feat/subagents branch from c8ab029 to c791fcd Compare June 28, 2026 22:19

mattmezza added 3 commits June 29, 2026 08:53

mattmezza merged commit 4257fb1 into main Jun 29, 2026
1 check passed

mattmezza deleted the feat/subagents branch June 29, 2026 11:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: subagents — on-demand + scheduled spawning with lifecycle management (#15)#44

feat: subagents — on-demand + scheduled spawning with lifecycle management (#15)#44
mattmezza merged 13 commits into
mainfrom
feat/subagents

mattmezza commented Jun 27, 2026

Uh oh!

mattmezza commented Jun 28, 2026

Uh oh!

mattmezza commented Jun 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mattmezza commented Jun 27, 2026

Summary

Execution model

Guardrails (config subagents)

Lifecycle (list / status / cancel)

Acceptance criteria

Changes

Tests

Notes / deliberate simplifications

Uh oh!

mattmezza commented Jun 28, 2026

Uh oh!

mattmezza commented Jun 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Guardrails (config `subagents`)