Skip to content

feat: subagents β€” on-demand + scheduled spawning with lifecycle management (#15)#44

Merged
mattmezza merged 13 commits into
mainfrom
feat/subagents
Jun 29, 2026
Merged

feat: subagents β€” on-demand + scheduled spawning with lifecycle management (#15)#44
mattmezza merged 13 commits into
mainfrom
feat/subagents

Conversation

@mattmezza

Copy link
Copy Markdown
Owner

Summary

Implements issue #15 β€” Subagents: on-demand spawning, scheduling, and lifecycle management.

One execution primitive (AgentCore.run_subagent) reached by two trigger paths:

  • On demand β€” the new spawn_subagent tool, called by the agent mid-turn.
  • Scheduled β€” a subagent job fired by the scheduler.

A subagent runs the existing agent loop with system semantics (no goal
decomposition, memory extraction, reflection, or per-action approval prompts) β€”
the same path scheduled jobs already use β€” under a chosen persona, and returns a
structured result.

Execution model

  • Sync (default): the tool blocks, runs the sub-loop to completion, returns { summary, result }; the parent continues in the same turn.
  • Background (background: true): returns a run id immediately; the run executes off-turn and posts its result back to the originating chat.

Guardrails (config subagents)

  • Inherit-never-widen scope: a child's tools/skills/secrets are the intersection of the caller's scope and the requested persona's (narrow_scope). NEVER permission rules still apply at any depth.
  • Recursion depth cap (default 3).
  • Per-run step + token budgets (default 12 steps / 100k tokens) β€” a hard stop on the otherwise-unbounded loop.
  • Max concurrent background runs (default 3).

Lifecycle (list / status / cancel)

  • Telegram: /jobs lists active runs with inline Cancel buttons β€” monitor/cancel a long task from the phone, no web UI needed.
  • Admin UI: the Jobs tab gains a responsive subagent-runs card grid (persona, status, elapsed, current step, cancel), polled live; collapses to one column at phone width.
  • Runs are tracked in an in-memory SubagentRegistry (ephemeral by design β€” a restart clears them).

Acceptance criteria

  • The agent can spawn a subagent with a chosen persona and receive a structured result.
  • Background spawns report back asynchronously to the correct chat.
  • A background subagent can be monitored and cancelled from Telegram on mobile; the jobs view is usable at phone width.
  • Recursion depth and budgets are enforced; scope cannot be widened by a child.

Changes

  • core/subagents.py β€” SubagentRun, SubagentRegistry, narrow_scope.
  • core/agent.py β€” spawn_subagent tool, run_subagent primitive, budgeted/depth-capped sub-loop, background delivery, request-state plumbing (depth/origin/persona).
  • core/config.py β€” SubagentsConfig; SchedulerJob.persona.
  • core/job_store.py β€” subagent type + persona column (additive migration).
  • core/scheduler.py β€” run_subagent_task handler + routing.
  • core/permissions.py β€” spawn_subagent is an ASK write-action.
  • channels/telegram.py β€” /jobs command + cancel callback.
  • api/admin.py + templates β€” subagent runs view, cancel route, subagent job type in the Jobs form, gateable per persona.
  • Docs: subagents.mdx (+ nav), scheduler table row, README feature, config.yml.example.

Tests

tests/test_subagents.py (20 new; 470 total green): scope narrowing, registry lifecycle/trim, run_subagent (sync, disabled, depth cap, unknown persona, step-budget stop, background→origin delivery, concurrency cap), persona narrowing, JobStore persona persistence + column migration, scheduled-run delivery.

Notes / deliberate simplifications

  • Subagents use channel="system", so writes inside a run are auto-approved β€” the trust boundary is the single ASK on spawn_subagent (same model as scheduled jobs). Scope is narrowed and NEVER rules still block.
  • { summary } is a truncated preview of the result, not a separate LLM call.
  • The runs registry is in-memory (ephemeral); background results post back on completion (start/result messages rather than an in-place edited "working…" message).

Closes #15

@mattmezza mattmezza force-pushed the feat/subagents branch 2 times, most recently from eec78b6 to c8ab029 Compare June 28, 2026 21:36
mattmezza added 10 commits June 29, 2026 00:05
Add the subagent core (issue #15): a scoped sub-loop the agent delegates to,
reached by one primitive (AgentCore.run_subagent) from two paths β€” the
spawn_subagent tool and scheduled 'subagent' jobs.

- core/subagents.py: SubagentRun + in-memory SubagentRegistry (list/status/
  cancel) and narrow_scope() for inherit-never-widen tool/skill/secret scope.
- agent.py: spawn_subagent tool, run_subagent primitive, budgeted/depth-capped
  sub-loop (system semantics: no decomposition/memory/reflection/prompts),
  background runs that post results back to the originating chat.
- config: SubagentsConfig (enabled, recursion_depth, max_steps, token_budget,
  max_concurrent).
- job_store: 'subagent' type + persona column (additive migration).
- scheduler: run_subagent_task handler + routing for scheduled subagent jobs.
- permissions: spawn_subagent is an ASK write-action.
- admin: spawn_subagent is persona-gateable.
…job form

- Telegram: /jobs lists active subagent runs with inline Cancel buttons; the
  command is registered ahead of the text handler so it isn't sent to the agent.
- Admin: Jobs tab gains a responsive subagent-runs card grid (status, persona,
  elapsed, progress, cancel) polled every 3s; /partials/subagent-runs +
  /subagents/cancel routes.
- Admin Jobs form: 'subagent' job type + persona field, validated and persisted.
…jobs

20 tests: narrow_scope inherit-never-widen, SubagentRegistry lifecycle/trim,
run_subagent (sync result, disabled, depth cap, unknown persona, step-budget
stop, background→origin delivery, concurrency cap), persona narrowing, JobStore
persona persistence + column migration, and run_subagent_task delivery.
… table

- docs/content/docs/subagents.mdx + nav entry; scheduler.mdx gains the
  'subagent' job type row.
- README: Subagents feature bullet.
- config.yml.example: subagents{} block + commented scheduled subagent job.
- Idempotent SubagentRegistry.finish(): terminal states are sticky and finish()
  returns whether it transitioned, so a last-moment normal completion can't
  un-cancel a run or double-deliver its result.
- spawn_subagent exempt from the same-turn write dedup guard (like manage_jobs):
  each spawn is a distinct run, so fan-out of an identical task is allowed.
- Admin 'Run now' handles 'subagent' jobs (was: 'Unknown job type'); persona is
  only persisted for subagent jobs.
- Scheduler warns instead of silently dropping a subagent result with no owner.
- subagent_runs.html: 'done' shows a green badge; Jobs form has a subagent task
  hint.
- Drop unused SubagentRun.parent_id / to_dict() (YAGNI).
- Tests: token-budget stop, dedup exemption, finish() idempotence (473 total).
Mirrors the Web artifacts card: the admin Tools tab gets a 'Subagents' card β€”
enable toggle + recursion depth / max steps / token budget / max concurrent,
saved via PATCH /config and applied live (no restart).

Disabling removes the capability everywhere, not just at call time:
- apply_feature_gates already drops spawn_subagent from the advertised tools.
- gateable_tools_for() now hides the spawn_subagent checkbox in the Personae
  editor when subagents are off (persisted persona scope is preserved).
Tests mirror the artifacts gating tests; docs name the Tools tab as the home.
A background subagent's result is delivered to the chat out-of-band (a direct
ch.send), so it never entered the spawning agent's context β€” the agent couldn't
tell a finished run from a pending one and would confidently claim runs were
'still running' (it even reached for manage_jobs, which only knows scheduled
jobs, not the in-memory subagent registry).

Fix: each turn, inject a <background_subagents> block into the user-message
preamble listing this chat's runs β€” running ones every turn, a finished one
once with its result summary (SubagentRegistry.updates_for, gated to the
chat's channel+chat_id). No new tool, no history-alternation hazard; the
preamble is sent to the model but not persisted. The background-spawn return
note now tells the agent results auto-post and it needn't relay them.

Tests: updates_for chat-scoping + report-finish-once; _subagent_status_note.
…st out-of-band

A background subagent's result was only ch.send to the chat β€” the spawning
agent never saw it, so it couldn't reason about or recall it. Now the result is
ALSO recorded into the originating chat's history as an assistant turn, so it
becomes a first-class part of the conversation the agent reads on every later
turn (and the agent's memory matches what the user saw).

- history: append_to_last_turn / append_to_last_session_message merge the
  result into the trailing assistant turn, preserving strict user/assistant
  alternation for providers that require it (both injection and session modes);
  a fresh assistant turn is added only when the last turn isn't the assistant's.
- _deliver_subagent_result now delivers AND records.
- The turn preamble now lists only *running* runs (status awareness while
  pending); finished runs live in history instead of an ephemeral once-only note.
Tests cover the merge (alternation kept), history persistence, and running-only
preamble. 481 passing.
The spawning agent now sizes each subagent run to the job and gets its
files back:

- spawn_subagent gains max_steps, token_budget, and thinking_effort.
  max_steps/token_budget default to the configured value and are clamped
  to it as a ceiling (resolve_cap) β€” the agent may dial a run *down* but
  never past the guardrail; token_budget has a 1000 floor. thinking_effort
  (off|low|medium|high) maps to a thinking level, omit to inherit the
  caller's (normalize_effort); an effort-scoped _background_llm clone runs
  the loop.
- Subagents share the agent's cwd/filesystem, so any file they write is
  already on disk where the parent can reach it. FILE_HANDOFF_INSTRUCTION
  makes the subagent report absolute paths in its result, which the
  existing result-folding carries into the parent's history.
- Persona stays inherited by default (run as the caller itself); made
  explicit in the tool schema and docs.

resolve_cap degrades non-numeric / infinite input to the ceiling rather
than raising. Tests cover clamping, effort mapping/scoping, the file-
handoff system suffix, and the caller-as-limiter path.
…ormed

Selection stays user-led β€” omitting `persona` runs the subagent as the
caller itself (the chat's bound persona). For the specialist case the
agent no longer guesses a name blindly:

- _personae_roster_block injects a compact `name β€” role` roster of
  available personae into the main turn preamble, gated to when
  spawn_subagent is actually in scope (subagents enabled + the persona's
  tool allowlist permits it). Off for subagent sub-loops (offer_personae
  defaults False), so it never leaks where it can't be used.
- The "Persona not found" error now lists the valid names, so a wrong
  guess self-corrects.

Tests cover the name/role rendering (first role line only), the current-
persona "(you)" tag, the disabled / out-of-scope gates, main-turn-only
injection, and the name-listing error.
@mattmezza

Copy link
Copy Markdown
Owner Author

Rebased onto latest main (per-turn skills, #46/#48). Three additions since the last review:

  • Caller-sized runs β€” spawn_subagent gains max_steps, token_budget, and thinking_effort. The two numeric caps default to the configured value and are clamped to it as a ceiling (the agent can dial a run down but never past the guardrail; token_budget floored at 1000); thinking_effort (off|low|medium|high) maps to a thinking level, omit to inherit the caller's.
  • File handoff β€” subagents share the agent's cwd/filesystem, so a subagent now reports the absolute paths of any files it writes in its result, which the existing result-folding carries into the parent's history.
  • Persona roster β€” the main turn shows the agent a compact name β€” role roster so delegating to a specialist is informed, not a guess; selection stays user-led (omit persona = run as yourself). Gated to when spawn_subagent is in scope. The "Persona not found" error now lists valid names.

512 tests pass, ruff clean. Each change was adversarially reviewed.

… to the user

Real-world feedback exposed three problems with background subagents: they
picked a random persona, dumped long raw output straight to the user, and the
agent never processed the results into a real answer. Fixes:

- Synthesis flow: a background subagent now works for the AGENT, not the user.
  Its raw result is no longer sent to the chat. When the whole batch of a
  chat's background runs has finished, the agent runs ONE synthesis turn
  (process(decompose=False)) that ingests the findings and replies in its own
  voice β€” the user sees only that. A barrier collapses parallel spawns into a
  single reply. Replaces the old _deliver_subagent_result / direct-to-user path.

- Lost-reply fix: cancellation is the one terminal path that never re-checked
  the barrier, so a done sibling that deferred to a still-running run got
  orphaned when the user cancelled that run. The cancel path now releases the
  deferred batch. Regression test added.

- Persona default: spawn_subagent + the roster now state firmly that omitting
  'persona' (run as yourself) is the default; a persona is named only on an
  explicit request. Stops the agent assigning an unrelated specialist.

- Conciseness: subagents are told their result is read by the agent, not a
  human β€” return dense facts, no prose/tables.

Adds SubagentRun.synthesized; process() gains decompose. Docs updated.
The synthesis turn runs the full agent loop, which still offered
spawn_subagent β€” so a misbehaving model could chain new background spawns
during synthesis, each triggering another synthesis turn (unbounded, behind
only a prompt instruction). process() gains allow_subagents (default True);
the synthesis turn passes False, which structurally drops spawn_subagent from
its tools and omits the persona roster. Test asserts the tool is withheld.
Replace the per-result raw delivery + ad-hoc synthesis turn with a dedicated
summary inference, and stop polluting the user chat and the agent context with
raw subagent output.

- New subagent_summary inference (enabled/provider/model/thinking_level,
  mirroring memory/compaction/reflection). When a chat's background batch
  finishes, it distils the results into a one-line chat *notification* and a
  concise *digest* kept in the agent's context for follow-ups. Disabled or on
  failure β†’ crude first-line truncation (short_summary). Default model is the
  fast/cheap deepseek-v4-flash.
- Drops the full process()-based synthesis turn (and the now-dead decompose /
  allow_subagents params); the notification is what the user sees, the raw
  result stays only in the ephemeral run registry.
- Admin: Result-summary controls in the Tools-tab Subagents card.
- Docs/README/config example updated.

Also switch the text background inferences (memory extraction/consolidation,
goal decomposition, task reflection, compaction) from claude-haiku-4-5 to
deepseek-v4-flash by default β€” better and cheaper β€” including the admin-UI
fallbacks and llm.html placeholders. Vision stays on a multimodal model.
@mattmezza

Copy link
Copy Markdown
Owner Author

Pushed three more commits acting on real-world feedback:

  • Hide raw from chat + synthesise β†’ then refined: a finished background batch is now distilled by a dedicated subagent_summary inference into a one-line chat notification + a concise digest kept in the agent context. Raw subagent output reaches neither the user nor the context. Disabled/error β†’ first-line truncation fallback. A barrier collapses parallel spawns into one delivery; cancelling a sibling no longer orphans a deferred reply.
  • Persona default hardened (omit = run as yourself; only name a specialist on explicit request).
  • Inference defaults (memory extraction/consolidation, goal decomposition, task reflection, compaction) switched from claude-haiku-4-5 to deepseek-v4-flash β€” better/cheaper and consistent with the main provider. Vision left on a multimodal model.
  • Admin UI (Result-summary controls in the Subagents card), docs, README, and config example updated.

515 tests pass, ruff clean.

@mattmezza mattmezza merged commit 4257fb1 into main Jun 29, 2026
1 check passed
@mattmezza mattmezza deleted the feat/subagents branch June 29, 2026 11:56
mattmezza added a commit that referenced this pull request Jun 29, 2026
Rebased onto main after #44 landed. apply_feature_gates now also gates
the skill-discovery tools (skills_on_demand, default False alongside
subagents_enabled), and the subagent loop builds its tool set through
the shared _tools_for_turn so a subagent in on-demand mode gets
search_skills/list_skills to match the pointer its preamble already
carries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subagents: on-demand spawning, scheduling, and lifecycle management

1 participant