feat: subagents β on-demand + scheduled spawning with lifecycle management (#15)#44
Merged
Conversation
eec78b6 to
c8ab029
Compare
Add the subagent core (issue #15): a scoped sub-loop the agent delegates to, reached by one primitive (AgentCore.run_subagent) from two paths β the spawn_subagent tool and scheduled 'subagent' jobs. - core/subagents.py: SubagentRun + in-memory SubagentRegistry (list/status/ cancel) and narrow_scope() for inherit-never-widen tool/skill/secret scope. - agent.py: spawn_subagent tool, run_subagent primitive, budgeted/depth-capped sub-loop (system semantics: no decomposition/memory/reflection/prompts), background runs that post results back to the originating chat. - config: SubagentsConfig (enabled, recursion_depth, max_steps, token_budget, max_concurrent). - job_store: 'subagent' type + persona column (additive migration). - scheduler: run_subagent_task handler + routing for scheduled subagent jobs. - permissions: spawn_subagent is an ASK write-action. - admin: spawn_subagent is persona-gateable.
β¦job form - Telegram: /jobs lists active subagent runs with inline Cancel buttons; the command is registered ahead of the text handler so it isn't sent to the agent. - Admin: Jobs tab gains a responsive subagent-runs card grid (status, persona, elapsed, progress, cancel) polled every 3s; /partials/subagent-runs + /subagents/cancel routes. - Admin Jobs form: 'subagent' job type + persona field, validated and persisted.
β¦jobs 20 tests: narrow_scope inherit-never-widen, SubagentRegistry lifecycle/trim, run_subagent (sync result, disabled, depth cap, unknown persona, step-budget stop, backgroundβorigin delivery, concurrency cap), persona narrowing, JobStore persona persistence + column migration, and run_subagent_task delivery.
β¦ table
- docs/content/docs/subagents.mdx + nav entry; scheduler.mdx gains the
'subagent' job type row.
- README: Subagents feature bullet.
- config.yml.example: subagents{} block + commented scheduled subagent job.
- Idempotent SubagentRegistry.finish(): terminal states are sticky and finish() returns whether it transitioned, so a last-moment normal completion can't un-cancel a run or double-deliver its result. - spawn_subagent exempt from the same-turn write dedup guard (like manage_jobs): each spawn is a distinct run, so fan-out of an identical task is allowed. - Admin 'Run now' handles 'subagent' jobs (was: 'Unknown job type'); persona is only persisted for subagent jobs. - Scheduler warns instead of silently dropping a subagent result with no owner. - subagent_runs.html: 'done' shows a green badge; Jobs form has a subagent task hint. - Drop unused SubagentRun.parent_id / to_dict() (YAGNI). - Tests: token-budget stop, dedup exemption, finish() idempotence (473 total).
Mirrors the Web artifacts card: the admin Tools tab gets a 'Subagents' card β enable toggle + recursion depth / max steps / token budget / max concurrent, saved via PATCH /config and applied live (no restart). Disabling removes the capability everywhere, not just at call time: - apply_feature_gates already drops spawn_subagent from the advertised tools. - gateable_tools_for() now hides the spawn_subagent checkbox in the Personae editor when subagents are off (persisted persona scope is preserved). Tests mirror the artifacts gating tests; docs name the Tools tab as the home.
A background subagent's result is delivered to the chat out-of-band (a direct ch.send), so it never entered the spawning agent's context β the agent couldn't tell a finished run from a pending one and would confidently claim runs were 'still running' (it even reached for manage_jobs, which only knows scheduled jobs, not the in-memory subagent registry). Fix: each turn, inject a <background_subagents> block into the user-message preamble listing this chat's runs β running ones every turn, a finished one once with its result summary (SubagentRegistry.updates_for, gated to the chat's channel+chat_id). No new tool, no history-alternation hazard; the preamble is sent to the model but not persisted. The background-spawn return note now tells the agent results auto-post and it needn't relay them. Tests: updates_for chat-scoping + report-finish-once; _subagent_status_note.
β¦st out-of-band A background subagent's result was only ch.send to the chat β the spawning agent never saw it, so it couldn't reason about or recall it. Now the result is ALSO recorded into the originating chat's history as an assistant turn, so it becomes a first-class part of the conversation the agent reads on every later turn (and the agent's memory matches what the user saw). - history: append_to_last_turn / append_to_last_session_message merge the result into the trailing assistant turn, preserving strict user/assistant alternation for providers that require it (both injection and session modes); a fresh assistant turn is added only when the last turn isn't the assistant's. - _deliver_subagent_result now delivers AND records. - The turn preamble now lists only *running* runs (status awareness while pending); finished runs live in history instead of an ephemeral once-only note. Tests cover the merge (alternation kept), history persistence, and running-only preamble. 481 passing.
The spawning agent now sizes each subagent run to the job and gets its files back: - spawn_subagent gains max_steps, token_budget, and thinking_effort. max_steps/token_budget default to the configured value and are clamped to it as a ceiling (resolve_cap) β the agent may dial a run *down* but never past the guardrail; token_budget has a 1000 floor. thinking_effort (off|low|medium|high) maps to a thinking level, omit to inherit the caller's (normalize_effort); an effort-scoped _background_llm clone runs the loop. - Subagents share the agent's cwd/filesystem, so any file they write is already on disk where the parent can reach it. FILE_HANDOFF_INSTRUCTION makes the subagent report absolute paths in its result, which the existing result-folding carries into the parent's history. - Persona stays inherited by default (run as the caller itself); made explicit in the tool schema and docs. resolve_cap degrades non-numeric / infinite input to the ceiling rather than raising. Tests cover clamping, effort mapping/scoping, the file- handoff system suffix, and the caller-as-limiter path.
β¦ormed Selection stays user-led β omitting `persona` runs the subagent as the caller itself (the chat's bound persona). For the specialist case the agent no longer guesses a name blindly: - _personae_roster_block injects a compact `name β role` roster of available personae into the main turn preamble, gated to when spawn_subagent is actually in scope (subagents enabled + the persona's tool allowlist permits it). Off for subagent sub-loops (offer_personae defaults False), so it never leaks where it can't be used. - The "Persona not found" error now lists the valid names, so a wrong guess self-corrects. Tests cover the name/role rendering (first role line only), the current- persona "(you)" tag, the disabled / out-of-scope gates, main-turn-only injection, and the name-listing error.
c8ab029 to
c791fcd
Compare
Owner
Author
|
Rebased onto latest
512 tests pass, ruff clean. Each change was adversarially reviewed. |
β¦ to the user Real-world feedback exposed three problems with background subagents: they picked a random persona, dumped long raw output straight to the user, and the agent never processed the results into a real answer. Fixes: - Synthesis flow: a background subagent now works for the AGENT, not the user. Its raw result is no longer sent to the chat. When the whole batch of a chat's background runs has finished, the agent runs ONE synthesis turn (process(decompose=False)) that ingests the findings and replies in its own voice β the user sees only that. A barrier collapses parallel spawns into a single reply. Replaces the old _deliver_subagent_result / direct-to-user path. - Lost-reply fix: cancellation is the one terminal path that never re-checked the barrier, so a done sibling that deferred to a still-running run got orphaned when the user cancelled that run. The cancel path now releases the deferred batch. Regression test added. - Persona default: spawn_subagent + the roster now state firmly that omitting 'persona' (run as yourself) is the default; a persona is named only on an explicit request. Stops the agent assigning an unrelated specialist. - Conciseness: subagents are told their result is read by the agent, not a human β return dense facts, no prose/tables. Adds SubagentRun.synthesized; process() gains decompose. Docs updated.
The synthesis turn runs the full agent loop, which still offered spawn_subagent β so a misbehaving model could chain new background spawns during synthesis, each triggering another synthesis turn (unbounded, behind only a prompt instruction). process() gains allow_subagents (default True); the synthesis turn passes False, which structurally drops spawn_subagent from its tools and omits the persona roster. Test asserts the tool is withheld.
Replace the per-result raw delivery + ad-hoc synthesis turn with a dedicated summary inference, and stop polluting the user chat and the agent context with raw subagent output. - New subagent_summary inference (enabled/provider/model/thinking_level, mirroring memory/compaction/reflection). When a chat's background batch finishes, it distils the results into a one-line chat *notification* and a concise *digest* kept in the agent's context for follow-ups. Disabled or on failure β crude first-line truncation (short_summary). Default model is the fast/cheap deepseek-v4-flash. - Drops the full process()-based synthesis turn (and the now-dead decompose / allow_subagents params); the notification is what the user sees, the raw result stays only in the ephemeral run registry. - Admin: Result-summary controls in the Tools-tab Subagents card. - Docs/README/config example updated. Also switch the text background inferences (memory extraction/consolidation, goal decomposition, task reflection, compaction) from claude-haiku-4-5 to deepseek-v4-flash by default β better and cheaper β including the admin-UI fallbacks and llm.html placeholders. Vision stays on a multimodal model.
Owner
Author
|
Pushed three more commits acting on real-world feedback:
515 tests pass, ruff clean. |
mattmezza
added a commit
that referenced
this pull request
Jun 29, 2026
Rebased onto main after #44 landed. apply_feature_gates now also gates the skill-discovery tools (skills_on_demand, default False alongside subagents_enabled), and the subagent loop builds its tool set through the shared _tools_for_turn so a subagent in on-demand mode gets search_skills/list_skills to match the pointer its preamble already carries. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements issue #15 β Subagents: on-demand spawning, scheduling, and lifecycle management.
One execution primitive (
AgentCore.run_subagent) reached by two trigger paths:spawn_subagenttool, called by the agent mid-turn.subagentjob fired by the scheduler.A subagent runs the existing agent loop with system semantics (no goal
decomposition, memory extraction, reflection, or per-action approval prompts) β
the same path scheduled jobs already use β under a chosen persona, and returns a
structured result.
Execution model
{ summary, result }; the parent continues in the same turn.background: true): returns a run id immediately; the run executes off-turn and posts its result back to the originating chat.Guardrails (config
subagents)narrow_scope).NEVERpermission rules still apply at any depth.Lifecycle (list / status / cancel)
/jobslists active runs with inline Cancel buttons β monitor/cancel a long task from the phone, no web UI needed.SubagentRegistry(ephemeral by design β a restart clears them).Acceptance criteria
Changes
core/subagents.pyβSubagentRun,SubagentRegistry,narrow_scope.core/agent.pyβspawn_subagenttool,run_subagentprimitive, budgeted/depth-capped sub-loop, background delivery, request-state plumbing (depth/origin/persona).core/config.pyβSubagentsConfig;SchedulerJob.persona.core/job_store.pyβsubagenttype +personacolumn (additive migration).core/scheduler.pyβrun_subagent_taskhandler + routing.core/permissions.pyβspawn_subagentis an ASK write-action.channels/telegram.pyβ/jobscommand + cancel callback.api/admin.py+ templates β subagent runs view, cancel route,subagentjob type in the Jobs form, gateable per persona.subagents.mdx(+ nav), scheduler table row, README feature,config.yml.example.Tests
tests/test_subagents.py(20 new; 470 total green): scope narrowing, registry lifecycle/trim,run_subagent(sync, disabled, depth cap, unknown persona, step-budget stop, backgroundβorigin delivery, concurrency cap), persona narrowing, JobStore persona persistence + column migration, scheduled-run delivery.Notes / deliberate simplifications
channel="system", so writes inside a run are auto-approved β the trust boundary is the single ASK onspawn_subagent(same model as scheduled jobs). Scope is narrowed andNEVERrules still block.{ summary }is a truncated preview of the result, not a separate LLM call.Closes #15