feat(hierarchy): General-purpose hierarchical behavior system for multi-tier agent orchestration #561
Draft
aspotton wants to merge 49 commits into spacedriveapp:main from
Add hierarchical multi-agent delegation: boss → planning-lead → builders.
- TaskCreateTool registered in worker tool set for builder escalation
- Boss agent preset (meta.toml, SOUL.md, IDENTITY.md, ROLE.md)
- Planning-lead agent preset (meta.toml, SOUL.md, IDENTITY.md, ROLE.md)
- Builder escalation prompt fragment with escalation_chain loop protection
- README.md and agents.mdx documentation with config examples
- Preset test assertion updated from 9 to 11

Escalation flow: blocked builder creates task for planning-lead via task_create; planning-lead resolves or escalates to boss via send_agent_message. Escalation chain metadata prevents infinite loops.
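The loop protection mentioned in this commit can be pictured as a membership check on the chain of agents a task has already passed through. The sketch below is illustrative only: it assumes the `escalation_chain` metadata is an ordered list of agent IDs, and the function name `extend_escalation_chain` is hypothetical, not taken from the PR.

```rust
/// Hypothetical loop check (not the PR's actual code): returns the chain
/// extended with `next`, or an error if `next` already appears in it,
/// which would mean the escalation is going in circles.
pub fn extend_escalation_chain(chain: &[String], next: &str) -> Result<Vec<String>, String> {
    if chain.iter().any(|id| id == next) {
        return Err(format!("escalation loop: {next} is already in the chain"));
    }
    let mut extended = chain.to_vec();
    extended.push(next.to_string());
    Ok(extended)
}
```

Under these assumptions, a builder escalating to planning-lead extends the chain, while an attempt to route the same task back to the builder is rejected.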
…ng-lead ROLE.md Add 'Agent Link Configuration' section as the first section in both ROLE.md files with the exact TOML config for the hierarchical link: from=boss-agent (superior), to=planning-lead (subordinate). Helps the cortex set up links correctly during the factory flow instead of reversing from/to direction.
Boss now checks org chart before acting on any request:
1. Check subordinates/peers in org context
2. Classify request as strategic vs execution
3. Delegate to matching subordinate via send_agent_message
4. Only handle directly if strategic or no subordinate is suited

Prevents boss from executing work that subordinates should handle.
Planning-lead now checks org chart before acting on any request:
1. Check superior/subordinates/peers in org context
2. Classify request as coordination vs execution
3. Break into tasks and spawn builder workers for execution
4. Only handle directly if planning/coordination

Prevents planning-lead from executing work that builders should handle.
…anning-lead ROLE.md

Both agents now have:
- Request Triage: check org chart before acting, delegate to subordinates
- Task Completion Handling: relay results to users, re-delegate failures

README.md expanded with task completion flow and agent behavior rules. Fixes: boss was executing work itself instead of delegating, and ignoring task completion notifications instead of relaying results to users.
Update Research Analyst, Project Manager, and Engineering Assistant presets
to support both standalone and hierarchical operation modes.
### Preset Updates
- **Research Analyst**: Added Request Triage, Analysis Execution, and Task
Completion Handling sections. Enables research delegation with evidence
reporting.
- **Project Manager**: Added Request Triage, Synthesis & Coordination, and
Task Completion Handling sections. Enables coordination across specialists
with synthesized status reporting.
- **Engineering Assistant**: Added Request Triage, Implementation Execution,
and Task Completion Handling sections. Enables implementation delegation
with test result reporting.
All presets now include Operating Modes documentation in IDENTITY.md and
personality updates in SOUL.md for hierarchical context awareness.
### Documentation
- **README.md**: Enhanced Boss Agent Hierarchy section with workflow details,
escalation flow diagrams, and task completion flow explanation.
- **agents.mdx**: Added comprehensive Specialized Agent Roles section
documenting all three agent types in both modes with behavioral rules.
- **Learnings**: Created .sisyphus/notepads/boss-agent-hierarchy/learnings.md
with dual-mode design patterns, key principles, and common pitfalls.
### Hierarchy Structure
Boss Agent → Planning Lead → [Research Analyst, Engineering Assistant, Project Manager]
Agents automatically triage requests based on source (user vs superior) and
adapt behavior accordingly, maintaining separation of concerns while enabling
recurring improvement workflows with institutional memory.
…le hierarchies

Refactor Planning Lead to use capability-based delegation instead of hardcoded agent names, enabling support for any organizational hierarchy structure.
### Changes
- **ROLE.md**: Added Capability-Based Delegation section with:
  - Discover Subordinates: Check org graph for available agents
  - Classify Tasks: Analysis, Implementation, or Coordination categories
  - Match by Tools: Delegate based on subordinate tooling (file read/write, shell, browser, task tools)
  - Fallback Strategy: Spawn builder workers when no suitable subordinate exists
  - Tool-Based Discovery: Verify capabilities before delegation
- **IDENTITY.md**: Added Operating Modes and Capabilities sections:
  - Standalone Mode: No subordinates → spawn workers directly with required tools
  - Hierarchical Mode: Subordinates exist → delegate based on tool matching
  - Explicit tool-to-task mapping for Analysis, Implementation, and Coordination agents
- **SOUL.md**: Updated opening to emphasize adaptive delegation based on available resources
### Benefits
- **Flexibility**: Works with any hierarchy (standard, custom, or mixed)
- **Standalone Compatibility**: Automatically spawns workers when no subordinates exist
- **Scalability**: Adapts to new agent types without configuration changes
- **Discovery-Driven**: Agents discover capabilities via org graph, not hardcoded names
### Backward Compatibility
- Fully compatible with existing Boss Agent Hierarchy pattern
- Maintains dual-mode operation (Standalone and Hierarchical)
- No breaking changes to existing agent presets or configurations
Add optional DelegationConfig parameter to create_worker_tool_server. When present, includes send_agent_message tool for agent-to-agent delegation. Backward compatible: existing callers pass None, behavior unchanged.
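A minimal sketch of the optional-parameter pattern this commit describes, assuming a much simplified `DelegationConfig` and a tool server reduced to a list of tool names. The field `agent_id`, the function `worker_tool_names`, and the base tool names are all hypothetical; only `DelegationConfig` and `send_agent_message` come from the commit message.

```rust
/// Simplified stand-in for the PR's `DelegationConfig`; the real struct is
/// not shown here, and this single field is an assumption.
pub struct DelegationConfig {
    pub agent_id: String,
}

/// Returns the tool names a worker would receive. The base tools listed
/// here are illustrative placeholders, not the project's actual tool set.
pub fn worker_tool_names(delegation: Option<&DelegationConfig>) -> Vec<&'static str> {
    let mut tools = vec!["shell", "file"];
    // Delegation tooling is opt-in: callers that pass `None` get exactly
    // the same tool set as before.
    if delegation.is_some() {
        tools.push("send_agent_message");
    }
    tools
}
```

Keeping the parameter an `Option` is what makes the change backward compatible: existing call sites only need to pass `None`.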
Add task_metadata field to Worker struct to carry task metadata from the task store. Worker::run() uses this to detect delegated tasks (via delegated_by metadata) and enable send_agent_message tool.
Add standalone function to build org context for any agent from links, humans, and agent_names. Replicates channel.rs:build_org_context logic for use by cortex task pickup.
When cortex picks up a task with delegated_by metadata, inject identity files and org_context into the worker prompt. Pass task_metadata to Worker so it can enable send_agent_message tool for further delegation.
All 4 implementation commits compile cleanly. Pre-existing round_char_boundary errors (26) are unrelated to these changes. just/rustfmt unavailable in this environment; manual verification performed.

Reviewer findings (P1-P3) assessed:
- P1: ConversationLogger::new is a trivial pool wrapper, so no perf impact
- P2: Duplication explicitly required by plan; refactoring out of scope
- P3: agent_id.clone() necessary due to memory_save_with_events ownership
…ker ambiguity Remove 'or can spawn workers' from Request Triage step 4. Subordinate delegation is now the ONLY option when subordinates exist. Worker spawning is a fallback only when no subordinate has required capabilities. This fixes the Planning Lead repeatedly spawning generic workers instead of delegating to the Engineering Assistant.
Planning Lead:
- Add 'Environmental Blockers' section: handle sandbox/credential blockers gracefully instead of repeated escalation loops
- Add 'No Status Check Tasks' section: stop spawning workers to check status of other workers; use task store directly

Boss Agent:
- Add 'Trust Your Subordinates' section: stop creating parallel unblock tasks when Planning Lead escalates; provide info or ask user instead

Fixes task bounce loop where 15+ tasks were created for a single repo access issue due to repeated escalations and status checks.
… Analyst, Project Manager

Add Environmental Blockers and No Status Check Tasks sections to all three remaining agents so the entire hierarchy behaves consistently:
- Engineering Assistant: Stop spawning workers for analysis/synthesis, handle environmental blockers gracefully, no status check tasks
- Research Analyst: Handle environmental blockers gracefully, no status check tasks
- Project Manager: Handle environmental blockers gracefully, no status check tasks, removed 'or spawn workers' ambiguity from Request Triage

Now all 5 agents (Boss, Planning Lead, Engineering Assistant, Research Analyst, Project Manager) have consistent anti-bounce behavior.
Fix the result relay problem where Planning Lead marks itself as 'done/blocked' before subordinate agents complete their delegated tasks.

New rules:
- DO NOT mark parent task done until all delegated subtasks are complete
- Read and synthesize subordinate results from task store
- Report synthesized results to superior before marking done
- Delegating to a subordinate is NOT a blocker; it's the correct way to work

This fixes the issue where Engineering Assistant's excellent repo analysis was lost because Planning Lead had already marked itself as 'blocked'.
Add the result relay rule to Boss Agent, Engineering Assistant, Research Analyst, and Project Manager (Planning Lead already had it). Prevents the pattern where a parent task marks itself 'done/blocked' before subordinate agents complete, losing their results.

All 5 agents now consistently:
- Wait for delegated subtasks to complete before marking done
- Read and synthesize subordinate results
- Report synthesized results to superior
- Treat delegation as progress, not a blocker
The delegated worker prompt was putting identity BEFORE the worker template, causing the LLM to see 'You are a worker' AFTER 'You are the Planning Lead' and default to the generic worker behavior. Now identity + org_context are appended AFTER the worker template, so the agent's specific identity takes precedence over the generic worker instructions.

Before: identity → org_context → worker_template (LLM ignores identity)
After: worker_template → identity → org_context (LLM follows identity)
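The "After" ordering can be sketched as a plain string assembly. This is an illustration under stated assumptions rather than the code in cortex.rs: the function name `assemble_delegated_prompt`, the blank-line separator, and the skipping of empty sections are all hypothetical.

```rust
/// Joins the prompt sections in the order the commit describes: generic
/// worker template first, then the agent's identity, then its org context,
/// so the specific identity is the last thing the model reads.
/// Empty sections are skipped (an assumption of this sketch).
pub fn assemble_delegated_prompt(worker_template: &str, identity: &str, org_context: &str) -> String {
    [worker_template, identity, org_context]
        .iter()
        .filter(|section| !section.trim().is_empty())
        .copied()
        .collect::<Vec<&str>>()
        .join("\n\n")
}
```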
Add task_list and task_update tools to workers that have delegation enabled (DelegationConfig present). This allows delegated workers to:
- Check subordinate task status via task_list
- Wait for subordinate tasks to complete before marking themselves done
- Read and synthesize subordinate results

Fixes the issue where Planning Lead marked itself 'done' immediately after delegating to Engineering Assistant, losing the results.
Update all 5 presets to explicitly mention the task_list tool so the LLM knows how to poll for subordinate task completion.

Before: 'Check the task store' (vague; the LLM didn't know which tool)
After: 'Use the task_list tool to check the status of tasks you created'
TaskUpdateTool doesn't have a new() constructor — it has for_worker() which requires a worker_id parameter. Fix the delegated worker tool registration to use the correct constructor.
TaskListTool::new takes impl Into<String> but AgentId (Arc<str>) doesn't implement Into<String>. Use .to_string() to convert properly.
agent_id is moved into memory_save_with_events, so it's not available for the delegation tools. Use agent_id_for_delegation (cloned before the move) for SendAgentMessageTool, TaskListTool, and TaskUpdateTool.
Add task_get tool that allows workers to read full details of specific tasks by task number. Access is restricted to tasks owned by or created by the calling agent (prevents reading superior's tasks). Registered in create_worker_tool_server for delegated workers only (inside the DelegationConfig block). Solves the result relay problem where Planning Lead couldn't read Engineering Assistant's findings after delegation — it could poll task_list for status but had no way to read the actual output.
Update README.md, AGENTS.md, and agents.mdx to document:
- Cortex delegation detection via delegated_by metadata
- Identity injection (SOUL.md, IDENTITY.md, ROLE.md) after worker template
- org_context injection showing subordinates/superiors/peers
- DelegationConfig adding send_agent_message, task_list, task_get, task_update
- task_get access control (only reads tasks owned by or created by caller)
- Anti-bounce rules: Environmental Blockers, No Status Check Tasks, Wait for Subordinate Results, Trust Your Subordinates
- Complete delegation flow: Boss → Planning Lead → Engineering Assistant → task_list polling → task_get reading → synthesis → report
- Updated config examples with preset field
Fix Boss worker bounce behavior where workers:
- Poll task_list 15+ times with different filters before giving up
- Report permission errors as final outcomes instead of waiting
- Create duplicate tasks for the same objective
- Try to do project management work (tracking task progress)

New rules:
- Delegate ONCE and wait: do NOT create multiple tasks for same objective
- Do NOT poll excessively: call task_list ONCE, then wait
- Permission errors are NOT failures: another agent is handling it
- Trust completion notifications: cortex auto-notifies on completion
- One delegation at a time: let Planning Lead handle chain of command
- Do NOT create follow-up tasks for subordinates
Add patience rules to Planning Lead, Engineering Assistant, Research Analyst, and Project Manager (Boss Agent already had them).

All 5 agents now consistently:
- Delegate ONCE and wait: no duplicate tasks
- Do NOT poll excessively: call task_list ONCE then wait
- Treat permission errors as progress, not failures
- Trust completion notifications from cortex
- One delegation at a time: respect chain of command
- Do NOT create follow-up tasks for subordinates
Add is_assignee check to access control so agents can read tasks where assigned_agent_id matches their agent ID, in addition to tasks they own or created. Fixes the delegation bounce loop where Engineering Assistant couldn't read tasks delegated to it by Planning Lead.
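The resulting access rule (owner, creator, or assignee) can be sketched as a single predicate. The `Task` struct below is a hypothetical subset of the real task record; only `assigned_agent_id` is named in the commit message, and the other field names and the function name `can_read_task` are assumptions.

```rust
/// Hypothetical subset of a task record. Only `assigned_agent_id` is named
/// in the commit message; the other field names are assumptions.
pub struct Task {
    pub owner_agent_id: String,
    pub created_by: String,
    pub assigned_agent_id: Option<String>,
}

/// A caller may read a task it owns, created, or has been assigned.
pub fn can_read_task(task: &Task, caller: &str) -> bool {
    let is_owner = task.owner_agent_id == caller;
    let is_creator = task.created_by == caller;
    let is_assignee = task.assigned_agent_id.as_deref() == Some(caller);
    is_owner || is_creator || is_assignee
}
```

With a rule of this shape, an agent that was handed a task by its superior can read that task, while unrelated agents remain locked out.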
Add originating_channel field to DelegationConfig so delegated workers can propagate the conversation ID through the delegation chain. This enables the cortex's notify_delegation_completion function to inject completion notifications back to the delegating agent. Without this, originating_channel is null for all delegated tasks, causing completion notifications to be skipped and forcing agents to poll repeatedly for task status.
Set originating_channel to the channel's conversation ID when creating the send_agent_message tool. This ensures all tasks created by the channel have the correct originating_channel, which propagates through the delegation chain and enables the cortex's notify_delegation_completion to inject completion notifications back to the delegating agent. Without this, originating_channel is null and completion notifications are silently skipped.
…bounce

Add parent_task_number to DelegationConfig and SendAgentMessageTool so that when a delegated worker creates sub-tasks, they carry a reference to their parent task. When the cortex completes a child task, it automatically marks the parent task as done.

Changes:
- DelegationConfig: added parent_task_number field
- SendAgentMessageTool: added with_parent_task_number setter and includes parent_task_number in task metadata
- worker.rs: extracts task_number from task_metadata and passes it as parent_task_number when creating sub-tasks
- cortex.rs: auto-completes parent task when child task completes

This eliminates the bounce pattern where Planning Lead workers created follow-up tasks (spacedriveapp#189, spacedriveapp#191, spacedriveapp#192) to check on delegated task spacedriveapp#188.
Add programmatic hierarchical behavior rules to agent prompts based on link structure, eliminating dependency on ROLE.md content for delegation compliance. When an agent has hierarchical links, behavioral rules (anti-bounce, synthesis, escalation) are automatically injected.
- New template: prompts/en/fragments/hierarchical_rules.md.j2. Agent-agnostic rules using dynamic superior/subordinate names
- engine.rs: HierarchicalRulesContext, HierarchicalLinkedAgent structs, build_hierarchical_rules_for_agent(), render_hierarchical_rules()
- text.rs: template registration
- channel.md.j2: hierarchical_rules placeholder after org_context
- channel.rs: build_hierarchical_rules() method, prompt assembly
- cortex.rs: inject rules into delegated worker prompts
- api/channels.rs: pass None for hierarchical_rules
- presets: remove duplicate sections from boss-agent and planning-lead ROLE.md files (Patience and Synchronization, Wait for Subordinate Results, No Status Check Tasks)
…ical agents Previously, hierarchical rules were only injected into delegated workers (workers with delegated_by metadata). Agents with hierarchical links could bypass this by spawning workers directly, resulting in workers with zero hierarchical context (no org chart, no delegation rules, no subordinate awareness). Now, hierarchical rules are injected into ANY worker spawned by an agent with hierarchical links, regardless of delegation metadata. Identity and org_context injection remains delegated-only (correct behavior).
The previous commit referenced agent_has_hierarchical_links() which was never defined. Replace with inline links_for_agent().iter().any() check. Also removed a duplicate injection block that was accidentally introduced.
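The inline check described here might look roughly like the following. The `AgentLink` and `LinkKind` types and the signature of `links_for_agent` are assumptions made for illustration; the commit message confirms only the `links_for_agent().iter().any()` shape.

```rust
/// Hypothetical link model: `from` is the superior and `to` the subordinate
/// for hierarchical links, matching the from/to direction the ROLE.md
/// commits describe.
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum LinkKind {
    Hierarchical,
    Peer,
}

pub struct AgentLink {
    pub from: String,
    pub to: String,
    pub kind: LinkKind,
}

/// All links that touch the given agent, in either direction.
pub fn links_for_agent<'a>(links: &'a [AgentLink], agent_id: &str) -> Vec<&'a AgentLink> {
    links
        .iter()
        .filter(|link| link.from == agent_id || link.to == agent_id)
        .collect()
}

/// Inline-style check: does the agent take part in any hierarchical link?
pub fn has_hierarchical_links(links: &[AgentLink], agent_id: &str) -> bool {
    links_for_agent(links, agent_id)
        .iter()
        .any(|link| link.kind == LinkKind::Hierarchical)
}
```

Under this model, a worker spawned by any agent for which the check returns true would receive the hierarchical rules, whether or not its task carries `delegated_by` metadata.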
…chical agents
- Update agents.mdx: document hierarchical behavior injection, clarify that send_agent_message (not spawn_worker) is the correct delegation tool, and add explicit warning about spawn_worker bypassing the delegation chain and breaking completion notifications
- Update hierarchical_rules.md.j2: add explicit rules for agents with subordinates to use send_agent_message, wait for system notifications, and synthesize results before responding to users
- Update spawn_worker_description.md.j2: add warning for hierarchical agents to use send_agent_message instead of spawn_worker for delegation
Agents with hierarchical links now see subordinate/superior role descriptions from config, enabling informed delegation decisions.
- Add agent_roles map to AgentDeps (parallel to agent_names)
- Populate agent_roles at startup from AgentInfo.role
- Add role field to HierarchicalLinkedAgent struct
- Pass agent_roles through build_org_context_for_agent() and build_hierarchical_rules_for_agent()
- Update org_context template to show agent roles
- Update hierarchical_rules template to list subordinates/superiors with their role descriptions
- Update channel.rs and cortex.rs call sites to pass agent_roles
…rengthen anti-bounce rules
- Increase completion notification truncation from 500 to 3000 chars so bosses receive meaningful summaries instead of truncated excerpts
- Add guidance in truncation message to use task_get for full results
- Strengthen hierarchical rules template with explicit forbidden status-check task examples including 'time-sensitive reply' and 'progress check' patterns that were observed in practice
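A sketch of character-safe truncation with the new limit, assuming the limit counts Unicode scalar values rather than bytes. The function name `truncate_summary`, the constant name, and the exact wording of the hint are hypothetical; only the 3000-char limit and the `task_get` guidance come from the commit message.

```rust
/// The new limit from the commit message (previously 500).
pub const MAX_NOTIFICATION_CHARS: usize = 3000;

/// Keeps at most `max_chars` characters of `summary`. When anything is cut
/// off, appends a hint pointing at `task_get`. Slicing at an index taken
/// from `char_indices` guarantees the cut lands on a UTF-8 char boundary.
pub fn truncate_summary(summary: &str, max_chars: usize) -> String {
    match summary.char_indices().nth(max_chars) {
        // Fewer than or exactly `max_chars` characters: nothing to cut.
        None => summary.to_string(),
        Some((byte_idx, _)) => format!(
            "{}\n[truncated; use task_get for full results]",
            &summary[..byte_idx]
        ),
    }
}
```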
… polling

Add explicit instructions to hierarchical rules telling agents to STOP calling tools after delegating. The cortex will inject a system notification when subordinates complete, so no polling is needed.
- Add 'respond with status and STOP calling tools' rule
- Add 'call task_list ONCE then STOP' rule
- Clarify that cortex injects completion notifications automatically
Add a blocking tool that waits for a delegated task to reach a terminal state (done/failed/backlog) with a configurable timeout (default 600s). Agents call this once instead of polling task_get/task_list 15+ times.
- New tool: src/tools/wait_for_task.rs. Polls every 5s with exponential backoff, returns when task completes or timeout is reached. Includes same access control as task_get.
- Registered in tools.rs mod, pub use, and DelegationConfig
- Updated hierarchical_rules.md.j2 to reference wait_for_task
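The core wait loop can be sketched with the standard library alone. This is a synchronous illustration, not the tool in src/tools/wait_for_task.rs: the status lookup is abstracted as a closure, the `TaskStatus` enum and the function signature are assumptions, and a fixed poll interval is used for simplicity.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical status enum; the terminal states follow the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TaskStatus {
    InProgress,
    Done,
    Failed,
    Backlog,
}

impl TaskStatus {
    pub fn is_terminal(self) -> bool {
        matches!(self, TaskStatus::Done | TaskStatus::Failed | TaskStatus::Backlog)
    }
}

/// Blocks until `fetch_status` reports a terminal state or `timeout`
/// elapses. Returns `Some(status)` on completion and `None` on timeout.
pub fn wait_for_task<F>(mut fetch_status: F, poll_interval: Duration, timeout: Duration) -> Option<TaskStatus>
where
    F: FnMut() -> TaskStatus,
{
    let deadline = Instant::now() + timeout;
    loop {
        let status = fetch_status();
        if status.is_terminal() {
            return Some(status);
        }
        let now = Instant::now();
        if now >= deadline {
            return None;
        }
        // Never sleep past the deadline.
        sleep(poll_interval.min(deadline - now));
    }
}
```

With the defaults named in the commit, a caller would pass a 5-second interval and a 600-second timeout, so the agent makes one tool call while the polling happens inside the tool.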
…before signaling completion
e4db114 to
43d2ccf
Compare
Spacebot supports running multiple agents with different roles and responsibilities. Until now, however, the system broke down when agents needed to work together in a hierarchy, where Agent A delegates to Agent B, which delegates to Agent C. Agents didn't know how to delegate properly, didn't report results back up the chain, and created endless status-check loops.
This PR implements a general-purpose hierarchical behavior system that works for any tiered agent structure, not just specific presets. Whether you have a 2-tier setup (Manager → Worker) or a 5-tier org chart (Director → Manager → Team Lead → Engineer → Intern), the system automatically equips every agent with the right rules based on its position in the hierarchy.
The Problem (Generalized)
When agents are linked hierarchically:
- `task_get` or `task_list` called 15+ times instead of waiting for notifications

The result: broken delegation chains, missing findings, and agents creating tasks to check on other tasks.
The Solution
A programmatic hierarchical behavior injection system that:
How It Works (Any Hierarchy)
Key: Every agent in the chain automatically gets:
Core Features
1. Position-Based Rule Injection
Rules are injected based on link structure, not agent names or presets:
2. Universal Anti-Bounce Protection
All hierarchical agents follow these rules:
- `wait_for_task()`: Block internally instead of polling
- `task_list()` calls: Call once with broad filters, then wait for notifications

3. Mandatory Findings Reporting (Critical)
Before signaling completion, agents MUST:
- `task_update(metadata={...})` with complete findings
- `set_status(kind: "outcome")`

Why this matters: Without metadata, superiors can't see the work and will create follow-up tasks. This prevents the "I can't see your findings" infinite loop.
4. Synthesis Requirements
Agents with subordinates must:
5. Role Awareness
Every agent sees:
Example org context:
Configuration (Any Hierarchy)
Result: Each agent automatically gets rules for its position:
What Changed
New Files:
- `prompts/en/fragments/hierarchical_rules.md.j2`: Agent-agnostic hierarchical rules template (works for ANY hierarchy)
- `src/tools/wait_for_task.rs`: Blocking tool that eliminates polling spam

Modified Files:
- `src/prompts/engine.rs`: `HierarchicalRulesContext`, `build_hierarchical_rules_for_agent()`
- `src/prompts/text.rs`: Template registration
- `prompts/en/channel.md.j2`: `{{ hierarchical_rules }}` placeholder
- `src/agent/channel.rs`: Build and pass hierarchical rules
- `src/agent/cortex.rs`: Inject rules into delegated worker prompts
- `presets/boss-agent/ROLE.md`: Removed duplicate rules (now injected)
- `presets/planning-lead/ROLE.md`: Removed duplicate rules (now injected)

Use Cases (Beyond Boss/Planning Lead)
This system enables any hierarchical workflow:
Research Pipeline:
Content Production:
Customer Support:
Software Development:
Media Production:
In every case, the system automatically:
Testing Evidence
Before (any hierarchy):
After (any hierarchy):
- `send_agent_message`

Why This Matters
This isn't a "boss agent fix"; it's a general-purpose orchestration layer for multi-agent systems. Now you can:
Design Decisions
Why prompt-level rules instead of runtime enforcement?
Why metadata-based reporting?
Why `wait_for_task` instead of polling?

Future Work (Not Included)
- `wait_for_task` (currently fixed 5s interval)

Key Message: This is a general-purpose hierarchical system. It works for ANY tiered agent structure, not just the boss/planning-lead/assistant example we tested. If you can link agents hierarchically, they'll automatically get the right rules for their position and work together seamlessly.