
[Phase 4] Evaluate Subagent Context Isolation per Task #68

@frankbria

Summary

Evaluate whether worker agents should receive fresh, isolated context per task rather than maintaining persistent context across multiple tasks, to prevent context pollution.

Background: State of the Art

From Philipp Schmid's "The Rise of Subagents":

"This addresses a key limitation of monolithic AI agents: context pollution. When a single, big and complex agent handles many tasks, its context window, number of tools, can become cluttered and less reliable."

"A subagent... has its own isolated context window."

The subagent pattern from Claude Code and Poke.com emphasizes:

  • Fresh context per task: Subagent spawns with clean slate
  • Task-specific tools: Only tools relevant to this specific task
  • Synthesized output: Returns summary to orchestrator, not raw context

This prevents Task 3's debugging context from polluting Task 10's feature implementation.
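The three properties above can be sketched in a few lines. This is only an illustration of the pattern, not CodeFRAME's or Claude Code's actual API: `TaskSpec`, `Subagent`, `ALL_TOOLS`, and `spawn_and_run` are hypothetical names, and the "tools" are stand-in functions rather than real model or tool calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskSpec:
    """Hypothetical task description handed down by the orchestrator."""
    task_id: int
    instructions: str
    tool_names: list[str]


@dataclass
class Subagent:
    """Hypothetical subagent that owns its own message list and tool subset."""
    tools: dict[str, Callable[[str], str]]
    messages: list[str] = field(default_factory=list)

    def run(self, task: TaskSpec) -> str:
        self.messages.append(f"task: {task.instructions}")
        for name, tool in self.tools.items():
            # Raw tool output stays inside this subagent's context.
            self.messages.append(f"{name}: {tool(task.instructions)}")
        # Synthesized output: a short summary, never the raw message list.
        return f"task {task.task_id} done ({len(self.tools)} tool calls)"


# Stand-in tool registry (assumption: real tools would touch files, tests, etc.).
ALL_TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda arg: f"read {arg}",
    "run_tests": lambda arg: f"tested {arg}",
    "lint": lambda arg: f"linted {arg}",
}


def spawn_and_run(task: TaskSpec) -> str:
    # Task-specific tools: only the tools this task names.
    tools = {name: ALL_TOOLS[name] for name in task.tool_names}
    # Fresh context per task: a new Subagent with an empty message list.
    agent = Subagent(tools=tools)
    return agent.run(task)
```

Because each call to `spawn_and_run` builds a new `Subagent`, nothing from one task's `messages` can reach the next task; the orchestrator only ever sees the returned summary string.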

Current State in CodeFRAME

CodeFRAME's architecture:

Lead Agent → Backend Worker / Frontend Worker / Test Worker / Review Worker

Workers appear to be persistent entities that handle multiple tasks over a session. This raises three open questions:

  1. Context accumulation: Does the Backend Worker carry context from Task 1 into Task 5?
  2. Tool consistency: Are worker tool sets static or task-adapted?
  3. Context handoff: When Lead assigns a new task, is worker context reset or appended?

If workers do accumulate context, then late in a long session a worker's context window could contain:

  • Irrelevant error messages from early tasks
  • Stale code snippets that have since changed
  • Conflicting patterns from different features
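To make the accumulation concern concrete, here is a minimal sketch of a worker that appends every task's transcript to one shared context. `PersistentWorker` is a hypothetical model, not CodeFRAME's actual worker class, and word count is used only as a crude stand-in for token count.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentWorker:
    """Hypothetical worker whose context is never reset between tasks."""
    # Each entry records which task produced the line.
    context: list[tuple[int, str]] = field(default_factory=list)

    def handle(self, task_id: int, transcript: list[str]) -> None:
        # Appending, never clearing, is what lets old tasks leak forward.
        self.context.extend((task_id, line) for line in transcript)

    def size_in_words(self) -> int:
        # Crude proxy for token count (assumption: no real tokenizer here).
        return sum(len(line.split()) for _, line in self.context)

    def stale_fraction(self, current_task_id: int) -> float:
        """Share of context entries that belong to earlier tasks."""
        if not self.context:
            return 0.0
        stale = sum(1 for tid, _ in self.context if tid != current_task_id)
        return stale / len(self.context)
```

Logging `size_in_words()` and `stale_fraction()` after every assignment would give a rough growth curve per worker, which is one possible pollution indicator for the audit described below.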

Investigation Tasks

  1. Audit current worker lifecycle

    • Trace context window contents across multiple task assignments
    • Measure context size growth per worker over a session
    • Identify what persists vs. what's task-specific
  2. Measure context pollution impact

    • Compare task success rates: early tasks vs. late tasks in long sessions
    • Look for hallucination patterns referencing old task context
    • Check if workers reference code/files that are no longer relevant
  3. Design isolation options

    Option A: Fresh context per task

    • Spawn new worker instance for each task
    • Pass only task-specific context from Lead
    • Worker returns synthesized result, then terminates
    • Pro: Maximum isolation. Con: No cross-task learning.

    Option B: Selective context reset

    • Clear task-specific context between assignments
    • Retain project-level context (architecture, patterns)
    • Pro: Balanced. Con: Harder to implement clean boundaries.

    Option C: Context summarization

    • After each task, summarize learnings into compact form
    • Clear detailed context, keep summaries
    • Pro: Preserves learning. Con: Summarization cost/accuracy.
  4. Prototype and measure

    • Implement Option A for one worker type
    • Compare task success rate and token usage vs. current approach
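One way to compare Options A, B, and C is to express each as a function applied to the worker's context between assignments. The sketch below assumes a context split into project-level notes, per-task messages, and summaries; `WorkerContext`, the three `reset_*` functions, and `naive_summary` are hypothetical names, and the summarizer is a placeholder for what would presumably be a model call. Keeping project notes under Option C is also an assumption, since the option description does not say either way.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WorkerContext:
    """Hypothetical split of a worker's context into three buckets."""
    project_notes: list[str] = field(default_factory=list)  # architecture, patterns
    task_messages: list[str] = field(default_factory=list)  # detail for current task
    summaries: list[str] = field(default_factory=list)      # compact past learnings


def reset_fresh(ctx: WorkerContext) -> WorkerContext:
    """Option A: discard everything; the next task starts from a clean slate."""
    return WorkerContext()


def reset_selective(ctx: WorkerContext) -> WorkerContext:
    """Option B: drop task-specific messages, keep project-level notes."""
    return WorkerContext(project_notes=list(ctx.project_notes))


def reset_summarize(
    ctx: WorkerContext, summarize: Callable[[list[str]], str]
) -> WorkerContext:
    """Option C: replace detailed messages with one summary per finished task."""
    return WorkerContext(
        project_notes=list(ctx.project_notes),  # assumption: notes survive under C
        summaries=ctx.summaries + [summarize(ctx.task_messages)],
    )


def naive_summary(messages: list[str]) -> str:
    # Placeholder summarizer: keeps only the last message of the task.
    return messages[-1] if messages else ""
```

Framing the options as interchangeable reset functions would let the prototype swap policies per worker type without touching the worker loop itself.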

Success Criteria

  • Documented current worker context lifecycle
  • Measured context size growth and pollution indicators
  • Implemented isolation mechanism (if beneficial)
  • A/B comparison showing impact on task success rate
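The A/B comparison could be as simple as tagging each task result with the arm it ran under and aggregating per arm. This is a sketch under assumptions: `TaskResult`, the arm labels, and `summarize_arm` are hypothetical, and how "success" is judged is left to the evaluation harness.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """Hypothetical record of one task run in the experiment."""
    arm: str          # e.g. "persistent" (current approach) or "isolated" (Option A)
    succeeded: bool
    tokens_used: int


def summarize_arm(results: list[TaskResult], arm: str) -> dict[str, float]:
    """Success rate and mean token usage for one experiment arm."""
    rows = [r for r in results if r.arm == arm]
    if not rows:
        raise ValueError(f"no results recorded for arm {arm!r}")
    return {
        "n": len(rows),
        "success_rate": sum(r.succeeded for r in rows) / len(rows),
        "mean_tokens": mean(r.tokens_used for r in rows),
    }
```

Running the same task list through both arms and comparing the two summaries covers both metrics named in the prototype step: task success rate and token usage.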

Design Considerations

Orchestrator synthesis: The Lead Agent should receive only synthesized results from workers, not raw context. This keeps the orchestrator's context clean.

Learning preservation: If workers gain valuable insights ("this codebase uses async patterns everywhere"), those insights should persist at the project level rather than pollute individual task contexts.

Cost tradeoff: Fresh contexts mean no accumulated cache benefits, so we need to measure the token cost of isolation against the current approach.
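A back-of-the-envelope model can frame what needs measuring. Everything here is an assumption for illustration: one model call per task, a fixed `base` prompt, a fixed `per_task` amount of new context, and a `cached_rate` at which previously seen prefix tokens are billed relative to new tokens (1.0 means no cache discount). Real sessions involve multiple turns per task, and actual cache pricing varies by provider.

```python
def persistent_cost(n_tasks: int, base: int, per_task: int, cached_rate: float) -> float:
    """Billed input tokens when one worker keeps appending to its context."""
    total = 0.0
    for k in range(1, n_tasks + 1):
        # Everything sent before task k: base prompt plus k-1 earlier tasks.
        prefix = base + (k - 1) * per_task
        if k == 1:
            total += prefix + per_task  # nothing is cached on the first call
        else:
            total += prefix * cached_rate + per_task  # prefix billed at cached rate
    return total


def isolated_cost(n_tasks: int, base: int, per_task: int) -> float:
    """Billed input tokens when every task starts from a fresh context."""
    # Each task pays full price for the base prompt plus its own context.
    return float(n_tasks * (base + per_task))
```

Under this model the persistent worker's context grows with every task, so whether it ends up cheaper than isolation depends on how steep the cache discount is and how long the session runs; that crossover is what the measurement should locate.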

Metadata

Labels

  • Future: Deferred - beyond v1/v2 scope, consider for future versions
  • architecture: System architecture and design patterns
  • context-engineering: Context window management and optimization
  • enhancement: New feature or request
  • priority:medium
