Refine 'Tools as Primitives' principle to acknowledge valid use cases for logic in tools #71

@kieranklaassen

Description

Summary

The current agent-native-architecture skill takes a strong position that tools should be pure primitives with all behavior defined in prompts. After auditing CORA against this principle, I believe we need a more nuanced stance that acknowledges valid use cases for logic in tool descriptions.

Current Position (SKILL.md lines 26-33)

```markdown
### Tools Provide Capability, Not Behavior

Tools should be primitives that enable capability. The prompt defines what to do with that capability.

**Wrong:** `generate_dashboard(data, layout, filters)` — agent executes your workflow
**Right:** `read_file`, `write_file`, `list_files` — agent figures out how to build a dashboard
```

The Case FOR Logic in Tool Descriptions

1. Smaller Master Prompts

Tool descriptions are modular context. When you include search strategy in SearchEmails:

```ruby
description <<~DESC
  TRIPLE SEARCH PHILOSOPHY: LITERAL → CONTEXTUAL → INTERPRETIVE
  • 🔍 LITERAL: Use plain keywords...
  • 🧠 CONTEXTUAL: Add synonyms, related terms...
  ...
DESC
```

This content is only loaded when the tool is available. If you move it to the master prompt, it's always present, even when searching isn't relevant to the conversation.

Trade-off: Tool descriptions = JIT context loading. Master prompt = always-present context.
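The JIT side of that trade-off can be sketched concretely. This is a hypothetical illustration — the `SearchEmails` class and `tool_context` helper below are invented for the sketch, not CORA's actual API:

```ruby
# Hypothetical sketch: the strategy text lives on the tool, so it
# enters the context window only when the tool is registered.
class SearchEmails
  def self.description
    <<~DESC
      TRIPLE SEARCH PHILOSOPHY: LITERAL -> CONTEXTUAL -> INTERPRETIVE
      1. LITERAL: plain keywords
      2. CONTEXTUAL: synonyms and related terms
      3. INTERPRETIVE: the intent behind the query
    DESC
  end
end

# Assemble context only from the tools actually enabled for this turn.
def tool_context(tools)
  tools.map(&:description).join("\n---\n")
end

tool_context([SearchEmails]) # strategy text present
tool_context([])             # strategy text costs zero tokens
```

With no tools registered, the strategy never enters the context window — the master prompt pays for it only when searching is in play.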

2. Tools as Composable Behavior Modules

Tools can be enabled/disabled dynamically. This means:

  • Include SearchEmails with its search strategy → agent knows HOW to search
  • Exclude SearchEmails → agent doesn't have search capability OR strategy
  • Swap in SearchEmailsSimple → different strategy, no code change

The tool + its description is a composable unit. Separating them (primitive tool + prompt behavior) breaks this composability.
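That composability can be sketched in a few lines. The classes and the `toolset` helper here are hypothetical, chosen purely for illustration:

```ruby
# Hypothetical: each tool carries its own strategy, so choosing a
# tool set chooses both the capability and the behavior.
class SearchEmails
  def self.description
    "Search using the triple strategy: literal, then contextual, then interpretive."
  end
end

class SearchEmailsSimple
  def self.description
    "Search emails by plain keyword match."
  end
end

# Swap strategies by swapping tools; no master-prompt edit required.
def toolset(simple:)
  simple ? [SearchEmailsSimple] : [SearchEmails]
end

toolset(simple: true).map(&:description)  # plain keyword strategy only
toolset(simple: false).map(&:description) # triple strategy only
```

Excluding both tools removes the capability and the strategy together — exactly the coupling the bullet list above describes.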

3. Harness Pattern: Code as Guardrails

Sometimes you WANT the tool to enforce behavior:

```ruby
# ChangeCategory tool in CORA
def execute(email_id:, category_id:, new_category_name:)
  ActiveRecord::Base.transaction do
    # Step 1: Change category
    # Step 2: Create sender rule
    # Step 3: Reclassify brief emails
    # Step 4: Broadcast refresh
  end
end
```

This orchestration is intentional. When a user changes a category, we ALWAYS want a rule created. This is a business invariant, not a decision the agent should make.

Principle: Use code harnesses for business invariants. Use prompts for flexible behavior.

4. Portability

A tool with its behavior in the description is self-contained. You can:

  • Share it between projects
  • Version it independently
  • Test it in isolation

If behavior lives in the master prompt, the tool is incomplete without its corresponding prompt section.
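The "test it in isolation" point follows directly: because the strategy ships with the tool, a test can assert on the behavioral contract without any prompt infrastructure. A minimal sketch, using a hypothetical class and Minitest for illustration:

```ruby
require "minitest/autorun"

# Hypothetical: the tool's behavioral contract is testable on its own,
# with no master prompt involved.
class SearchEmails
  def self.description
    "TRIPLE SEARCH PHILOSOPHY: LITERAL, CONTEXTUAL, INTERPRETIVE"
  end
end

class SearchEmailsDescriptionTest < Minitest::Test
  def test_strategy_travels_with_the_tool
    assert_includes SearchEmails.description, "TRIPLE SEARCH"
  end
end
```

If the strategy lived in the master prompt instead, this test would have to load and parse the prompt to make the same assertion.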


The Deeper Question: What IS a Tool?

Here's where the current guidance breaks down: it conflates the technical mechanism of a tool call with the conceptual role tools play in agent systems.

A tool call is just a function invocation. A "tool" is whatever you wire up to that call.

Consider what happens when Claude Code uses the Task tool to launch a sub-agent:

```javascript
Task({
  subagent_type: "explore",
  prompt: "Find all files that handle authentication"
})
```

On the wire, this is a tool call. The API sees tool_use with name Task and some parameters. But conceptually? You've just spawned an autonomous agent that will run for minutes, make its own decisions, call its own tools, and return a synthesized answer.

Is that "a tool"?

Technically: yes.
Philosophically: it's an agent with agency.

Tools as JIT Context Injection

Think of tool descriptions not as documentation, but as just-in-time context injection.

When the model sees a tool, it doesn't just learn "I can call this function." It absorbs the description as part of its working context. A 200-line tool description is 200 lines of context that:

  • Only loads when that tool is available
  • Can be swapped by swapping the tool
  • Disappears when the tool is removed

This is modular prompting. The "master prompt" becomes smaller because domain-specific guidance lives in the tools that need it.

```
┌─────────────────────────────────────┐
│         Master Prompt               │  ← Core identity, general behavior
│    (small, stable, universal)       │
└─────────────────────────────────────┘
              ↓ tool injection ↓
┌───────────┐ ┌───────────┐ ┌────────────┐
│ SearchTool│ │ WriteTool │ │ AnalyzeTool│  ← Each carries its own context
│ + strategy│ │ + style   │ │ + method   │
└───────────┘ └───────────┘ └────────────┘
```

The current "primitives only" guidance would have all that strategy, style, and method in the master prompt. But that means:

  • You can't swap search strategies by swapping tools
  • Removing a tool doesn't remove its behavioral guidance
  • The master prompt grows with every capability
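The contrast can be sketched as prompt assembly. Everything here — the `MASTER_PROMPT` string, the `Tool` struct, the `system_prompt` helper — is hypothetical, invented only to make the shape concrete:

```ruby
# Hypothetical sketch of modular prompting: the system prompt is the
# small master prompt plus the descriptions of whatever tools are active.
MASTER_PROMPT = "You are CORA, an email assistant.".freeze # core identity only

Tool    = Struct.new(:name, :description)
search  = Tool.new("SearchEmails", "Strategy: literal, then contextual, then interpretive.")
analyze = Tool.new("AnalyzeEmail", "Method: summarize first, then extract action items.")

def system_prompt(active_tools)
  [MASTER_PROMPT, *active_tools.map(&:description)].join("\n\n")
end

# Removing a tool removes its guidance with it; the master prompt never grows.
system_prompt([search])          # search strategy only
system_prompt([search, analyze]) # both strategies, just for this conversation
```

Under "primitives only," both description strings would instead be permanent residents of `MASTER_PROMPT`, present in every conversation whether or not the tools are.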

Sub-Agents Complicate Everything

The real breakdown: sub-agents triggered via tool calls.

```javascript
// This is a tool call...
Task({
  subagent_type: "security-sentinel",
  prompt: "Audit this code for vulnerabilities"
})

// ...that spawns THIS (an agent with its own system prompt, tools, and loop)
```

The security-sentinel agent might run for 5 minutes. It has its own personality, its own tools, its own judgment. It's not a "primitive" in any meaningful sense.

Yet it's invoked via tool_use. The API doesn't distinguish between "read a file" and "spawn an autonomous agent that will make 50 decisions."

Implication: When we say "tools should be primitives," we're ignoring that tool calls are the mechanism for EVERYTHING—including spawning complex, stateful, multi-turn agents.

A More Precise Framework

Instead of "tools should be primitives," consider:

| What you're building | Mechanism | Guidance |
|---|---|---|
| Data access | Tool call | Pure primitive (read/write/list) |
| Domain operation | Tool call | May encode business invariants |
| Behavior guidance | Tool description | JIT context injection — often preferable to master prompt |
| Sub-agent | Tool call | Definitely not a "primitive" — embrace the complexity |
| Workflow | Tool call | OK if it's a business invariant; avoid if it's flexible behavior |

The Real Question to Ask

Don't ask "Is this a primitive?"

Ask: "Who should own this decision?"

| Owner | Implementation |
|---|---|
| Business rule (always this way) | Code in tool execution |
| Tool-specific strategy (usually this way) | Tool description |
| Conversation-dependent (depends on context) | Master prompt |
| Agent judgment (trust the model) | Neither — let it decide |

Proposed Refinement

Replace Binary "Primitives vs Workflows" with a Spectrum

| Pattern | When to Use | Example |
|---|---|---|
| Pure Primitive | Maximum flexibility needed | `read_file`, `write_file` |
| Guided Primitive | Tool + description with strategy | SearchEmails with search philosophy |
| Domain Primitive | Storage/retrieval with domain schema | StoreMemory, CreateCategory |
| Orchestrated Action | Business invariants that must always happen together | ChangeCategory (always creates rule) |
| Sub-Agent Trigger | Complex autonomous task | Task with agent type |
| Workflow Tool | Complex multi-step with decisions | Avoid unless intentionally constraining |

New Guidance

```markdown
### Tools: More Than Just Functions

A tool call is a mechanism. What you wire to it determines its nature.

**Pure primitives** (read, write, list): Maximum flexibility. Agent decides everything.

**Guided primitives** (tool + rich description): The description is JIT context injection.
Use this when behavior is tool-specific and you want it portable with the tool.

**Domain operations** (create_category, change_status): May encode business invariants.
If something must ALWAYS happen together, put it in code.

**Sub-agent triggers** (Task tool, specialized agents): Not primitives at all.
These spawn autonomous agents with their own judgment. Embrace the complexity.

**Decision framework:**
- Who should own this decision?
- Is this always true (code) or context-dependent (prompt)?
- Should this guidance travel with the tool (description) or stay global (master prompt)?
```

CORA Examples

Good: Logic in Tool Description (SearchEmails)

The 213-line description teaches search strategy. This is appropriate because:

  • Strategy is specific to email searching
  • Portable with the tool
  • Reduces master prompt size
  • Can swap for different strategy by swapping tool

Good: Harness Pattern (ChangeCategory)

The 4-step orchestration is appropriate because:

  • Business invariant: a category change must ALWAYS create a sender rule
  • Atomic: all steps must succeed or fail together
  • Not a decision point for the agent

Potentially Problematic: GetGmailLabelStats

Encodes "HIGH_VOLUME" vs "NEEDS_CLEANUP" categorization in code:

```ruby
def inbox_state(count)
  case count
  when 0..100 then "READY"
  when 101..500 then "NEEDS_CLEANUP"
  else "HIGH_VOLUME"
  end
end
```

This IS a decision the agent could make contextually. User with 200 emails who checks daily ≠ user with 200 emails who checks weekly.

Recommendation: Return raw stats, let agent (or prompt) interpret.
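That recommendation might look like the following. This is a hypothetical `inbox_stats` sketched outside CORA's real codebase, purely to show the shape of "raw stats, no verdict":

```ruby
# Hypothetical sketch: return the numbers, not the judgment.
# The agent (or the prompt) decides what counts as "needs cleanup".
def inbox_stats(emails)
  unread = emails.select { |e| e[:unread] }
  {
    total: emails.count,
    unread: unread.count,
    oldest_unread_days: unread.map { |e| e[:age_days] }.max
  }
end

stats = inbox_stats([
  { unread: true,  age_days: 12 },
  { unread: false, age_days: 3  }
])
# => { total: 2, unread: 1, oldest_unread_days: 12 }
```

The daily-checker vs weekly-checker distinction then lives where it belongs — in context the agent already has — rather than in a hard-coded threshold.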


Proposed Changes

  1. Update SKILL.md essential_principles to reflect the spectrum, not binary
  2. Add "What IS a Tool?" section clarifying technical vs conceptual distinction
  3. Add "JIT Context Injection" as a valid pattern for tool descriptions
  4. Acknowledge sub-agents as a special case that breaks the "primitive" model
  5. Add "Decision Framework" based on ownership, not tool structure
  6. Rename principle from "Tools Provide Capability, Not Behavior" to something more nuanced

Discussion

  • Is the JIT context injection framing compelling?
  • Should we acknowledge sub-agents as a distinct category?
  • Are there cases where the current "primitives only" guidance is clearly better?
  • How do we balance simplicity of guidance against accuracy of nuance?
