feat: Tool Discovery & Dynamic Tool Creation #123

@lsm

Description

Purpose

Enable NeoKai to discover, compose, and create new tools dynamically rather than being limited to a fixed set of specialists. This is essential for AGI-level autonomy because:

  • Capability expansion: Handling novel tasks without pre-built solutions
  • Domain adaptation: Creating domain-specific tools on demand
  • Efficiency optimization: Composing optimal tool chains for specific tasks
  • Extensibility: Growing capabilities without code changes

Without dynamic tool creation, NeoKai is limited to pre-defined capabilities and cannot adapt to novel domains.


Current State

NeoKai has:

  • Fixed set of 8 specialists (Coordinator, Coder, Debugger, Tester, Reviewer, VCS, Verifier, Executor)
  • 3 SDK built-in tools (Bash, Read, Edit)
  • Hardcoded tool definitions in prompts
  • No tool composition
  • No dynamic tool creation

The specialist types are defined in packages/daemon/src/lib/agent/specialist-types.ts and cannot be modified at runtime.


Proposed Approach

Phase 1: Tool Registry & Discovery

  1. Tool Registry

    interface Tool {
      id: string;
      name: string;
      description: string;
      
      // Capabilities
      inputSchema: JSONSchema;
      outputSchema: JSONSchema;
      capabilities: string[];
      
      // Usage
      handler: ToolHandler;
      costEstimate: CostEstimate;
      typicalDuration: Duration;
      
      // Metadata
      version: string;
      author: 'builtin' | 'user' | 'generated';
      usageCount: number;
      successRate: number;
    }
    
    interface ToolRegistry {
      // Registration
      register(tool: Tool): void;
      unregister(toolId: string): void;
      
      // Discovery
      findByCapability(capability: string): Tool[];
      findByInputType(type: string): Tool[];
      search(query: string): Tool[];
      
      // Composition
      compose(tools: Tool[], workflow: CompositionWorkflow): ComposedTool;
    }
  2. Capability Taxonomy

    const capabilityTaxonomy = {
      file_operations: ['read', 'write', 'edit', 'search'],
      code_analysis: ['parse', 'lint', 'typecheck', 'complexity'],
      execution: ['run', 'test', 'build', 'deploy'],
      version_control: ['commit', 'branch', 'merge', 'revert'],
      analysis: ['explain', 'summarize', 'compare', 'trace'],
      generation: ['code', 'docs', 'tests', 'configs'],
      communication: ['notify', 'report', 'clarify', 'escalate']
    };
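To make the registry concrete, here is a minimal in-memory sketch. The `InMemoryToolRegistry` name, the simplified `MiniTool` shape, and the `category:verb` capability strings are illustrative assumptions, not part of the proposal:

```typescript
// Simplified Tool shape for the sketch (subset of the full interface above)
interface MiniTool {
  id: string;
  name: string;
  description: string;
  capabilities: string[]; // e.g. "file_operations:read" (assumed encoding)
}

class InMemoryToolRegistry {
  private tools = new Map<string, MiniTool>();

  register(tool: MiniTool): void {
    this.tools.set(tool.id, tool);
  }

  unregister(toolId: string): void {
    this.tools.delete(toolId);
  }

  // Discovery: exact capability match
  findByCapability(capability: string): MiniTool[] {
    return [...this.tools.values()].filter(t => t.capabilities.includes(capability));
  }

  // Discovery: naive substring search over name + description
  search(query: string): MiniTool[] {
    const q = query.toLowerCase();
    return [...this.tools.values()].filter(
      t => t.name.toLowerCase().includes(q) || t.description.toLowerCase().includes(q)
    );
  }
}

const registry = new InMemoryToolRegistry();
registry.register({
  id: "read-file",
  name: "Read File",
  description: "Reads a file from disk",
  capabilities: ["file_operations:read"],
});
registry.register({
  id: "lint",
  name: "Lint",
  description: "Runs the linter over a source tree",
  capabilities: ["code_analysis:lint"],
});

const readers = registry.findByCapability("file_operations:read");
// readers contains only the "read-file" tool
```

A real implementation would likely back this with persistence and fuzzy/semantic search, but the Map-based version is enough to exercise the registration and discovery contract.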

Phase 2: Tool Composition Engine

  1. Composition Patterns

    type CompositionPattern = 
      | 'sequential'    // A → B → C
      | 'parallel'      // A, B, C in parallel
      | 'conditional'   // if X then A else B
      | 'loop'          // repeat A while condition
      | 'fan_out'       // A → [B, C, D]
      | 'fan_in';       // [A, B, C] → D
    
    interface ComposedTool extends Tool {
      composition: {
        pattern: CompositionPattern;
        components: Tool[];
        dataFlow: DataFlowEdge[];
        errorHandling: ErrorHandler[];
      };
    }
  2. Composition Builder

    interface CompositionBuilder {
      // Build from natural language description
      fromDescription(description: string): Promise<ComposedTool>;
      
      // Suggest compositions for a task
      suggestForTask(task: Task): Promise<ComposedTool[]>;
      
      // Validate composition
      validate(composition: ComposedTool): ValidationResult;
    }
  3. Example Compositions

    const exampleCompositions = {
      // "Analyze and fix"
      analyzeAndFix: {
        pattern: 'sequential',
        components: ['analyzer', 'planner', 'coder', 'tester'],
        dataFlow: [
          { from: 'analyzer.output', to: 'planner.input' },
          { from: 'planner.output', to: 'coder.input' },
          { from: 'coder.output', to: 'tester.input' }
        ]
      },
      
      // "Comprehensive review"
      comprehensiveReview: {
        pattern: 'fan_in',
        components: ['linter', 'typeChecker', 'securityScanner', 'complexityAnalyzer'],
        dataFlow: [
          { from: 'all.outputs', to: 'aggregator.input' }
        ]
      }
    };
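The 'sequential' pattern above can be sketched as a small executor that threads each component's output into the next component's input. Handlers here are plain async functions standing in for the richer ToolHandler type; the analyzer/planner components are toy stand-ins:

```typescript
type Handler = (input: unknown) => Promise<unknown>;

// Execute a 'sequential' composition: A → B → C, where each dataFlow edge
// connects the previous step's output to the next step's input.
async function runSequential(components: Handler[], initialInput: unknown): Promise<unknown> {
  let data = initialInput;
  for (const step of components) {
    data = await step(data);
  }
  return data;
}

// Toy components standing in for analyzer → planner
const analyzer: Handler = async (src) => `findings(${src})`;
const planner: Handler = async (findings) => `plan(${findings})`;

const result = runSequential([analyzer, planner], "main.ts");
// result resolves to "plan(findings(main.ts))"
```

The other patterns follow the same shape: 'parallel' and 'fan_out' map to `Promise.all` over the components, 'fan_in' adds an aggregation step, and 'conditional'/'loop' wrap component execution in ordinary control flow.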

Phase 3: Dynamic Tool Generation

  1. Tool Generation Framework

    interface ToolGenerator {
      // Generate tool from specification
      generate(spec: ToolSpecification): Promise<Tool>;
      
      // Generate tool from examples
      fromExamples(examples: InputOutputExample[]): Promise<Tool>;
      
      // Generate tool from natural language
      fromDescription(description: string): Promise<Tool>;
    }
    
    interface ToolSpecification {
      name: string;
      description: string;
      inputSchema: JSONSchema;
      outputSchema: JSONSchema;
      behavior: string;  // Natural language description
      constraints: string[];
      examples: InputOutputExample[];
    }
  2. Generated Tool Implementation

    interface GeneratedTool extends Tool {
      generation: {
        prompt: string;  // The prompt that implements this tool
        model: string;   // Model to use
        examples: InputOutputExample[];
      };
      
      // Handler is LLM-based, e.g.:
      // handler: async (input) => llm.execute(generatedTool.generation.prompt, input)
    }
  3. Tool Validation

    interface ToolValidator {
      // Validate generated tool
      validate(tool: GeneratedTool): Promise<ValidationResult>;
      
      // Test with provided examples
      testExamples(tool: GeneratedTool): Promise<TestResult[]>;
      
      // Check for safety issues
      safetyCheck(tool: GeneratedTool): Promise<SafetyReport>;
    }
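The `testExamples` step can be sketched as a loop that runs the tool's handler over each provided example and compares against the expected output. The shapes below (string-to-string handler, shallow equality) are simplifying assumptions; a real validator would need deep equality over structured outputs plus the separate safety check:

```typescript
interface IOExample {
  input: string;
  expectedOutput: string;
}

// Run the handler over each example and tally passes/failures
async function testExamples(
  handler: (input: string) => Promise<string>,
  examples: IOExample[]
): Promise<{ passed: number; failed: number }> {
  let passed = 0;
  let failed = 0;
  for (const ex of examples) {
    const actual = await handler(ex.input);
    if (actual === ex.expectedOutput) passed++;
    else failed++;
  }
  return { passed, failed };
}

// A trivial "generated" tool that upper-cases its input
const upper = async (s: string) => s.toUpperCase();

const validation = testExamples(upper, [
  { input: "ok", expectedOutput: "OK" },
  { input: "no", expectedOutput: "no" }, // deliberately failing example
]);
// validation resolves to { passed: 1, failed: 1 }
```

Because generated handlers are LLM-backed and therefore nondeterministic, a production validator would probably want retries or multiple samples per example rather than a single pass/fail run.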

Phase 4: Specialist Evolution

  1. Dynamic Specialist Creation

    interface SpecialistGenerator {
      // Create specialist for a domain
      createForDomain(domain: string): Promise<SpecialistType>;
      
      // Create specialist from tool set
      fromTools(tools: Tool[], role: string): Promise<SpecialistType>;
    }
    
    interface DynamicSpecialist extends SpecialistType {
      dynamic: true;
      createdAt: Date;
      createdBy: 'user' | 'auto_generated';
      performance: PerformanceMetrics;
      retirementCriteria: RetirementCriteria;
    }
  2. Specialist Lifecycle

    interface SpecialistLifecycle {
      // Monitor specialist performance
      monitor(specialist: DynamicSpecialist): PerformanceMetrics;
      
      // Decide if specialist should evolve
      shouldEvolve(metrics: PerformanceMetrics): boolean;
      
      // Evolve specialist based on learnings
      evolve(specialist: DynamicSpecialist): Promise<DynamicSpecialist>;
      
      // Retire underperforming specialist
      retire(specialist: DynamicSpecialist): void;
    }
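One way the `shouldEvolve`/`retire` decision could work is a simple threshold policy over the performance metrics. The thresholds and the three-way outcome below are assumptions for illustration, not values from the proposal:

```typescript
interface Metrics {
  successRate: number; // 0..1
  usageCount: number;
}

// Illustrative lifecycle policy: keep new specialists until there is enough
// data, retire consistent failures, evolve the merely mediocre.
function lifecycleDecision(m: Metrics): "keep" | "evolve" | "retire" {
  if (m.usageCount < 10) return "keep";     // not enough data yet (assumed threshold)
  if (m.successRate < 0.5) return "retire"; // consistently failing
  if (m.successRate < 0.8) return "evolve"; // workable but improvable
  return "keep";
}
```

Hysteresis (e.g. requiring several consecutive bad windows before retirement) would likely be needed in practice so a single bad task doesn't kill an otherwise useful specialist.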

Phase 5: MCP Integration for External Tools

  1. MCP Tool Discovery

    interface MCPToolDiscovery {
      // Discover tools from connected MCP servers
      discover(): Promise<MCPTool[]>;
      
      // Register MCP tools in registry
      registerMCPTools(tools: MCPTool[]): void;
      
      // Sync with MCP server
      sync(serverId: string): Promise<SyncResult>;
    }
  2. MCP Tool Wrapper

    interface MCPToolWrapper {
      // Wrap MCP tool for internal use
      wrap(mcpTool: MCPTool): Tool;
      
      // Handle MCP-specific features
      handleMCPSpecific(tool: Tool, input: any): Promise<any>;
    }
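The wrapper can be sketched as a pure adapter from a discovered MCP tool into the internal Tool shape. The `MCPToolInfo` fields below (name, description, a `call` function standing in for the server round-trip) and the `mcp:server:name` ID scheme are assumptions:

```typescript
interface MCPToolInfo {
  name: string;
  description: string;
  call: (args: unknown) => Promise<unknown>; // stands in for the MCP round-trip
}

interface WrappedTool {
  id: string;
  name: string;
  description: string;
  author: "builtin" | "user" | "generated";
  handler: (input: unknown) => Promise<unknown>;
}

// Adapt an MCP-discovered tool so it can live in the internal registry
function wrapMCPTool(serverId: string, mcp: MCPToolInfo): WrappedTool {
  return {
    // Namespaced ID to avoid collisions with built-in or generated tools
    id: `mcp:${serverId}:${mcp.name}`,
    name: mcp.name,
    description: mcp.description,
    author: "user", // treating external tools as user-provided is an assumption
    handler: (input) => mcp.call(input),
  };
}

const wrapped = wrapMCPTool("docs-server", {
  name: "fetch_page",
  description: "Fetches a documentation page",
  call: async (args) => args, // echo stub for the sketch
});
// wrapped.id === "mcp:docs-server:fetch_page"
```

Namespacing the ID also gives a natural answer to the built-in-vs-external conflict question below: same name, different ID, and the Coordinator can prefer one namespace over another.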

Phase 6: Tool Effectiveness Tracking

  1. Usage Analytics

    interface ToolAnalytics {
      // Track tool usage
      recordUsage(toolId: string, usage: ToolUsage): void;
      
      // Analyze effectiveness
      analyzeEffectiveness(toolId: string): EffectivenessReport;
      
      // Suggest improvements
      suggestImprovements(toolId: string): Improvement[];
    }
    
    interface EffectivenessReport {
      successRate: number;
      avgDuration: number;
      avgCost: number;
      commonErrors: Error[];
      userSatisfaction: number;
    }
  2. Tool Optimization

    interface ToolOptimizer {
      // Optimize tool based on usage
      optimize(tool: Tool): Promise<Tool>;
      
      // Suggest better alternatives
      suggestAlternatives(tool: Tool): Promise<Tool[]>;
    }
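Most of `EffectivenessReport` is a straightforward aggregation over raw usage records. A sketch, assuming `ToolUsage` carries per-invocation `success`, `durationMs`, and `costUsd` fields (names are assumptions):

```typescript
interface Usage {
  success: boolean;
  durationMs: number;
  costUsd: number;
}

// Aggregate raw usage records into the numeric parts of an EffectivenessReport
function analyzeEffectiveness(records: Usage[]) {
  const n = records.length;
  const successes = records.filter(r => r.success).length;
  return {
    successRate: n ? successes / n : 0,
    avgDuration: n ? records.reduce((sum, r) => sum + r.durationMs, 0) / n : 0,
    avgCost: n ? records.reduce((sum, r) => sum + r.costUsd, 0) / n : 0,
  };
}

const report = analyzeEffectiveness([
  { success: true, durationMs: 100, costUsd: 0.02 },
  { success: false, durationMs: 300, costUsd: 0.04 },
]);
// report.successRate === 0.5, report.avgDuration === 200
```

`commonErrors` and `userSatisfaction` need separate pipelines (error clustering, explicit feedback), which is why they are not shown here.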

Technical Considerations

Safety & Security

  • Preventing malicious tool generation
  • Sandboxing generated tools
  • Input/output validation for generated tools

Quality Assurance

  • Ensuring generated tools work correctly
  • Testing generated tools thoroughly
  • Versioning and rollback for generated tools

Performance

  • Caching generated tool implementations
  • Optimizing composed tool execution
  • Avoiding tool explosion (too many tools)

Discoverability

  • Helping the Coordinator find the right tool
  • Avoiding confusion between similar tools
  • Tool naming conventions

Success Metrics

  1. Tool Utility: % of generated tools that are actually useful
  2. Coverage Expansion: Tasks solvable before vs after dynamic tools
  3. Composition Efficiency: Performance of composed vs single tools
  4. Adoption Rate: Usage of generated vs built-in tools

Implementation Roadmap

  1. Phase 1: Tool registry and discovery
  2. Phase 2: Tool composition engine
  3. Phase 3: Basic tool generation
  4. Phase 4: Dynamic specialist creation
  5. Phase 5: MCP integration
  6. Phase 6: Analytics and optimization

Questions for Discussion

  1. What safeguards should be in place for generated tools?
  2. How do we prevent tool proliferation (many near-duplicate tools)?
  3. Should users be able to curate/delete generated tools?
  4. How do we handle conflicts between generated and built-in tools?

Part of the AGI-Level Autonomy initiative
