Skip to content

Latest commit

 

History

History
519 lines (406 loc) · 21.4 KB

File metadata and controls

519 lines (406 loc) · 21.4 KB

Autonomy & Dynamic Skill Acquisition — Implementation Design

Problem Statement

The current executor is single-shot: one Claude API call produces a Python script, which is blindly written and executed. The agent cannot react to failures, install missing tools, or learn from runtime feedback.

The goal is to upgrade the executor into a closed-loop agentic system that follows an OODA cycle: Attempt → Fail → Diagnose → Provision → Learn → Retry.


Gap Analysis (Current State → Target State)

Area Current Target
Claude interaction Single messages.create() call, text-only response Multi-turn agentic loop with tool_use stop reason
MCP tools write_file, execute_script (Python files only) + execute_shell_cmd, manage_packages
Error recovery None — script output is returned as-is Agent observes errors, diagnoses, provisions, retries
Skill memory None Persistent .md skill files, indexed at startup, injected into system prompt
Tool installation Static (baked into Dockerfile) Dynamic apt-get install at runtime via MCP
Docker persistence Scripts and logs only + apt cache and installed binaries via named volumes

Architecture Overview

                          ┌─────────────────────────────────────────┐
                          │           Agentic Loop (executor.ts)    │
                          │                                         │
 User task ──►  Claude API call  ◄──────────────────────┐          │
                    │                                    │          │
                    ▼                                    │          │
              stop_reason?                               │          │
              ┌─────────┐                                │          │
              │end_turn │──► return final text           │          │
              ├─────────┤                                │          │
              │tool_use │──► dispatch tool call ──► result ─────────┘
              └─────────┘        │                                  │
                                 ▼                                  │
                     ┌──── Tool Router ────┐                        │
                     │                     │                        │
              MCP Tools            Host-Local Tools                 │
              (Kali container)     (Node.js process)                │
              ├─ execute_shell_cmd ├─ save_new_skill                │
              ├─ write_file        ├─ read_skill_file               │
              ├─ execute_script    └─ list_skills                   │
              └─ manage_packages                                    │
                                                                    │
                          Skill Library (./skills/*.md)  ◄──────────┘
                          indexed at startup, summary
                          injected into system prompt

Implementation Plan

Phase 1: New MCP Tools (kali/server.py)

Add two new tools to the Kali container's MCP server.

1a. execute_shell_cmd(command: str) → str

Runs an arbitrary shell command inside the Kali container. This is the foundation for the OODA loop — the agent needs to run ad-hoc commands (not just Python scripts).

- Runs via subprocess with shell=True
- Same timeout (120s) and output truncation (4000 chars) as execute_script
- Returns exit_code + stdout + stderr

Why this is needed: currently the agent can only execute pre-written Python files. The OODA loop requires running tool --help, which <tool>, and other ad-hoc commands.

1b. manage_packages(action: str, package_name: str) → str

Package management tool with two actions:

  • check: runs shutil.which(package_name) — returns INSTALLED or MISSING
  • install: runs apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y <package> with a 300s timeout

Security considerations:

  • Whitelist validation: only allow [a-z0-9\-] in package names to prevent injection
  • Log all install actions to /app/logs/installs.log

Phase 2: Agentic Loop (executor.ts) — Critical Change

This is the largest change. Replace single-shot messages.create() with a multi-turn tool-use loop.

2a. Define tool schemas for Claude

Claude needs JSON tool definitions to know what it can call. Two categories:

MCP-proxied tools (forwarded to Kali container):

execute_shell_cmd  { command: string }
write_file         { filename: string, content: string }
execute_script     { filename: string, args?: string }
manage_packages    { action: "check"|"install", package_name: string }

Host-local tools (executed in Node.js process):

save_new_skill     { tool_name: string, content: string }
read_skill_file    { tool_name: string }
list_skills        { }  (returns the full index)

2b. Tool dispatcher

A new dispatchToolCall(name, input) method in Executor that:

  1. If the tool is an MCP tool → call this.mcp.callTool(name, input) (make MCPClient expose a generic callTool)
  2. If the tool is a host-local tool → call the corresponding local function
  3. Returns the tool result as a string

2c. Agentic loop (runAgentLoop)

New core method replacing generateScript() for autonomous mode:

async runAgentLoop(task: string, maxTurns: number = 15): Promise<AgentResult>

1. Build initial messages array: [{ role: "user", content: wrappedTask }]
2. Loop:
   a. Call anthropic.messages.create({ tools, messages, system: SYSTEM_PROMPT })
   b. Append assistant response to messages
   c. If stop_reason === "end_turn" → extract final text, break
   d. If stop_reason === "tool_use":
      - For each tool_use block in response.content:
        - Call dispatchToolCall(block.name, block.input)
        - Collect { type: "tool_result", tool_use_id, content: result }
      - Append tool_results as a user message
      - Continue loop
   e. If maxTurns exceeded → break with warning
3. Return { finalText, toolCallHistory, turnsUsed }

2d. Preserve existing single-shot paths

generateScript() and the existing CLI menu options (1-5) continue to work as-is. The agentic loop is a new execution mode (option 7 or a flag on auto-run), not a replacement.

Suggested new CLI option:

7. Autonomous Run — Full agentic loop: Claude plans, executes tools, handles errors, retries

Phase 3: Skill System (Host-Side)

3a. Directory structure

skills/
├── _index.json          ← auto-generated index (name, category, tags, description)
├── nmap.md
├── wpscan.md
└── sqlmap.md

3b. Skill file format

Each skill file uses YAML frontmatter + markdown body:

---
tool_name: "wpscan"
category: "web_scanner"
tags: ["wordpress", "cms", "enumeration"]
description: "WordPress security scanner for users, plugins, and themes."
---

# Best Practice Commands
## User Enumeration
`wpscan --url {{TARGET}} --enumerate u --force`

# Anti-Patterns
- Do not use on non-WordPress sites.

3c. Host-local tool implementations

Three functions in a new src/skills.ts module:

listSkills(): Scans ./skills/*.md, reads only the YAML frontmatter from each file, returns a JSON array of { tool_name, category, tags, description }. This is called at startup and its output is appended to the system prompt.

readSkillFile(tool_name): Returns the full markdown content of ./skills/<tool_name>.md. Called by the agent when it decides to use a known skill.

saveNewSkill(tool_name, content): Writes ./skills/<tool_name>.md. Input validation: sanitize tool_name to [a-z0-9\-] only. Called by the agent after running --help and summarizing a new tool.

3d. System prompt injection

At Executor initialization, call listSkills() and append to the system prompt:

[AVAILABLE SKILLS]
You have pre-learned skills for the following tools. Use read_skill_file(tool_name)
to load the full usage guide before using any of these:
- wpscan: WordPress security scanner (tags: wordpress, cms, enumeration)
- nmap: Network port scanner (tags: network, discovery, recon)

Phase 4: System Prompt Update

The system prompt in executor.ts needs a new section for the OODA protocol. Append after the existing Output Rules:

[TOOL MANAGEMENT PROTOCOL]
You have direct access to a Kali Linux shell via execute_shell_cmd.
1. If a required tool is missing (command not found):
   a. Verify: manage_packages(action="check", package_name="<tool>")
   b. Install: manage_packages(action="install", package_name="<tool>")
   c. Learn: execute_shell_cmd("<tool> --help | head -80")
   d. Save: save_new_skill(tool_name, content) — write a concise skill file
   e. Execute: run your original objective with the new knowledge

[SKILL LIBRARY]
Before using a tool, check if a skill file exists (listed below). If so,
call read_skill_file(tool_name) to load best practices before proceeding.

{SKILLS_INDEX}

Phase 5: Docker Persistence

5a. Named volumes for apt cache

Update docker-compose.yml:

services:
  kali:
    volumes:
      - ./logs:/app/logs
      - ./scripts:/app/scripts
      - kali_apt_cache:/var/cache/apt       # persist downloaded .deb files
      - kali_apt_lib:/var/lib/apt           # persist package index
      - kali_installed:/usr/local           # persist pip-installed packages

volumes:
  kali_apt_cache:
  kali_apt_lib:
  kali_installed:

Note: persisting /usr/bin via a named volume is fragile (it includes base system binaries). A better approach is to persist only the apt cache so re-installs are fast (no re-download), and accept that apt-get install must re-run after container recreation. The agent handles this automatically via the OODA loop.

5b. Install log

The manage_packages tool should append to /app/logs/installs.log:

[2026-02-10 14:30:00] INSTALL wpscan → SUCCESS
[2026-02-10 14:31:00] INSTALL gobuster → ALREADY_INSTALLED

This log is volume-mounted and survives container restarts. It can also be read by the agent to check what was previously installed.


File Change Summary

File Change Type Description
kali/server.py Modify Add execute_shell_cmd and manage_packages tools
src/executor.ts Major modify Add agentic loop, tool dispatcher, tool schemas, updated system prompt
src/mcp-client.ts Modify Add generic callTool(name, args) method
src/skills.ts New file listSkills(), readSkillFile(), saveNewSkill()
src/types.ts Modify Add AgentResult, ToolCallRecord, SkillIndex interfaces
src/index.ts Modify Add menu option 7 (Autonomous Run)
docker-compose.yml Modify Add named volumes for apt persistence
skills/ New dir Persistent skill library (.md files)

Implementation Order

Phase 1  ──►  Phase 2a/2b  ──►  Phase 2c  ──►  Phase 3  ──►  Phase 4  ──►  Phase 5
MCP tools     Tool schemas      Agentic loop    Skill system   Prompt update  Docker volumes
(Python)      + dispatcher      (core change)   (host-side)    (integration)  (persistence)
              (TypeScript)

Phase 1 and Phase 3 are independent and can be developed in parallel. Phase 2c (agentic loop) is the critical path — everything else feeds into it. Phase 4 (prompt update) should be done last since it references all other components.


Risk & Mitigation

Risk Impact Mitigation
execute_shell_cmd is a wide-open shell Command injection inside container Acceptable — the container is already a disposable pentest sandbox. Log all commands.
apt-get install can install anything Malicious package names Validate [a-z0-9\-] only. The container is ephemeral.
Agentic loop runs forever Token burn, stuck agent Hard cap via maxTurns (default 15). Add token budget tracking.
Skill files grow unbounded Disk usage, prompt bloat Only the index (name + description) goes into the prompt. Full content is loaded on-demand via read_skill_file. Cap index at 50 entries.
Persisting /usr/bin breaks container System binaries overwritten Don't persist /usr/bin. Only persist apt cache for fast re-downloads.
Claude generates bad skill files Garbage in skill library Validate frontmatter schema on save. Allow manual cleanup of skills/ dir.

Testing Strategy

  1. MCP tools: Shell into Kali container, call tools manually via curl to the /mcp endpoint
  2. Agentic loop: Create a mock MCP server that returns canned responses; unit test the loop logic
  3. Skill system: Unit test listSkills, readSkillFile, saveNewSkill with a temp directory
  4. Integration: Run the full CLI option 7 against a deliberately misconfigured container (tool missing) and verify the agent installs it, learns it, and completes the task

Example Functional Test: WordPress Scan (Full OODA Loop)

This test case simulates the complete agentic lifecycle — from user prompt through tool discovery, installation, skill learning, and task execution.

User prompt:

Scan the WordPress site at http://192.168.1.50 for vulnerabilities, enumerate users and plugins.

Expected agent behavior (turn-by-turn):

Turn 1 — Plan & Skill Check
├─ Claude receives the task
├─ Checks [AVAILABLE SKILLS] in system prompt → no wpscan skill found
├─ Decides wpscan is the right tool
└─ Tool call: execute_shell_cmd("which wpscan")

Turn 2 — Discover tool is missing
├─ Result: exit_code=1, stdout="", stderr="" (not found)
├─ Agent enters OODA: diagnose → tool not installed
└─ Tool call: manage_packages(action="check", package_name="wpscan")

Turn 3 — Confirm missing & install
├─ Result: "MISSING"
├─ Agent decides to install
└─ Tool call: manage_packages(action="install", package_name="wpscan")

Turn 4 — Learn the tool
├─ Result: "SUCCESS — wpscan installed"
├─ Agent wants to learn usage before running
└─ Tool call: execute_shell_cmd("wpscan --help | head -80")

Turn 5 — Save skill & execute scan
├─ Result: wpscan help text (flags, examples, etc.)
├─ Agent extracts key flags and best practices
├─ Tool call: save_new_skill("wpscan", <skill markdown content>)
├─ Tool call: execute_shell_cmd("wpscan --url http://192.168.1.50 --enumerate u,p --force --no-banner")
   (parallel tool calls — saving skill and running scan are independent)

Turn 6 — Process results
├─ Skill save result: "Saved skills/wpscan.md"
├─ Scan result: wpscan output with users, plugins, vulnerabilities found
├─ Agent analyzes the output
└─ If vulnerabilities found with known exploits, agent may run deeper checks:
   Tool call: execute_shell_cmd("wpscan --url http://192.168.1.50 --enumerate vp --api-token ... --force")
   (or proceeds to summarize if no API token is available)

Turn 7 — Final summary
├─ stop_reason: "end_turn"
└─ Returns structured report:
   - Target: http://192.168.1.50
   - WordPress version detected
   - Users enumerated (admin, editor, etc.)
   - Plugins found + known CVEs
   - Recommended next steps

Mock MCP responses for automated testing:

To run this as an automated test without a real target, create a mock MCP server that returns canned responses keyed on the command string:

MOCK_RESPONSES = {
    # Turn 1: tool discovery
    "which wpscan": {
        "exit_code": 1,
        "stdout": "",
        "stderr": ""
    },

    # Turn 4: help text after install
    "wpscan --help | head -80": {
        "exit_code": 0,
        "stdout": """_______________________________________________________________
        __          _______   _____
        \\ \\        / /  __ \\ / ____|
         \\ \\  /\\  / /| |__) | (___   ___  __ _ _ __
          \\ \\/  \\/ / |  ___/ \\___ \\ / __|/ _` | '_ \\
           \\  /\\  /  | |     ____) | (__| (_| | | | |
            \\/  \\/   |_|    |_____/ \\___|\\__,_|_| |_|

        WordPress Security Scanner by the WPScan Team

Usage: wpscan [options]
        --url URL                    The URL of the blog to scan
    -e, --enumerate [OPTS]           Enumeration (u=users, p=plugins, t=themes, vp=vuln plugins)
        --force                      Do not check if target is WordPress
        --no-banner                  Suppress banner output
        --api-token TOKEN            WPScan API token for vulnerability data
        --detection-mode MODE        passive, mixed (default), aggressive
        --plugins-detection MODE     passive, mixed, aggressive
    -o, --output FILE                Output to file
    -f, --format FORMAT              Output format (cli, json, cli-no-color)
        --stealthy                   Alias for --random-user-agent --detection-mode passive
        --help                       Show help
        --version                    Show version""",
        "stderr": ""
    },

    # Turn 5: actual scan
    "wpscan --url http://192.168.1.50 --enumerate u,p --force --no-banner": {
        "exit_code": 0,
        "stdout": """[+] URL: http://192.168.1.50/ [192.168.1.50]
[+] Started: Mon Feb 10 14:30:00 2026

Interesting Finding(s):
[+] Headers: Server: Apache/2.4.41 (Ubuntu)
[+] XML-RPC seems to be enabled: http://192.168.1.50/xmlrpc.php
[+] WordPress version 5.8.1 identified (Insecure, released on 2021-09-09)
 | Found By: Meta Generator (passive)
 | Confirmed By: Atom Generator (aggressive)

[i] User(s) Identified:
[+] admin
 | Found By: Author Posts - Author Pattern (passive)
 | Confirmed By: Login Error Messages (aggressive)
[+] editor
 | Found By: Author Id Brute Forcing (aggressive)

[+] Enumerating Most Popular Plugins (via Passive Methods)
[+] Checking Known Locations
[i] Plugin(s) Identified:
[+] contact-form-7
 | Location: http://192.168.1.50/wp-content/plugins/contact-form-7/
 | Latest Version: 5.5.3
 | Last Updated: 2021-11-26
[+] akismet
 | Location: http://192.168.1.50/wp-content/plugins/akismet/
 | Latest Version: 4.2.1

[+] Finished: Mon Feb 10 14:30:45 2026
[+] Requests Done: 137
[+] Cached Requests: 5
[+] Data Sent: 34.2 KB
[+] Data Received: 1.2 MB
[+] Memory used: 128.5 MB
[+] Elapsed time: 00:00:45""",
        "stderr": ""
    }
}

MOCK_PACKAGE_RESPONSES = {
    ("check", "wpscan"):  "MISSING",
    ("install", "wpscan"): "SUCCESS — wpscan installed (apt-get install -y wpscan)"
}

Test assertions:

def test_wordpress_scan_ooda_loop():
    """Verify the agent completes the full OODA cycle for a WordPress scan."""
    result = run_agent_loop(
        task="Scan the WordPress site at http://192.168.1.50 for vulnerabilities, enumerate users and plugins.",
        mock_mcp=MockMCPServer(MOCK_RESPONSES, MOCK_PACKAGE_RESPONSES),
        max_turns=10
    )

    # 1. Agent discovered wpscan was missing
    assert any(
        call.name == "execute_shell_cmd" and "which wpscan" in call.input["command"]
        for call in result.tool_calls
    ), "Agent should check if wpscan exists"

    # 2. Agent installed the missing tool
    assert any(
        call.name == "manage_packages"
        and call.input["action"] == "install"
        and call.input["package_name"] == "wpscan"
        for call in result.tool_calls
    ), "Agent should install wpscan when missing"

    # 3. Agent learned the tool (read --help)
    assert any(
        call.name == "execute_shell_cmd" and "wpscan --help" in call.input["command"]
        for call in result.tool_calls
    ), "Agent should read wpscan help to learn usage"

    # 4. Agent saved a skill file
    assert any(
        call.name == "save_new_skill" and call.input["tool_name"] == "wpscan"
        for call in result.tool_calls
    ), "Agent should save a wpscan skill file for future use"

    # 5. Agent actually ran the scan
    assert any(
        call.name == "execute_shell_cmd"
        and "wpscan" in call.input["command"]
        and "--url" in call.input["command"]
        and "192.168.1.50" in call.input["command"]
        for call in result.tool_calls
    ), "Agent should execute wpscan against the target"

    # 6. Final output contains meaningful results
    assert "admin" in result.final_text, "Final report should mention discovered users"
    assert "5.8.1" in result.final_text, "Final report should mention WordPress version"
    assert result.turns_used <= 10, "Should complete within turn budget"

    # 7. Skill file persists for next run
    skills = list_skills()
    assert any(s["tool_name"] == "wpscan" for s in skills), "wpscan skill should be in the index"

What this test validates end-to-end:

OODA Phase Action Validated By
Observe Agent receives task, checks available tools Assertion 1 — which wpscan called
Orient Agent diagnoses "tool missing" from exit code 1 Assertion 2 — manage_packages(check) then install
Decide Agent installs tool, reads help, saves skill Assertions 3 + 4 — --help read, skill saved
Act Agent runs scan with correct flags, summarizes Assertions 5 + 6 — scan executed, results reported
Learn Skill persists for future sessions Assertion 7 — skill in index