pentest-executor/plan.md at main · cyberdiagram/pentest-executor

This project is a Proof of Concept (PoC) designed to validate the feasibility of an AI-driven execution environment. The architecture involves running a Kali Linux environment within a Docker container on a host machine. An MCP (Model Context Protocol) server will be established inside the Kali container to communicate with the host.

The host machine will run executor.ts as an agent that exposes two capabilities: write_file and execute_script. When an LLM generates a PoC script, the executor.ts agent sends it to the MCP server in the Kali container. The server writes the script to a file, runs it with Python inside Kali, and returns the results to the agent.
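The two capabilities described above could look roughly like the following on the Kali side. This is a minimal sketch, not the project's actual server.py: the working directory, the function signatures, and the shape of the returned dictionary are all assumptions, and the MCP wiring that would expose these functions as tools is left out on purpose.

```python
import subprocess
import sys
from pathlib import Path
from typing import Optional

# Hypothetical working directory inside the Kali container.
WORKDIR = Path("/tmp/poc_scripts")


def write_file(filename: str, content: str, workdir: Path = WORKDIR) -> Path:
    """Write script content under workdir and return the resulting path."""
    workdir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a name such as "../../x.py"
    # cannot escape the working directory.
    target = workdir / Path(filename).name
    target.write_text(content, encoding="utf-8")
    return target


def execute_script(path: Path, args: Optional[list] = None, timeout: int = 60) -> dict:
    """Run a Python script and return its exit code, stdout, and stderr."""
    cmd = [sys.executable, str(path), *(args or [])]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Report the timeout instead of letting the exception reach the caller.
        return {"exit_code": None, "stdout": "", "stderr": f"timed out after {timeout}s"}
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
```

Passing the command as a list (rather than a shell string) avoids shell interpretation of the script path and its arguments, and the timeout keeps a hanging PoC from blocking the server indefinitely.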

Please complete the following tasks:

1. Project Structure

Create the complete project architecture using a combination of TypeScript and Python.

2. Dockerfile Configuration

Create the Dockerfile for the Kali environment, keeping the following requirements in mind:

2.1 File Persistence: Docker containers are ephemeral. If a tool like Nmap generates an HTML report inside the container, the report is lost once the container is removed.

Solution: Use volume mounting (e.g., docker run -v $(pwd)/logs:/app/logs ...) so that reports are written directly to the host's filesystem.

2.2 Network Mode: Default Docker network isolation may cause network scans of the host's local network to fail.

Solution: Use --network=host (for Linux) or configure a specific network bridge in the Docker startup parameters.
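To make the two startup options from 2.1 and 2.2 concrete, here is a small sketch that assembles the docker run command as an argument list. The helper name build_docker_run and the image tag kali-mcp are hypothetical; only the -v and --network=host flags come from the plan above.

```python
import shlex
from pathlib import Path


def build_docker_run(image: str, logs_dir: Path, host_network: bool = True) -> list:
    """Return the argv for `docker run` with a log volume and optional host networking."""
    # Mount the host log directory at /app/logs so reports outlive the container.
    cmd = ["docker", "run", "-v", f"{logs_dir.resolve()}:/app/logs"]
    if host_network:
        # Share the host's network stack (the plan notes this is for Linux hosts).
        cmd.append("--network=host")
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # "kali-mcp" is a hypothetical image tag used only for illustration.
    print(shlex.join(build_docker_run("kali-mcp", Path("logs"))))
```

Building the command as a list means it can be handed to subprocess.run without shell quoting concerns; shlex.join is used only to print a copy-pasteable form.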

2.3 Token Truncation (Output Limiting): Tools like Nmap or Gobuster can generate massive outputs (e.g., 10MB of text). The container must never send the entire stdout back to the LLM.

Solution: Implement truncation logic in server.py (the MCP server inside Kali): if len(output) > 2000: return output[:2000] + "...(truncated)". Note that this caps the output at 2,000 characters, which only approximates a token limit.
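The truncation rule above can be written as a small helper. This is a sketch under the plan's own numbers (2000 characters and the "...(truncated)" marker); the function and constant names are assumptions, not taken from the repository.

```python
MAX_OUTPUT_CHARS = 2000              # limit stated in the plan
TRUNCATION_MARKER = "...(truncated)"  # marker stated in the plan


def truncate_output(output: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Return output unchanged if it fits, else its first `limit` characters plus a marker."""
    if len(output) > limit:
        return output[:limit] + TRUNCATION_MARKER
    return output
```

The helper only bounds what is sent back to the LLM. The full output is still held in memory before it is cut, so a very chatty tool may additionally need its output redirected to a file under the mounted log directory.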

3. Case Simulation

Simulate a scenario by sending a prompt to the LLM to construct a payload for a PoC script. Once the LLM returns the content:

Send the script to the MCP server inside Kali to perform write_file.

Return instructions to the host on how to invoke the script (required parameters, etc.).

Implement a function in executor.ts that allows me to trigger execute_script through an interaction interface.

4. Automated Execution

Implement a function within the executor that enables the agent to automatically execute the entire workflow described above and return the final results for human review.
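The control flow of the automated run can be sketched as below. The real executor is planned as executor.ts, so this Python version only illustrates the order of the steps; run_workflow, RunReport, the fixed file name poc.py, and the three injected callables are all hypothetical stand-ins for the LLM call and the MCP tools.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunReport:
    """Everything a human reviewer needs to audit one automated run."""
    prompt: str
    script: str
    path: str
    output: str


def run_workflow(
    prompt: str,
    generate_script: Callable[[str], str],   # stand-in for the LLM call
    write_file: Callable[[str, str], str],   # stand-in for the MCP write_file tool
    execute_script: Callable[[str], str],    # stand-in for the MCP execute_script tool
) -> RunReport:
    """Generate a PoC script, store it, run it, and collect the results."""
    script = generate_script(prompt)
    path = write_file("poc.py", script)
    output = execute_script(path)
    return RunReport(prompt=prompt, script=script, path=path, output=output)
```

Injecting the three steps as callables keeps the orchestration testable without a container or an LLM: stubs can replace each step, and the returned report carries the prompt, script, and output together for review.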
