This guide provides an in-depth look at PentestGPT's architecture, including design patterns, data flow, component interactions, and architectural decisions.
- Architectural Overview
- Design Patterns
- Core Components
- Data Flow
- Event-Driven Architecture
- Session Lifecycle
- Backend Abstraction
- Extension Points
- Performance Considerations
┌─────────────────────────────────────────────────────────────────┐
│ User Interface │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ TUI (Textual) │ │
│ │ - ActivityFeed - SplashScreen - HelpScreen │ │
│ │ - StatusBar - Input - QuitScreen │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
↕ (EventBus - Pub/Sub)
┌─────────────────────────────────────────────────────────────────┐
│ Controller Layer │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ AgentController (Orchestrator) │ │
│ │ - 5-State Lifecycle (IDLE→RUNNING→PAUSED→COMPLETED) │ │
│ │ - Pause/Resume Control │ │
│ │ - Session Management │ │
│ │ - Flag Detection │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
↕
┌─────────────────────────────────────────────────────────────────┐
│ Agent Layer │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PentestAgent │ │
│ │ - Task Execution │ │
│ │ - Activity Tracking (Tracer) │ │
│ │ - Flag Detection │ │
│ │ - Walkthrough Generation │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
↕
┌─────────────────────────────────────────────────────────────────┐
│ Backend Layer │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ AgentBackend (Abstract Interface) │ │
│ │ ├─ ClaudeCodeBackend (Claude SDK) │ │
│ │ ├─ (Future) OpenAIBackend │ │
│ │ └─ (Future) LocalLLMBackend │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
↕
┌─────────────────────────────────────────────────────────────────┐
│ External Services │
│ - Claude Code CLI │
│ - LLM APIs (Anthropic, OpenRouter) │
│ - Local LLM Servers (LM Studio, Ollama) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Persistence Layer │
│ - SessionStore (~/.pentestgpt/sessions/) │
│ - Configuration (~/.config/, .env) │
│ - Debug Logs (/workspace/pentestgpt-debug.log) │
│ - Telemetry (Langfuse) │
└─────────────────────────────────────────────────────────────────┘
PentestGPT follows a layered architecture with clear separation of concerns:
-
Presentation Layer (
pentestgpt/interface/)- TUI components (Textual framework)
- CLI entry point
- User input handling
-
Application Layer (
pentestgpt/core/controller.py)- Business logic orchestration
- State management
- Session lifecycle
-
Domain Layer (
pentestgpt/core/agent.py)- Core agent logic
- Task execution
- Flag detection
-
Infrastructure Layer (
pentestgpt/core/backend.py)- External service integration
- LLM communication
- Tool execution
-
Persistence Layer (
pentestgpt/core/session.py)- Session storage
- Configuration management
- Logging
Used For: Global state management with thread-safe access
Implementations:
# EventBus Singleton
class EventBus:
_instance = None
_lock = threading.Lock()
@classmethod
def get(cls) -> "EventBus":
with cls._lock:
if cls._instance is None:
cls._instance = cls()
return cls._instanceSingletons in PentestGPT:
EventBus.get()- Global event busget_global_tracer()- Global activity tracerget_registry()- Global tool registry
Rationale:
- Single source of truth for events, activities, tools
- Thread-safe access from multiple components
- Easy testing (can reset singleton)
Used For: Decoupled communication between TUI and agent
Implementation: EventBus
# Subscribe to events
bus = EventBus.get()
bus.subscribe(EventType.MESSAGE, self._on_message)
bus.subscribe(EventType.STATE_CHANGED, self._on_state_change)
# Emit events
bus.emit_message("Starting enumeration", "info")
bus.emit_state("running", "Target: 10.10.11.234")Benefits:
- TUI and agent are completely decoupled
- No direct dependencies between components
- Easy to add new subscribers (logging, telemetry, etc.)
- Thread-safe event propagation
Event Types:
STATE_CHANGED- Agent state transitionsMESSAGE- Text output from agentTOOL- Tool execution eventsFLAG_FOUND- Flag detection eventsUSER_COMMAND- User control commands (pause/resume/stop)USER_INPUT- User instruction input
Used For: Framework-agnostic backend creation
Implementation:
class AgentBackend(ABC):
@abstractmethod
async def connect(self) -> None: ...
@abstractmethod
async def query(self, prompt: str) -> None: ...
@abstractmethod
def receive_messages(self) -> AsyncIterator[AgentMessage]: ...
# Concrete implementation
class ClaudeCodeBackend(AgentBackend):
async def connect(self):
# Claude SDK specific logic
...Benefits:
- Easy to swap LLM backends
- Future support for OpenAI, Gemini, local LLMs
- Testing with mock backends
- Consistent interface across backends
Used For: Different execution modes (TUI vs Raw vs Non-Interactive)
Implementation:
# main.py - mode selection
if args.raw:
# Raw streaming mode
await run_raw_mode(controller)
elif args.non_interactive:
# Non-interactive mode
await run_non_interactive(controller)
else:
# TUI mode (default)
app = PentestGPTApp(...)
await app.run_async()Benefits:
- Same agent logic, different presentation
- Easy to add new modes
- Separation of execution from presentation
Used For: Agent lifecycle management
States:
class AgentState(Enum):
IDLE = "idle"
RUNNING = "running"
PAUSED = "paused"
COMPLETED = "completed"
ERROR = "error"State Transitions:
IDLE → RUNNING (start)
RUNNING → PAUSED (user pause or agent request)
PAUSED → RUNNING (user resume)
RUNNING → COMPLETED (flags found)
RUNNING → ERROR (fatal error)State-Specific Behavior:
IDLE- Waiting to startRUNNING- Active task execution, cannot accept new instructionsPAUSED- Accepts user input, maintains contextCOMPLETED- Final state, session savedERROR- Error state, session saved with error details
Used For: Tool execution abstraction
Implementation:
class BaseTool(ABC):
@abstractmethod
async def execute(self, *args, **kwargs) -> dict[str, Any]:
"""Execute the tool with given arguments."""
pass
# Usage
tool = registry.get("terminal_execute")
result = await tool.execute(command="nmap -sV 10.10.11.234")Benefits:
- Uniform tool interface
- Easy to add new tools
- Trackable execution (via EventBus)
- Async execution support
Purpose: Central orchestrator for the entire agent lifecycle
Responsibilities:
- State management (5-state model)
- Session creation and persistence
- Pause/resume control
- Flag detection coordination
- Error handling and recovery
Key Methods:
async def start(self) -> None:
"""Start agent execution"""
async def pause(self) -> None:
"""Pause agent at message boundary"""
async def resume(self, instruction: str | None) -> None:
"""Resume with optional new instruction"""
async def stop(self) -> None:
"""Stop agent and save session"""State Machine:
def _transition_state(self, new_state: AgentState, details: str = ""):
if self._is_valid_transition(self.state, new_state):
self.state = new_state
EventBus.get().emit_state(new_state.value, details)
self._update_session_status()Purpose: Wraps LLM agent with penetration testing capabilities
Responsibilities:
- Task execution via Claude Code SDK
- Flag detection in outputs
- Walkthrough generation
- Activity tracking
- Debug logging
Execution Flow:
async def execute(self, task: str) -> dict[str, Any]:
# 1. Initialize Claude SDK client
options = ClaudeAgentOptions(...)
self.client = ClaudeSDKClient(options)
# 2. Create system prompt
system_prompt = get_ctf_prompt(task)
# 3. Query agent
await self.client.query(system_prompt)
# 4. Stream messages and detect flags
async for message in self.client.stream():
self._process_message(message)
self._detect_flags(message)
# 5. Return results
return {"walkthrough": self.walkthrough_steps, "flags": self.flags_found}Flag Detection:
FLAG_PATTERNS = [
r"flag\{[^\}]+\}",
r"HTB\{[^\}]+\}",
# ... more patterns
]
def _detect_flags(self, text: str) -> list[str]:
flags = []
for pattern in self.FLAG_PATTERNS:
matches = re.findall(pattern, text)
flags.extend(matches)
return flagsPurpose: Decoupled communication between components
Architecture:
class EventBus:
_handlers: dict[EventType, list[Callable]]
def subscribe(self, event_type, handler):
"""Register handler for event type"""
def emit(self, event: Event):
"""Emit event to all subscribers"""Thread Safety:
def emit(self, event: Event) -> None:
with self._handler_lock:
handlers = self._handlers.get(event.type, []).copy()
for handler in handlers:
with contextlib.suppress(Exception):
handler(event) # Don't let one handler break othersEvent Flow:
Agent → EventBus → [TUI, Tracer, Langfuse]
↓ ↓ ↓
Display Log Telemetry
Purpose: Persist and restore agent sessions
Storage Format:
@dataclass
class SessionInfo:
session_id: str
target: str
status: SessionStatus
flags_found: list[dict]
# ... more fieldsFile Operations:
def save(self, session: SessionInfo) -> None:
path = self.SESSIONS_DIR / f"{session.session_id}.json"
with open(path, "w") as f:
json.dump(session.to_dict(), f, indent=2)
def load(self, session_id: str) -> SessionInfo:
path = self.SESSIONS_DIR / f"{session_id}.json"
with open(path) as f:
data = json.load(f)
return SessionInfo.from_dict(data)1. CLI Entry (main.py)
├─> Parse arguments
├─> Load configuration
└─> Initialize telemetry
2. Create Components
├─> EventBus singleton
├─> Tracer singleton
├─> AgentController
│ └─> Backend (ClaudeCodeBackend)
└─> TUI App (PentestGPTApp)
3. Subscribe to Events
├─> TUI subscribes to STATE_CHANGED, MESSAGE, TOOL, FLAG_FOUND
├─> Tracer subscribes for logging
└─> Langfuse subscribes for telemetry
4. Start Agent
└─> Controller.start()
└─> Agent.execute(task)
┌─────────────┐
│ LLM Backend │
└──────┬──────┘
│ Messages (text, tool_use, result)
↓
┌─────────────┐
│ PentestAgent│
└──────┬──────┘
│ Parse messages
│ Detect flags
│ Track activity
↓
┌─────────────┐
│ EventBus │
└──────┬──────┘
│ Emit events
├──────────────┬──────────────┬──────────────┐
↓ ↓ ↓ ↓
┌───────┐ ┌────────┐ ┌────────┐ ┌─────────┐
│ TUI │ │ Tracer │ │Langfuse│ │ Session │
└───────┘ └────────┘ └────────┘ └─────────┘
Display Log/Debug Telemetry Persist
User presses Ctrl+P
↓
TUI emits USER_COMMAND (pause)
↓
EventBus propagates to Controller
↓
Controller sets pause flag
↓
Agent completes current message
↓
Controller transitions to PAUSED state
↓
EventBus emits STATE_CHANGED (paused)
↓
TUI shows input field
↓
User types instruction and presses Enter
↓
TUI emits USER_INPUT
↓
Controller receives input
↓
Controller.resume(instruction)
↓
Agent continues with new context
Agent receives message from LLM
↓
PentestAgent._detect_flags(text)
↓
Regex patterns match flag
↓
EventBus.emit_flag(flag, context)
↓
┌──────────┬──────────┬──────────┐
↓ ↓ ↓ ↓
TUI Tracer Session Langfuse
Display Log Save Telemetry
STATE_CHANGED:
# Emitted by: Controller
# Data: {"state": "running", "details": "Target: 10.10.11.234"}
# Subscribers: TUI, Langfuse
bus.emit_state("running", "Target: 10.10.11.234")MESSAGE:
# Emitted by: Agent
# Data: {"text": "Starting nmap scan", "type": "info"}
# Subscribers: TUI, Tracer
bus.emit_message("Starting nmap scan", "info")TOOL:
# Emitted by: Agent
# Data: {"status": "start", "name": "bash", "args": {"command": "nmap ..."}}
# Subscribers: TUI, Tracer, Langfuse
bus.emit_tool("start", "bash", {"command": "nmap -sV 10.10.11.234"})FLAG_FOUND:
# Emitted by: Agent
# Data: {"flag": "flag{example}", "context": "user.txt"}
# Subscribers: TUI, Session, Langfuse
bus.emit_flag("flag{example}", "Found in /home/user/user.txt")USER_COMMAND:
# Emitted by: TUI
# Data: {"command": "pause"}
# Subscribers: Controller
bus.emit_command("pause")USER_INPUT:
# Emitted by: TUI
# Data: {"text": "Focus on port 8080"}
# Subscribers: Controller
bus.emit_input("Focus on port 8080")TUI Event Handlers:
def _setup_event_handlers(self) -> None:
bus = EventBus.get()
bus.subscribe(EventType.MESSAGE, self._handle_message)
bus.subscribe(EventType.STATE_CHANGED, self._handle_state)
bus.subscribe(EventType.TOOL, self._handle_tool)
bus.subscribe(EventType.FLAG_FOUND, self._handle_flag)
def _handle_message(self, event: Event) -> None:
text = event.data.get("text", "")
msg_type = event.data.get("type", "info")
self.activity_feed.add_message(text, msg_type)
def _handle_flag(self, event: Event) -> None:
flag = event.data.get("flag", "")
self.activity_feed.add_flag(flag)
self.show_flag_notification(flag)# 1. Controller starts
controller = AgentController(config, backend)
# 2. Create session
session = SessionInfo(
session_id=str(uuid.uuid4()),
target=target,
created_at=datetime.now(),
status=SessionStatus.RUNNING,
model=config.model
)
# 3. Save to disk
SessionStore().save(session)# During execution
session.flags_found.append({"flag": "flag{...}", "context": "..."})
session.updated_at = datetime.now()
session.total_cost_usd += message_cost
# Save periodically
SessionStore().save(session)# Load session
session = SessionStore().load(session_id)
# Restore state
controller.session = session
controller.target = session.target
# Resume backend if supported
if backend.supports_resume:
await backend.resume(session.backend_session_id)class AgentBackend(ABC):
"""Framework-agnostic backend interface"""
async def connect(self) -> None:
"""Establish connection to agent"""
async def query(self, prompt: str) -> None:
"""Send query to agent"""
def receive_messages(self) -> AsyncIterator[AgentMessage]:
"""Stream messages from agent"""
async def resume(self, session_id: str) -> bool:
"""Resume previous session"""class MessageType(Enum):
TEXT = "text"
TOOL_START = "tool_start"
TOOL_RESULT = "tool_result"
RESULT = "result"
ERROR = "error"
@dataclass
class AgentMessage:
type: MessageType
content: Any
tool_name: str | None = None
tool_args: dict | None = NoneBenefits:
- Backend-agnostic message handling
- Easy to add new backends
- Consistent testing interface
class ClaudeCodeBackend(AgentBackend):
async def connect(self):
from claude_agent_sdk import ClaudeSDKClient
options = ClaudeAgentOptions(...)
self._client = ClaudeSDKClient(options)
async def query(self, prompt: str):
await self._client.query(prompt)
async def receive_messages(self):
async for message in self._client.stream():
yield self._convert_to_agent_message(message)Create Tool:
# pentestgpt/tools/sqlmap.py
from pentestgpt.tools.base import BaseTool
class SQLMapTool(BaseTool):
def __init__(self):
super().__init__(
name="sqlmap",
description="Automated SQL injection tool"
)
async def execute(self, url: str, **kwargs) -> dict:
# Execute sqlmap
result = await run_sqlmap(url)
return {"success": True, "result": result}Register Tool:
# pentestgpt/tools/registry.py
from pentestgpt.tools.sqlmap import SQLMapTool
class ToolRegistry:
def _register_default_tools(self):
self.register(TerminalTool())
self.register(SQLMapTool()) # Add new toolDefine Event:
# pentestgpt/core/events.py
class EventType(Enum):
# Existing events...
CREDENTIAL_FOUND = auto() # New eventEmit Event:
# Anywhere in code
bus = EventBus.get()
bus.emit(Event(
EventType.CREDENTIAL_FOUND,
{"username": "admin", "password": "password123"}
))Subscribe:
# In TUI or other component
bus.subscribe(EventType.CREDENTIAL_FOUND, self._handle_credential)Implement Interface:
# pentestgpt/core/openai_backend.py
class OpenAIBackend(AgentBackend):
async def connect(self):
import openai
self.client = openai.AsyncClient(api_key=...)
async def query(self, prompt: str):
await self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
async def receive_messages(self):
# Stream and convert to AgentMessage
...Use New Backend:
# pentestgpt/interface/main.py
if args.backend == "openai":
backend = OpenAIBackend(...)
else:
backend = ClaudeCodeBackend(...)All I/O is async:
# Agent execution
async def execute(self, task: str):
await self.client.query(task)
async for message in self.client.stream():
await self._process_message(message)
# Controller
async def start(self):
await self.backend.connect()
await self.backend.query(prompt)Benefits:
- Non-blocking I/O
- Efficient resource usage
- Concurrent operations
Handler Isolation:
for handler in handlers:
with contextlib.suppress(Exception):
handler(event) # Failures don't break other handlersThread Safety:
with self._handler_lock:
handlers = self._handlers.get(event.type, []).copy()
# Process outside lock to minimize lock timeWrite-Through Cache:
- Sessions saved immediately on updates
- No risk of data loss
- Fast reads from memory
Async Writes:
async def save_async(self, session: SessionInfo):
await asyncio.to_thread(self._write_to_disk, session)Lazy Rendering:
- Only visible items rendered
- Scroll performance optimized
- CSS-based styling (compiled once)
Message Batching:
def add_messages(self, messages: list[str]):
# Batch update instead of individual updates
self.messages.extend(messages)
self.refresh()Decision: Use pub/sub event bus for TUI-Agent communication
Rationale:
- Decoupling: TUI doesn't depend on Agent internals
- Extensibility: Easy to add new subscribers (logging, telemetry)
- Testing: Can test components independently
- Thread Safety: Centralized synchronization
Trade-offs:
- Slightly more complex code
- Debugging requires tracing events
- Small performance overhead (acceptable for this use case)
Decision: Use abstract AgentBackend protocol instead of direct Claude SDK usage
Rationale:
- Future-Proof: Support multiple LLM providers
- Testing: Mock backends for unit tests
- Flexibility: Switch backends without changing agent logic
Trade-offs:
- Extra abstraction layer
- Potential over-engineering for single backend
Conclusion: Worth it for long-term maintainability
Decision: Use IDLE/RUNNING/PAUSED/COMPLETED/ERROR states
Rationale:
- Clear Semantics: Each state has well-defined meaning
- UI State Mapping: Easy to reflect in TUI
- Error Handling: Explicit error state
- Pause/Resume: Clean separation of active/paused
Alternative Considered: Simple running/stopped binary Rejected Because: Insufficient for pause/resume and error tracking
Decision: Store sessions as JSON files in ~/.pentestgpt/sessions/
Rationale:
- Simplicity: No database dependency
- Portability: Easy to backup, inspect, share
- Debugging: Human-readable JSON
- Reliability: No DB connection issues
Trade-offs:
- Not suitable for thousands of sessions (acceptable for this use case)
- No concurrent access control (single-user tool)
- PROJECT_STRUCTURE.md - Project organization
- FEATURES.md - Feature documentation
- CODE_GUIDE.md - Code walkthroughs and explanations