8. AI Agent Guide

AI Agent Guide

The AI Agent is RedAmon's autonomous pentesting engine — a LangGraph-based system that reasons about your attack surface, selects security tools, executes exploits, and reports findings, all through a real-time chat interface. This guide walks you through every aspect of using the agent.

Opening the AI Agent

On the Graph Dashboard, click the "AI Agent" button on the right side of the toolbar
The AI Agent Drawer slides in from the right side of the screen

AI Agent Drawer

Drawer Layout

The AI Agent drawer contains several sections:

Area	Description
Header	Connection status (WiFi icon), phase badge, attack type, iteration counter, stealth toggle
Conversation History	Button to open past conversations panel
Chat Area	Scrollable area showing messages, thinking timeline, and tool executions
Input Area	Message input with Send/Stop buttons

Header Elements

Element	Description
Connection Status	Green WiFi icon = connected, red = disconnected. The agent uses a WebSocket connection
Phase Badge	Current operational phase: Informational (blue), Exploitation (red), Post-Exploitation (purple)
Attack Type	Shows "CVE", "BRUTE", or "PHISH" badge when the agent is executing an attack path
Iteration Counter	Current step number in the agent's reasoning loop
Stealth Toggle	Enable/disable stealth mode during agent operation

Sending Messages

Type your message in the input area at the bottom of the drawer.

Enter — send the message
Shift + Enter — new line (multiline input)
The textarea auto-expands as you type

What to Ask

The agent can handle a wide range of queries:

Informational queries (no exploitation):

"What vulnerabilities exist on 192.168.1.100?"
"Which technologies have critical CVEs?"
"Show me all open ports on the subdomains"
"Find all endpoints with injectable parameters"
"Summarize the attack surface for this project"

Exploitation requests:

"Exploit CVE-2021-41773 on the Apache server"
"Try brute forcing SSH on 10.0.0.5"
"Generate a phishing payload for Windows"
"Create a malicious Word document with a macro"
"Find and exploit the most critical vulnerability"
"Test the Node.js deserialization vulnerability"

The agent automatically translates natural language into Neo4j graph queries, tool commands, and exploitation workflows.

Understanding the Timeline

As the agent works, you'll see a timeline of its reasoning and actions:

Agent Timeline

Thinking Cards

Show the agent's internal reasoning — what it's considering, planning, and deciding. These are expandable to see full reasoning details.

Tool Execution Cards

Show when the agent runs a tool. Each card displays:

Element	Description
Tool name	Which tool was executed (e.g., `query_graph`, `execute_nmap`, `metasploit_console`)
Arguments	The input sent to the tool
Streaming output	Real-time output as the tool runs (updated every 5 seconds for long operations)
Analysis	The agent's interpretation of the tool's output
Actionable Findings	Key findings extracted from the output
Recommended Next Steps	What the agent suggests doing next

Todo List Widget

The agent maintains a todo list that updates as it works. Items are marked as:

Pending — not yet started
In Progress — currently being worked on
Completed — finished
Blocked — unable to proceed

The Three Phases

The agent operates in three distinct phases, each with different tool access:

Phase 1: Informational (Default)

Color: Blue

The agent gathers intelligence without any offensive actions:

Queries the Neo4j graph for attack surface data
Runs web searches for CVE details and exploit PoCs
Makes HTTP requests with curl to test endpoints
Scans ports with Naabu
Runs Nmap for service detection
Uses Nuclei for vulnerability verification

Available tools: query_graph, web_search, execute_curl, execute_naabu, execute_nmap, execute_nuclei, kali_shell

Phase 2: Exploitation

Color: Red

When the agent identifies a viable attack path, it requests a phase transition to exploitation. This requires your approval (if approval gates are enabled).

Additional tools unlocked: execute_code, execute_hydra, metasploit_console, msf_restart

Three classified attack paths + unclassified fallback:

Attack Path	Badge	Description
CVE Exploitation	CVE (orange)	The agent finds a matching Metasploit module, configures payload (reverse/bind shell), and fires the exploit
Hydra Brute Force	BRUTE (purple)	Uses THC Hydra to brute force credentials on 50+ protocols (SSH, FTP, RDP, SMB, MySQL, HTTP forms, etc.)
Phishing / Social Engineering	PHISH (pink)	Generates malicious payloads, documents, or delivery links for human targets. Supports msfvenom, Office macros, PDF, web delivery, HTA, and email sending
Unclassified Fallback	grey	For techniques that don't match the above (e.g., SQL injection, XSS, SSRF). Uses available tools generically

When an exploit succeeds, the agent records a ChainFinding(exploit_success) in the EvoGraph — recording the attack type, target IP, CVE IDs, module used, payload, and credentials discovered. This finding is linked to the attack chain step and bridged to the recon graph, making it queryable across sessions.

Phishing / Social Engineering Attack Path

The phishing attack path targets human factors rather than software vulnerabilities. Instead of firing an exploit directly, the agent generates a weaponized artifact and delivers it to the target — a person must execute it for the attack to succeed.

6-Step Workflow:

Determine target platform & delivery method — Windows/Linux/macOS/Android + standalone payload, malicious document, web delivery, or HTA delivery
Set up handler — exploit/multi/handler with matching payload, runs in background
Generate payload/document — msfvenom (exe/elf/apk/ps1/war/vba), Metasploit fileformat modules (Word/Excel/PDF/RTF/LNK), web_delivery (one-liner), or HTA server (URL)
Verify generation — confirm file exists, job is running
Deliver — chat download (docker cp), email via Python smtplib, or web link
Wait for callback — check sessions -l, transition to post-exploitation

Four generation methods:

Method	Tool	Output	Delivery
A) Standalone Payload	msfvenom via `kali_shell`	Binary/script file (exe, elf, apk, ps1, etc.)	File download or email attachment
B) Malicious Document	Metasploit fileformat modules	Weaponized Word/Excel/PDF/RTF/LNK	File download or email attachment
C) Web Delivery	`exploit/multi/script/web_delivery`	One-liner command (Python/PHP/PSH/Regsvr32)	Paste command in target's terminal
D) HTA Delivery	`exploit/windows/misc/hta_server`	URL serving an HTA payload	Target visits URL in browser

Email delivery uses execute_code with Python smtplib to send payloads as email attachments. SMTP settings (host, port, credentials) are configured in the project's Attack Paths tab. If no SMTP is configured, the agent asks the user at runtime.

The phishing path shares the same post-exploitation framework as CVE exploits — once a session opens, the agent transitions to post_exploitation with full Meterpreter interactive commands.

Deep dive: For the full payload matrix, all Metasploit fileformat modules, AV evasion techniques, SMTP configuration, troubleshooting, and example scenarios, see the Attack Paths > Phishing / Social Engineering page.

ngrok TCP Tunnel (Reverse Shells over NAT)

If your attacker machine is behind NAT or in a cloud environment, you can route reverse shell traffic through an ngrok TCP tunnel instead of manually configuring LHOST/LPORT:

Create a free account at ngrok.com and complete identity verification (required for TCP tunnels)
Add your authtoken to .env:
```
NGROK_AUTHTOKEN=your-token-here
```
Restart kali-sandbox: docker compose up -d kali-sandbox
Enable "Enable ngrok TCP Tunnel" in the project's Agent Behaviour settings

When enabled, ngrok starts automatically inside the kali-sandbox container and exposes a public TCP endpoint (e.g., tcp://7.tcp.eu.ngrok.io:12345). The agent auto-detects the public host and port from the ngrok API — LHOST and LPORT fields are hidden in the UI since they're no longer needed. All Metasploit reverse shell payloads will use the ngrok tunnel endpoint automatically.

Phase 3: Post-Exploitation

Color: Purple

After a successful exploit, the agent can transition to post-exploitation (if enabled in project settings):

Statefull mode — interactive Meterpreter commands: enumeration, lateral movement, data exfiltration
Stateless mode — re-runs exploits with different command payloads

Agent Tools Reference

The agent has access to 11 tools, each designed for a specific purpose. Tools are gated by the current operational phase (see Tool Phase Restrictions).

query_graph

Purpose: Query the Neo4j graph database using natural language.

This is the agent's primary source of truth for all reconnaissance data. The graph contains assets (domains, subdomains, IPs, ports, services), web data (endpoints, parameters, certificates, headers), intelligence (technologies, vulnerabilities, CVEs, MITRE CWE/CAPEC), GitHub secrets, and exploit results.

The agent should always check the graph first before reaching for other tools.

Phases: Informational, Exploitation, Post-Exploitation

web_search

Purpose: Search the internet for security research information via Tavily.

Use after query_graph when the agent needs external context not in the graph — CVE details, exploit PoCs, version-specific vulnerabilities, Metasploit module documentation, security advisories, or attack techniques.

Phases: Informational, Exploitation, Post-Exploitation

execute_curl

Purpose: Make HTTP requests to targets.

Primary use is reachability checks (status codes, headers). Fallback use is vulnerability probing (path traversal, LFI/RFI, header injection, SSRF) when the graph has no relevant vulnerability findings for the target.

Phases: Informational, Exploitation, Post-Exploitation

execute_naabu

Purpose: Fast port scanning.

Use only to verify that specific ports are actually open or to scan new targets not yet in the graph. For most cases, port data is already available via query_graph.

Phases: Informational, Exploitation, Post-Exploitation

execute_nmap

Purpose: Deep network scanning with service detection, OS fingerprinting, and NSE scripts.

Use when detailed service analysis is needed (-sV for version detection, -O for OS fingerprinting, -sC for default scripts, --script vuln for vulnerability scripts). Slower than Naabu but much more detailed.

Phases: Informational, Exploitation, Post-Exploitation

execute_nuclei

Purpose: Template-based CVE verification and exploitation.

YAML-based vulnerability scanner with 9,000+ community templates. Primary use is verifying if a target is vulnerable to a specific CVE. Secondary use is detecting vulnerabilities by category (rce, sqli, xss, lfi, etc.). Can verify and exploit many CVEs in a single step.

Phases: Informational, Exploitation, Post-Exploitation

kali_shell

Purpose: General shell execution in the Kali Linux sandbox.

Full bash shell access with all standard Kali tools. Use for downloading PoCs (git clone), payload generation (msfvenom), password cracking (john), SQL injection automation (sqlmap), exploit research (searchsploit), reverse/bind shells (nc, socat, rlwrap), SMB enumeration (smbclient), encoding, DNS lookups, SSH, and any Kali tool not exposed as a dedicated MCP tool.

Do not use for tasks that have a dedicated tool (curl → execute_curl, nmap → execute_nmap, etc.) or for writing multi-line scripts (use execute_code instead).

Timeout: 120 seconds.

Phases: Informational, Exploitation, Post-Exploitation

execute_code

Purpose: Write and execute multi-line code without shell escaping issues.

Code is passed as a clean string parameter, written to a file, and executed with the appropriate interpreter. This eliminates all shell escaping problems that arise when trying to run complex scripts via kali_shell.

Supported languages: Python (default), Bash, Ruby, Perl, C, C++

Timeout: 120 seconds for execution. Compiled languages (C/C++): 60 seconds compile + 120 seconds run.

Files persist at /tmp/{filename}.{ext} and can be re-run via kali_shell if needed.

Pre-installed Python Libraries

The following libraries are available inside the Kali sandbox — import them directly, no pip install needed:

Library	Import	Use Case
requests	`import requests`	HTTP requests for web exploitation, API interaction, form submission, file upload, session management
BeautifulSoup	`from bs4 import BeautifulSoup`	Parse HTML responses to extract CSRF tokens, hidden form fields, session nonces, page data, and links. Combine with `requests` to interact with web apps that require parsing before submission
PyCryptodome	`from Crypto.Cipher import AES`	Encrypt/decrypt payloads, hash manipulation, custom crypto attacks, padding oracle, key derivation
PyJWT	`import jwt`	Forge, tamper, and decode JWT tokens. Algorithm confusion attacks (none, HS256, RS256), claim manipulation
Paramiko	`import paramiko`	Programmatic SSH sessions, SFTP file transfer, SSH tunneling, remote command execution for post-exploitation
Impacket	`from impacket.smbconnection import SMBConnection`	Windows/AD attacks: SMB relay, NTLM authentication, Kerberos, secretsdump, psexec, wmiexec, dcomexec
pwntools	`from pwn import *`	Binary exploitation, remote TCP/UDP connections, shellcode generation, struct packing, ROP chain building

When to Use execute_code

Multi-line exploit scripts — custom PoC code, deserialization payloads, payload generators
Web app interaction requiring HTML parsing — fetch a login page, extract a CSRF token with BeautifulSoup, then submit credentials
JWT manipulation — decode a token, modify claims (e.g., escalate role to admin), re-sign with a known or guessed secret
Crypto attacks — decrypt intercepted traffic, craft encrypted payloads, exploit weak crypto implementations
SSH-based post-exploitation — open a Paramiko session to an already-compromised host, enumerate files, exfiltrate data
Windows/AD exploitation — use Impacket to dump secrets, enumerate shares, or execute commands via psexec/wmiexec
Binary exploitation — connect to a vulnerable service with pwntools, send crafted payloads, receive shells

Examples

Extract CSRF token and submit login form:

import requests
from bs4 import BeautifulSoup

s = requests.Session()
r = s.get('http://target/login', verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
token = soup.find('input', {'name': 'csrf_token'})['value']
r = s.post('http://target/login', data={
    'csrf_token': token,
    'username': 'admin',
    'password': 'admin'
}, verify=False)
print(r.status_code, r.url)

Forge a JWT token with algorithm confusion:

import jwt

# Decode without verification to inspect claims
token = "eyJhbGciOi..."
claims = jwt.decode(token, options={"verify_signature": False})
print("Original claims:", claims)

# Forge with 'none' algorithm (CVE-2015-9235)
forged = jwt.encode({"user": "admin", "role": "admin"}, "", algorithm="HS256")
print("Forged token:", forged)

Enumerate SMB shares with Impacket:

from impacket.smbconnection import SMBConnection

conn = SMBConnection('10.0.0.5', '10.0.0.5')
conn.login('guest', '')
for share in conn.listShares():
    name = share['shi1_netname'][:-1]
    print(f"Share: {name}")

Connect to a vulnerable service with pwntools:

from pwn import *

r = remote('10.0.0.5', 1337)
r.recvuntil(b'> ')
r.sendline(b'payload')
print(r.recvall(timeout=5).decode())

Phases: Informational, Exploitation, Post-Exploitation

execute_hydra

Purpose: Brute force password cracking with THC Hydra.

Fast, parallelized network login cracker supporting 50+ protocols (SSH, FTP, RDP, SMB, VNC, MySQL, MSSQL, PostgreSQL, Redis, MongoDB, HTTP forms, and more). See Hydra Brute Force for configuration options.

Phases: Exploitation, Post-Exploitation

metasploit_console

Purpose: Execute Metasploit Framework commands.

Full access to the Metasploit console — module context and sessions persist between calls. Use for exploit execution, session management, post-exploitation modules, and payload generation. Chain commands with semicolons (;), not &&.

Phases: Exploitation, Post-Exploitation

msf_restart

Purpose: Restart the Metasploit console.

Resets module context and clears stale state. Use when the console becomes unresponsive or when switching between unrelated exploit workflows.

Phases: Exploitation, Post-Exploitation

Agent Container Runtimes

The agent container ships with a full set of language runtimes and development tools. These are available for any agent workload that needs to build, test, or interact with code repositories.

Runtime	Version	Commands
Node.js	20 LTS	`node`, `npm`, `npx`, `yarn`, `pnpm`
Python	3.11	`python3`, `pip`
Go	1.22	`go build`, `go test`, `go mod`
Ruby	3.3	`ruby`, `gem`, `bundler`
Java	OpenJDK 21	`java`, `javac`, `mvn`
PHP	8.4	`php`, `composer`
.NET	SDK 8.0	`dotnet build`, `dotnet test`
Build tools	—	`make`, `gcc`, `g++`
Utilities	—	`git`, `ripgrep (rg)`, `jq`, `curl`, `wget`, `unzip`, `file`, `ssh`

Approval Workflows

When the agent wants to transition to a more aggressive phase, it pauses and sends an Approval Request.

The approval request includes:

Reason — why the agent wants to transition
Planned actions — what it intends to do
Risks — potential impact

You have three options:

Action	Description
Approve	Allow the phase transition — agent continues with offensive tools
Modify	Approve with modifications — add constraints or redirect the approach
Abort	Deny the transition — agent stays in the current phase

Approval gates are configurable per project. You can disable them in the Agent Behaviour tab of project settings to let the agent operate fully autonomously.

Question Requests

Sometimes the agent needs additional information from you. It sends a Question Request with:

The question text
Optional predefined answer choices

You can select a predefined answer or type a custom response.

Guidance Messages

You can steer the agent while it's working by sending a guidance message:

Type your guidance in the input area while the agent is actively processing
The guidance is injected into the agent's context before its next reasoning step
Examples: "Focus on SSH vulnerabilities", "Skip the web application, look at network services", "Try a different exploit module"

The agent acknowledges guidance with a confirmation message.

Stop and Resume

Stopping the Agent

Click the Stop button (replaces the Send button while the agent is working) to pause execution. The agent's state is checkpointed.

Resuming

After stopping, a Resume button appears. Click it to continue from the last checkpoint with full context preserved.

Conversation History

The agent supports multiple conversations per project. Each conversation is an independent session with its own context.

Viewing Past Conversations

Click the history button (clock icon) in the drawer header
A Conversation History panel slides in showing all past conversations

Each conversation shows:

Title (auto-generated from the first message)
Status (active, completed)
Agent running indicator
Current phase
Iteration count
Timestamp

Switching Conversations

Click on any conversation to load it. The chat area updates with the full message history.

Deleting Conversations

Click the delete icon on any conversation to remove it permanently.

Starting a New Conversation

Click the "New Conversation" button at the top of the history panel.

Downloading Session Reports

You can export any conversation as a Markdown report:

Click the download button (download icon) in the drawer header
The report is saved as a .md file containing:
- All user messages and agent responses
- Thinking/reasoning steps
- Tool executions with output
- Findings and recommendations
- Todo list states

Connection Status

The AI Agent uses a WebSocket connection for real-time communication.

Icon	Status	Meaning
Green WiFi	Connected	WebSocket is active, agent is reachable
Red WiFi (crossed)	Disconnected	Connection lost — messages won't send

If disconnected, the agent will attempt to reconnect. You can also try refreshing the page.

Tips for Effective Use

Start with informational queries — ask the agent to summarize the attack surface before requesting exploits
Be specific — "Exploit CVE-2021-41773 on 10.0.0.5:8080" works better than "hack the server"
Use guidance — steer the agent if it's going in the wrong direction
Check the todo list — it shows what the agent is planning and what's done
Review tool output — expand tool execution cards to see raw output
Use approval gates — keep them enabled until you're comfortable with the agent's behavior

Agent Configuration

Key settings that control agent behavior (configured in project settings > Agent Behaviour tab):

Setting	Default	Description
LLM Model	claude-opus-4-6	The AI model powering the agent
Max Iterations	100	Maximum reasoning-action loops
Approval for Exploitation	true	Require your approval before exploitation
Approval for Post-Exploitation	true	Require your approval before post-exploitation
Post-Exploitation Type	statefull	Meterpreter sessions vs. one-shot commands
Tool Output Max Chars	20000	Truncation limit for tool output

Full configuration reference: Project Settings Reference > Agent Behavior

Next Steps

Project Settings Reference — fine-tune every parameter
AI Model Providers — configure different AI models for the agent
Attack Surface Graph — understand the graph schema the agent queries
EvoGraph — Attack Chain Evolution — how the agent's actions are tracked as persistent, evolutionary attack chains

RedAmon GitHub Repository | Report an Issue | Back to Home

Home

User Guide

Reference

Help

Troubleshooting

8. AI Agent Guide

AI Agent Guide

Opening the AI Agent

Drawer Layout

Header Elements

Sending Messages

What to Ask

Understanding the Timeline

Thinking Cards

Tool Execution Cards

Todo List Widget

The Three Phases

Phase 1: Informational (Default)

Phase 2: Exploitation

Phishing / Social Engineering Attack Path

ngrok TCP Tunnel (Reverse Shells over NAT)

Phase 3: Post-Exploitation

Agent Tools Reference

query_graph

web_search

execute_curl

execute_naabu

execute_nmap

execute_nuclei

kali_shell

execute_code

Pre-installed Python Libraries

When to Use execute_code

Examples

execute_hydra

metasploit_console

msf_restart

Agent Container Runtimes

Approval Workflows

Question Requests

Guidance Messages

Stop and Resume

Stopping the Agent

Resuming

Conversation History

Viewing Past Conversations

Switching Conversations

Deleting Conversations

Starting a New Conversation

Downloading Session Reports

Connection Status

Tips for Effective Use

Agent Configuration

Next Steps

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally