- Config File Location
- Full Config Reference
- Ollama Settings
- Agent Behavior
- Docker Sandbox
- Server Settings
- Safety Settings
- Browser Settings
- Search Settings
- Session Settings
- Environment Variable Overrides
- Configuration Presets
| Path | Purpose |
|---|---|
| ~/.airecon/config.json | Primary config (auto-created on first run) |
On first run, if no config file exists, AIRecon writes the defaults to ~/.airecon/config.json. Edit this file to customize behavior.
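The first-run behavior described above can be sketched roughly as follows. This is an illustration only, not AIRecon's actual loader: the function name `load_or_create_config`, the abbreviated `DEFAULTS` dict, and the fallback to defaults for missing keys are all assumptions.

```python
import json
from pathlib import Path

# Abbreviated defaults for illustration; the real file contains every key.
DEFAULTS = {
    "ollama_url": "http://127.0.0.1:11434",
    "ollama_model": "qwen3.5:122b",
    "ollama_temperature": 0.15,
}


def load_or_create_config(path: Path) -> dict:
    """Return the config at `path`, writing the defaults first if it is missing."""
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(DEFAULTS, indent=2))
    loaded = json.loads(path.read_text())
    # Assumption: keys absent from the file fall back to the defaults.
    return {**DEFAULTS, **loaded}
```

Under these assumptions, `load_or_create_config(Path.home() / ".airecon" / "config.json")` would create the file on the first call and simply read it afterwards.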
# View current config
cat ~/.airecon/config.json
# Edit
nano ~/.airecon/config.json
# or
code ~/.airecon/config.json
{
"ollama_url": "http://127.0.0.1:11434",
"ollama_model": "qwen3.5:122b",
"ollama_timeout": 2400.0,
"ollama_num_ctx": 131072,
"ollama_num_ctx_small": 65536,
"ollama_temperature": 0.15,
"ollama_num_predict": 32768,
"ollama_enable_thinking": true,
"ollama_supports_thinking": true,
"ollama_supports_native_tools": true,
"ollama_keep_alive": "60m",
"proxy_host": "127.0.0.1",
"proxy_port": 3000,
"command_timeout": 900.0,
"docker_image": "airecon-sandbox",
"docker_auto_build": true,
"tool_response_role": "tool",
"deep_recon_autostart": true,
"agent_max_tool_iterations": 800,
"agent_repeat_tool_call_limit": 2,
"agent_missing_tool_retry_limit": 2,
"agent_plan_revision_interval": 30,
"agent_exploration_mode": true,
"agent_exploration_intensity": 0.9,
"agent_exploration_temperature": 0.35,
"agent_stagnation_threshold": 2,
"agent_tool_diversity_window": 8,
"agent_max_same_tool_streak": 3,
"allow_destructive_testing": false,
"browser_page_load_delay": 1.0,
"browser_action_timeout": 120,
"searxng_url": "http://localhost:8080",
"searxng_engines": "google,bing,duckduckgo,brave,google_news,github,stackoverflow",
"vuln_similarity_threshold": 0.7,
"pipeline_recon_min_subdomains": 3,
"pipeline_recon_min_urls": 1,
"pipeline_recon_soft_timeout": 30
}
ollama_url
Type: string | Default: "http://127.0.0.1:11434"
The HTTP endpoint of your Ollama instance. Change this if Ollama runs on a different host or port.
// Local default
"ollama_url": "http://127.0.0.1:11434"
// Remote GPU server
"ollama_url": "http://192.168.1.100:11434"
// Custom port
"ollama_url": "http://127.0.0.1:3003"
ollama_model
Type: string | Default: "qwen3.5:122b"
The model name exactly as shown in ollama list. Must include the tag.
// Minimum recommended (30B — anything below 30B is unreliable)
"ollama_model": "qwen3:32b"
// Lower VRAM option (MoE: 30B total, ~3B active params)
"ollama_model": "qwen3:30b-a3b"
// High-end (best quality)
"ollama_model": "qwen3.5:122b"
Important: The name must match exactly. qwen3:32b and qwen3:latest are different entries. Run ollama list to see exact names.
Minimum size: 30B parameters. Models below 30B frequently fail to follow scope rules, hallucinate tool output, and produce incomplete function calls. qwen3:14b is NOT recommended for real engagements.
ollama_temperature
Type: float | Default: 0.15
Controls output randomness. This is the single most impactful setting for agent reliability.
| Value | Effect on AIRecon |
|---|---|
| 0.0 | Fully deterministic. Same input = same output every time. |
| 0.1–0.15 | Recommended. Strict instruction following. Minimal hallucination. Model respects scope rules. |
| 0.2 | Slightly more adaptive. Useful if model feels repetitive when stuck on a problem. |
| 0.3 | Noticeable creativity. Still acceptable for tool-calling agents. |
| 0.5–0.6 | High risk of scope creep (model "improvises" extra steps). Chain creep becomes frequent. |
| > 0.7 | Model frequently ignores scope rules, invents tool output, hallucinates CVEs. Not recommended. |
Why low temperature matters for security agents:
The model's job is to follow strict protocols (task scoping, CVSS scoring, PoC requirements) rather than to be creative. Higher temperature increases the chance the model "reasons itself" into skipping rules.
For reasoning models (qwen3 with ollama_enable_thinking: true), the <think> phase already handles analytical depth internally. The output temperature can therefore be very low (0.15) without losing quality.
ollama_num_ctx
Type: int | Default: 131072 (128K tokens)
Context window size in tokens. Larger = more history visible to the model = better continuity, but requires more VRAM.
| Value | VRAM impact | Use case |
|---|---|---|
| 8192 | Minimal | Quick tests, very limited VRAM |
| 32768 | Moderate | General use with 8–16 GB VRAM |
| 65536 | High | Deep recon sessions, 16+ GB VRAM |
| 131072 | Very high | Default: full 128K context for qwen3.5:122b, 32+ GB VRAM |
If you get VRAM/OOM errors, reduce this first. The agent uses automatic multi-level crash recovery (see VRAM Recovery below) and proactive context trimming at ≥80% usage.
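The 80% trigger mentioned above can be illustrated with a small check. How AIRecon actually counts tokens and which messages it trims is not described here, so `needs_trim`, `trim_oldest`, the `count_tokens` callback, and the drop-oldest strategy are hypothetical.

```python
def needs_trim(used_tokens: int, num_ctx: int, threshold: float = 0.8) -> bool:
    """True once the conversation occupies at least `threshold` of the window."""
    return used_tokens >= threshold * num_ctx


def trim_oldest(messages, count_tokens, num_ctx, threshold=0.8):
    """Drop the oldest non-system messages until usage falls below the threshold."""
    kept = list(messages)
    while len(kept) > 1 and needs_trim(
        sum(count_tokens(m) for m in kept), num_ctx, threshold
    ):
        # Assumption: index 0 is the system prompt and is always preserved.
        del kept[1]
    return kept
```

With the default 131072-token window, this check would first fire at 104,858 tokens of history.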
ollama_num_ctx_small
Type: int | Default: 65536 (64K tokens)
A smaller context window used for compression calls (compress_with_llm) and VRAM crash recovery tiers. Reduces VRAM pressure during context management. This is also the starting point for multi-level recovery — see VRAM Recovery below.
ollama_num_predict
Type: int | Default: 32768
Maximum number of tokens the model can generate in a single response. 32768 tokens is roughly 24,000 words, sufficient for complex reasoning plus tool-calling responses.
Reduce to 8192 if responses feel slow. The agent automatically caps this further after VRAM crashes.
ollama_timeout
Type: float | Default: 2400.0 seconds (40 minutes)
How long to wait for a streaming response before giving up. The 40-minute default suits large 122B models on GPU with 128K context.
// For fast GPU inference
"ollama_timeout": 300.0
// For very large models (122B) on CPU
"ollama_timeout": 7200.0
ollama_enable_thinking
Type: bool | Default: true
Enables the think=true parameter when calling Ollama, which activates extended reasoning (<think> blocks) for supported models.
| Model type | Recommended setting |
|---|---|
| Reasoning model (qwen3, deepseek-r1) | true |
| Standard/chat model (llama3, mistral) | false |
When enabled, the TUI shows the model's internal reasoning process in the thinking panel, separate from the final output. This is very useful for understanding why the agent made a specific decision.
deep_recon_autostart
Type: bool | Default: true
When true, if the user inputs only a bare domain name (e.g., just example.com with nothing else), AIRecon automatically expands it into a full deep recon prompt:
Perform a comprehensive full deep recon and vulnerability scan on example.com. Use all available tools.
Set to false if you want the agent to treat bare domain input as "just set the target, wait for further instructions."
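The expansion rule can be sketched as below. The prompt text is taken from the description above; the hostname regex and the names `BARE_DOMAIN` and `expand_input` are assumptions, and AIRecon's real detection logic may differ.

```python
import re

# Rough hostname pattern (assumption): dot-separated labels ending in an alphabetic TLD.
BARE_DOMAIN = re.compile(
    r"^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}$", re.IGNORECASE
)

RECON_PROMPT = (
    "Perform a comprehensive full deep recon and vulnerability scan on {target}. "
    "Use all available tools."
)


def expand_input(user_input: str, autostart: bool = True) -> str:
    """Expand a bare domain into the full recon prompt when autostart is on."""
    text = user_input.strip()
    if autostart and BARE_DOMAIN.fullmatch(text):
        return RECON_PROMPT.format(target=text)
    # Anything else (or autostart disabled) passes through unchanged.
    return text
```

Input that contains anything besides the domain, such as `scan example.com for XSS`, is left untouched in this sketch.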
// Auto-expand bare domain to full recon
"deep_recon_autostart": true
// Treat bare domain as target selection only
"deep_recon_autostart": false
agent_max_tool_iterations
Type: int | Default: 800
Safety limit on the number of tool call cycles per user message. Prevents infinite loops.
For full recon engagements on complex targets, 800 iterations allows Phase 1–4 to complete fully. For specific tasks, the agent typically finishes in 3–20 iterations.
// Tight limit for specific tasks only
"agent_max_tool_iterations": 100
// Extended for deep recon
"agent_max_tool_iterations": 1200
agent_repeat_tool_call_limit
Type: int | Default: 2
How many times the exact same tool + identical arguments combination is allowed before being blocked as a duplicate.
The agent maintains a count per (tool, arguments) pair per session. When the count reaches this limit, the tool call is rejected with an error message telling the agent to try something different.
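One way to read this rule is a counter keyed on the tool name plus its canonicalized arguments. The sketch below is not AIRecon's code: the class name `RepeatCallGuard` and the JSON-based argument key are assumptions. It allows `limit` identical calls and rejects the next one.

```python
import json
from collections import Counter


class RepeatCallGuard:
    """Per-session counter of identical (tool, arguments) calls."""

    def __init__(self, limit: int = 2):
        self.limit = limit
        self.counts = Counter()

    def allow(self, tool: str, arguments: dict) -> bool:
        # sort_keys makes the key independent of argument ordering.
        key = (tool, json.dumps(arguments, sort_keys=True))
        if self.counts[key] >= self.limit:
            return False  # duplicate: the agent should try something different
        self.counts[key] += 1
        return True
```

Calls with different arguments get their own counter, so they are unaffected by a blocked pair.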
// Strict: block after first repeat
"agent_repeat_tool_call_limit": 1
// Relaxed: allow up to 3 identical calls
"agent_repeat_tool_call_limit": 3
Note: This only blocks identical calls (same tool + same arguments). Different arguments or a different tool on the same target are not affected.
agent_missing_tool_retry_limit
Type: int | Default: 2
How many consecutive times the agent may call a tool that does not exist before the session is aborted.
When the agent hallucinates a tool name (e.g., calls run_nmap instead of execute), it receives an error listing the valid tools. If it continues calling non-existent tools this many times in a row, the session stops to prevent an infinite error loop.
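The consecutive-miss logic might look like the following sketch. The class `MissingToolTracker` and its `"ok"` / `"retry"` / `"abort"` return values are hypothetical; only the reset-on-valid-call and abort-after-N-in-a-row behavior comes from the description above.

```python
class MissingToolTracker:
    """Tracks consecutive calls to tools that do not exist."""

    def __init__(self, valid_tools, limit: int = 2):
        self.valid_tools = set(valid_tools)
        self.limit = limit
        self.consecutive_misses = 0

    def check(self, tool_name: str) -> str:
        """Return 'ok', 'retry', or 'abort' for a requested tool name."""
        if tool_name in self.valid_tools:
            self.consecutive_misses = 0  # a valid call resets the streak
            return "ok"
        self.consecutive_misses += 1
        if self.consecutive_misses >= self.limit:
            return "abort"  # stop the session to avoid an error loop
        return "retry"  # caller replies with the list of valid tools
```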
tool_response_role
Type: string | Default: "tool"
The message role used when returning tool results to the LLM in the conversation history.
| Value | When to use |
|---|---|
| "tool" | Models that support the Ollama tool message role (qwen3, most modern models) |
| "user" | Fallback for older models that don't understand the tool role |
Most models work correctly with "tool". If you see the model failing to parse tool results, try "user".
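In terms of the chat history, the setting only changes the `role` field of the message that carries a tool result. A minimal sketch, assuming a hypothetical helper `tool_result_message` and the usual `{"role": ..., "content": ...}` message shape:

```python
def tool_result_message(content: str, role: str = "tool") -> dict:
    """Build the history entry that returns a tool result to the model."""
    if role not in ("tool", "user"):
        raise ValueError(f"unsupported tool_response_role: {role!r}")
    return {"role": role, "content": content}
```

With `role="user"`, the same output reaches the model as an ordinary user turn, which older models tend to parse more reliably.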
docker_image
Type: string | Default: "airecon-sandbox"
The name of the Docker image used as the execution sandbox. Must be built before first use.
docker build -t airecon-sandbox airecon/containers/
If you build with a different tag, update this setting accordingly.
docker_auto_build
Type: bool | Default: true
If true, AIRecon attempts to build the Docker image automatically at startup if it is not found. This can fail in restricted environments. Manual build is more reliable.
command_timeout
Type: float | Default: 900.0 seconds (15 minutes)
Maximum time a single shell command may run inside the Docker container before being killed.
// Quick scans only
"command_timeout": 120.0
// Allow long-running tools (masscan, full nmap, large sqlmap)
"command_timeout": 1800.0
Nuclei, sqlmap, and full nmap scans can easily take > 10 minutes on large target lists. Increase this if commands are being killed prematurely.
proxy_host / proxy_port
Type: string / int | Defaults: "127.0.0.1" / 3000
The host and port for the internal FastAPI server that bridges the TUI and the agent loop via SSE (Server-Sent Events).
Only change these if port 3000 is already in use on your machine:
"proxy_host": "127.0.0.1",
"proxy_port": 3001
agent_exploration_mode
Type: bool | Default: true
Enables the Phase 1 anti-stagnation exploration engine. When active:
- Monitors for stagnation (no new high-confidence evidence after N iterations)
- Boosts temperature to agent_exploration_temperature when stagnation is detected
- Enforces tool diversity via same-tool streak detection
- Injects per-phase exploration directives into the system prompt
agent_exploration_intensity
Type: float (0.0–1.0) | Default: 0.9
How aggressively the exploration engine pushes the agent into new territory when stagnation is detected. Higher values inject stronger directives. 0.9 is tuned for 122B models; reduce to 0.5–0.6 for smaller models.
agent_exploration_temperature
Type: float (0.0–2.0) | Default: 0.35
Temperature used when the agent is in exploration mode (stagnation detected). Higher than ollama_temperature to encourage new approaches without losing control.
agent_stagnation_threshold
Type: int | Default: 2
Number of consecutive iterations with no new high-confidence evidence (≥0.65 confidence) before exploration mode activates.
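The interplay between the stagnation threshold and the two temperatures can be sketched as a small state tracker. The class `StagnationMonitor` and its field names are hypothetical; the 0.65 confidence cut-off and the default values come from this reference.

```python
from dataclasses import dataclass


@dataclass
class StagnationMonitor:
    threshold: int = 2                    # agent_stagnation_threshold
    min_confidence: float = 0.65          # evidence below this does not count
    base_temperature: float = 0.15        # ollama_temperature
    exploration_temperature: float = 0.35 # agent_exploration_temperature
    stagnant_iterations: int = 0

    def record(self, new_evidence_confidences: list[float]) -> None:
        """Update the streak with the confidences of evidence found this iteration."""
        if any(c >= self.min_confidence for c in new_evidence_confidences):
            self.stagnant_iterations = 0
        else:
            self.stagnant_iterations += 1

    @property
    def exploring(self) -> bool:
        return self.stagnant_iterations >= self.threshold

    @property
    def temperature(self) -> float:
        """Temperature to use for the next model call."""
        return self.exploration_temperature if self.exploring else self.base_temperature
```

In this sketch a single high-confidence finding resets the streak and returns the agent to the base temperature.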
agent_tool_diversity_window
Type: int (min 3) | Default: 8
Number of most-recent tool calls tracked for diversity analysis. The agent checks this window for same-tool streaks.
agent_max_same_tool_streak
Type: int | Default: 3
Maximum allowed consecutive uses of the same tool before a diversity warning is injected. Prevents the agent from looping on a single tool.
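A sliding window plus a trailing-streak count is one plausible way to implement this check. The class `DiversityTracker` is hypothetical, and treating the limit as "warn on the call that exceeds it" is an assumption about the exact boundary.

```python
from collections import deque


class DiversityTracker:
    """Keeps the most recent tool calls and flags same-tool streaks."""

    def __init__(self, window: int = 8, max_streak: int = 3):
        self.recent = deque(maxlen=window)  # agent_tool_diversity_window
        self.max_streak = max_streak        # agent_max_same_tool_streak

    def record(self, tool: str) -> bool:
        """Record a call; return True if a diversity warning should be injected."""
        self.recent.append(tool)
        streak = 0
        # Count how many of the latest calls in the window used this tool.
        for name in reversed(self.recent):
            if name != tool:
                break
            streak += 1
        return streak > self.max_streak
```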
allow_destructive_testing
Type: bool | Default: false
When true, modifies the system prompt to authorize destructive/aggressive testing:
- Changes "non-destructive penetration testing" to "UNRESTRICTED DESTRUCTIVE penetration testing"
- Injects a <safety_override> block that lifts rate limiting and politeness constraints and adds aggressive recon directives
- Tightens zero-false-positive enforcement further
Set to false for passive/non-destructive engagements or when working in shared/production environments.
// Production-safe assessment
"allow_destructive_testing": false
// Full offensive engagement (authorized)
"allow_destructive_testing": true
browser_page_load_delay
Type: float | Default: 1.0 seconds
How long to wait after a page navigation before performing browser actions. Increase for slow targets or heavily JavaScript-rendered pages.
// Fast, well-performing targets
"browser_page_load_delay": 0.5
// Slow targets or heavy SPAs (React, Vue, Angular)
"browser_page_load_delay": 3.0
searxng_url
Type: string | Default: "http://localhost:8080"
The URL of your SearXNG instance. If set, the web_search tool uses SearXNG for full Google dork operator support.
// Local SearXNG (default)
"searxng_url": "http://localhost:8080"
// Empty = DuckDuckGo fallback (limited operators, rate-limited)
"searxng_url": ""
AIRecon auto-manages the SearXNG Docker container lifecycle (start on use, stop on exit). To start manually:
docker run -d --name searxng -p 8080:8080 searxng/searxng
searxng_engines
Type: string | Default: "google,bing,duckduckgo,brave,google_news,github,stackoverflow"
Comma-separated list of engines to query via SearXNG.
// Full engine set (slower but broader)
"searxng_engines": "google,bing,duckduckgo,brave,startpage,github,stackoverflow,reddit,google_scholar,google_news"
// Fast subset for quick lookups
"searxng_engines": "google,bing,duckduckgo"
vuln_similarity_threshold
Type: float | Default: 0.7
Jaccard similarity threshold for vulnerability deduplication. When a new vulnerability finding has similarity ≥ this value compared to an existing entry, it is merged rather than added as a duplicate.
| Value | Behavior |
|---|---|
| 0.9 | Only near-identical findings are merged; more duplicates allowed |
| 0.7 | Default; reasonable deduplication for most cases |
| 0.5 | Aggressive deduplication; similar findings merged even if endpoint differs |
| 0.3 | Very aggressive; not recommended |
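To make the threshold concrete, here is a minimal Jaccard-based deduplication sketch. Jaccard similarity is |A ∩ B| / |A ∪ B| over two sets; which features AIRecon compares is not stated, so the whitespace tokenization and the names `jaccard`, `tokens`, and `add_finding` are assumptions, and the merge step is reduced to "skip the new entry".

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tokens(text: str) -> set:
    # Assumption: findings are compared as lowercase whitespace-separated tokens.
    return set(text.lower().split())


def add_finding(findings: list, new: str, threshold: float = 0.7) -> bool:
    """Append `new` unless it is similar enough to an existing finding.

    Returns True if the finding was added, False if it was treated as a duplicate.
    """
    for existing in findings:
        if jaccard(tokens(existing), tokens(new)) >= threshold:
            return False  # would be merged into `existing`
    findings.append(new)
    return True
```

For example, two findings that differ only in the endpoint path share 7 of 9 distinct tokens (similarity of about 0.78), so they merge at the default 0.7 but stay separate at 0.9.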
agent_plan_revision_interval
Type: int | Default: 30
How many iterations between full plan revision checkpoints. At each checkpoint, the agent reviews all findings and updates its exploitation plan.
These control the minimum depth criteria required before a RECON → ANALYSIS phase transition is triggered.
pipeline_recon_min_subdomains
Type: int | Default: 3
Minimum number of subdomains that must be discovered before RECON is considered complete. Prevents premature phase transition if the agent only finds 1–2 subdomains.
pipeline_recon_min_urls
Type: int | Default: 1
Minimum number of URLs collected before RECON → ANALYSIS transition.
pipeline_recon_soft_timeout
Type: int | Default: 30
Maximum RECON iterations before forcing a transition to ANALYSIS regardless of depth criteria. Prevents infinite RECON loops on targets with very limited attack surface.
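Taken together, the three pipeline settings amount to a transition test along these lines. The function name `should_leave_recon` is hypothetical, and requiring both depth criteria at once is an assumption about how they combine.

```python
def should_leave_recon(
    subdomains: int,
    urls: int,
    iterations: int,
    min_subdomains: int = 3,   # pipeline_recon_min_subdomains
    min_urls: int = 1,         # pipeline_recon_min_urls
    soft_timeout: int = 30,    # pipeline_recon_soft_timeout
) -> bool:
    """Decide whether RECON may hand over to ANALYSIS."""
    depth_met = subdomains >= min_subdomains and urls >= min_urls
    timed_out = iterations >= soft_timeout  # forces the transition regardless of depth
    return depth_met or timed_out
```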
Any config key can be overridden without editing the file using environment variables. Format: AIRECON_<KEY_UPPERCASE>.
# Override model
AIRECON_OLLAMA_MODEL=qwen3:32b airecon start
# Override temperature
AIRECON_OLLAMA_TEMPERATURE=0.2 airecon start
# Disable destructive testing
AIRECON_ALLOW_DESTRUCTIVE_TESTING=false airecon start
# Use a different Ollama endpoint
AIRECON_OLLAMA_URL=http://10.0.0.5:11434 airecon start
Type conversion rules:
- bool: accepts true, 1, yes → True | false, 0, no → False
- int/float: standard numeric conversion
- string: used as-is
Environment variables take precedence over the config file. They are applied at startup and do not persist.
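The override and conversion rules can be sketched as follows. This is not AIRecon's loader: the names `convert` and `apply_env_overrides` are hypothetical, inferring the target type from the existing config value is an assumption, and so is the case-insensitive handling of boolean strings.

```python
import os

TRUE_VALUES = {"true", "1", "yes"}
FALSE_VALUES = {"false", "0", "no"}


def convert(raw: str, default):
    """Convert an environment string to the type of the existing config value."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(default, bool):
        lowered = raw.strip().lower()
        if lowered in TRUE_VALUES:
            return True
        if lowered in FALSE_VALUES:
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    if isinstance(default, int):
        return int(raw)
    if isinstance(default, float):
        return float(raw)
    return raw  # strings are used as-is


def apply_env_overrides(config: dict, environ=os.environ, prefix: str = "AIRECON_") -> dict:
    """Return a copy of `config` with AIRECON_<KEY_UPPERCASE> overrides applied."""
    merged = dict(config)
    for key, default in config.items():
        raw = environ.get(prefix + key.upper())
        if raw is not None:
            merged[key] = convert(raw, default)
    return merged
```

Because the function returns a new dict and never writes to disk, the overrides last only for the current process, matching the non-persistent behavior described above.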
{
"ollama_model": "qwen3:30b-a3b",
"ollama_num_ctx": 32768,
"ollama_num_ctx_small": 16384,
"ollama_temperature": 0.15,
"ollama_num_predict": 8192,
"ollama_enable_thinking": true,
"ollama_supports_thinking": true,
"ollama_supports_native_tools": true,
"command_timeout": 600.0,
"agent_max_tool_iterations": 300,
"searxng_url": "http://localhost:8080"
}
Note: qwen3:30b-a3b is a Mixture-of-Experts model: it has fewer active parameters than the full 30B, making it faster and more VRAM-efficient while retaining comparable reasoning quality.
{
"ollama_model": "qwen3:32b",
"ollama_num_ctx": 65536,
"ollama_num_ctx_small": 32768,
"ollama_temperature": 0.15,
"ollama_num_predict": 16384,
"ollama_enable_thinking": true,
"ollama_supports_thinking": true,
"ollama_supports_native_tools": true,
"ollama_keep_alive": "60m",
"command_timeout": 900.0,
"agent_max_tool_iterations": 800,
"searxng_url": "http://localhost:8080"
}
{
"ollama_model": "qwen3.5:122b",
"ollama_num_ctx": 131072,
"ollama_num_ctx_small": 65536,
"ollama_temperature": 0.15,
"ollama_num_predict": 32768,
"ollama_enable_thinking": true,
"ollama_supports_thinking": true,
"ollama_supports_native_tools": true,
"ollama_timeout": 2400.0,
"ollama_keep_alive": "60m",
"command_timeout": 900.0,
"agent_max_tool_iterations": 800,
"searxng_url": "http://localhost:8080"
}
{
"ollama_url": "http://192.168.1.100:11434",
"ollama_model": "qwen3.5:122b",
"ollama_timeout": 2400.0,
"ollama_num_ctx": 131072,
"ollama_num_ctx_small": 65536,
"ollama_temperature": 0.15,
"ollama_enable_thinking": true,
"ollama_supports_thinking": true,
"ollama_supports_native_tools": true,
"ollama_keep_alive": "60m",
"agent_max_tool_iterations": 800,
"searxng_url": "http://localhost:8080"
}
{
"ollama_temperature": 0.15,
"allow_destructive_testing": false,
"deep_recon_autostart": false,
"command_timeout": 300.0,
"agent_max_tool_iterations": 100
}