Releases: JoasASantos/NeuroSploit

NeuroSploit v3.2.2 - Full LLM Pentest Mode

24 Feb 03:29

Full LLM Pentest Mode

A new mode in which the LLM drives the entire penetration-test cycle autonomously, much as a human pentester would with Burp Suite or curl.

How it works

  1. User enters target URL in the Full LLM Pentest page
  2. The LLM receives the full methodology prompt + target
  3. LLM plans HTTP requests (up to 10 per round)
  4. System executes those requests and returns real responses
  5. LLM analyzes responses, identifies vulnerabilities, adapts strategy
  6. Repeat for up to 30 rounds across 4 phases
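The round loop above can be sketched roughly as follows. This is a minimal illustration, not NeuroSploit's actual code: `plan_requests`, `analyze`, and the `llm`/`http` objects are hypothetical stand-ins for whatever interfaces `_run_full_llm_pentest()` really uses.

```python
# Hypothetical sketch of the Full LLM Pentest loop described above.
MAX_ROUNDS = 30
MAX_REQUESTS_PER_ROUND = 10

def run_full_llm_pentest(llm, http, target_url, methodology_prompt):
    findings = []
    context = f"{methodology_prompt}\nTarget: {target_url}"
    for _round in range(MAX_ROUNDS):
        # 1. The LLM plans up to 10 HTTP requests for this round
        planned = llm.plan_requests(context)[:MAX_REQUESTS_PER_ROUND]
        # 2. The system executes them for real and collects responses
        responses = [http.send(req) for req in planned]
        # 3. The LLM analyzes the responses and adapts its strategy
        analysis = llm.analyze(responses)
        findings.extend(analysis.findings)
        context = analysis.updated_context
        if analysis.done:  # e.g. the report phase finished early
            break
    return findings
```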

Phases

  • AI Recon (0-25%) — Technology fingerprinting, endpoint discovery, attack surface mapping
  • AI Testing (25-70%) — SQLi, XSS, LFI, Command Injection, SSRF, CSRF, IDOR, and more
  • Post-Exploitation (70-85%) — Vulnerability chaining, data extraction, privilege escalation
  • Report (85-100%) — Professional pentest report generation

Key Features

  • Anti-hallucination: Findings without real response evidence are automatically rejected
  • Full validation pipeline: All findings go through ValidationJudge (negative controls + proof of execution + confidence scoring)
  • Methodology injection: 118KB comprehensive pentest methodology (OWASP WSTG, PTES) injected into AI context
  • No Kali sandbox required: Uses system HTTP client directly
  • Any LLM provider: Works with Claude, GPT, Gemini, Ollama, LMStudio via SmartRouter

Files Changed

  • backend/core/autonomous_agent.py — New _run_full_llm_pentest() + helpers (+454 lines)
  • backend/core/vuln_engine/ai_prompts.py — 3 new prompt functions (+219 lines)
  • backend/api/v1/agent.py — New FULL_LLM_PENTEST mode
  • frontend/src/pages/FullIATestingPage.tsx — Updated UI for LLM-driven phases

NeuroSploit v3.2.1 - AI-Everywhere Auto Pentest

23 Feb 21:36

NeuroSploit v3.2.1

🤖 AI-Everywhere Auto Pentest

  • Pre-stream AI Master Plan: Strategic AI planning runs before parallel streams, producing target profile, priority vulns, recon guidance, and tool recommendations shared across all 3 streams
  • Stream 1 AI Recon Analysis: AI analyzes discovered endpoints for hidden surfaces, priority routing, and attack chain identification
  • Stream 2 AI Payload Generation: Context-aware AI-generated payloads replace hardcoded 3-payload approach, using master plan context, WAF info, and tech stack
  • Stream 3 AI Tool Analysis: AI classifies raw tool stdout/stderr into real findings vs noise, queues follow-up test endpoints

🧠 LLM-as-VulnEngine: AI Deep Testing

  • New _ai_deep_test() iterative loop: OBSERVE → PLAN → EXECUTE → ANALYZE → ADAPT (3 iterations max)
  • AI-first for top 15 injection types with hardcoded fallback
  • Per-endpoint AI testing with rich context (baseline, WAF, playbook, RAG, memory)
  • Anti-hallucination: all findings through ValidationJudge pipeline
  • Token budget adaptive: 15 calls normal, 5 when <50k tokens remain

🐛 Critical Container Fix

  • Root cause: ENTRYPOINT ["/bin/bash", "-c"] in Dockerfile conflicted with command="sleep infinity" → container exited immediately → all tools showed exit -1, 0.0s, 0 findings
  • Fix: Changed to CMD ["bash"] — all Kali sandbox tools (nuclei, naabu, etc.) now work correctly

🔍 Deep Recon Overhaul

  • JS analysis: 10→30 files, 11 regex patterns, source map (.map) parsing, parameter extraction
  • Sitemaps: recursive index following (depth 3), 8 candidates, 500 URL cap
  • API discovery: 7→20 Swagger/OpenAPI paths, 1→6 GraphQL paths, request body schema extraction
  • 9 framework detectors: WordPress (16 paths), Laravel, Django, Spring Boot, Express, ASP.NET, Rails, Next.js, Flask
  • 40+ hidden/sensitive paths checked (.env, .git, /actuator, /debug, /metrics, etc.)
  • API pattern fuzzing: infers endpoints from discovered patterns (37 common resources × CRUD variants)
  • HTTP method discovery via OPTIONS probing
  • URL normalization and deduplication
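Normalization and deduplication could look like the following sketch (an assumption about the approach, not the shipped code): lowercase scheme and host, drop default ports and trailing slashes, then collapse equivalents.

```python
# Illustrative URL normalization + dedup for deep recon (not NeuroSploit's code).
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Drop explicit default ports so http://x:80/ and http://x/ collapse
    if (scheme, parts.port) in (("http", 80), ("https", 443)):
        netloc = parts.hostname
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))

def dedupe_urls(urls):
    seen, out = set(), []
    for u in urls:
        n = normalize_url(u)
        if n not in seen:
            seen.add(n)
            out.append(n)
    return out
```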

🎨 Frontend Improvements

  • Elapsed time now works for completed scans (computed from started_at → completed_at)
  • Container telemetry: exit -1 shows "ERR" (yellow), duration shows "N/A" on container failure
  • Professional HTML report: cover page, risk gauge, severity breakdown, table of contents, per-finding cards with evidence/PoC/confidence, print-friendly CSS

📊 Stats

  • +4,290 lines across 12 files
  • 4 new AI prompt builders: master_plan, junior_ai_test, tool_analysis, recon_analysis
  • 3 new deep recon methods: framework discovery, API fuzzing, method probing
  • Bug bounty training datasets included

Installation

git clone https://github.com/CyberSecurityUP/NeuroSploit.git
cd NeuroSploit
pip install -r requirements.txt
# Rebuild Kali sandbox image (IMPORTANT for container fix):
docker build -f docker/Dockerfile.kali -t neurosploit-kali:latest docker/

Full Changelog: v3.2...v3.2.1

NeuroSploit v3.0.0

15 Feb 01:15
43d892e

NeuroSploit v3.0.0 — Release Notes

Release Date: February 2026
Codename: Autonomous Pentester
License: MIT


Overview

NeuroSploit v3 is a ground-up overhaul of the AI-powered penetration testing platform. This release transforms the tool from a scanner into an autonomous pentesting agent — capable of reasoning, adapting strategy in real-time, chaining exploits, validating findings with anti-hallucination safeguards, and executing tools inside isolated Kali Linux containers.

By the Numbers

| Metric | Count |
|---|---|
| Vulnerability types supported | 100 |
| Payload libraries | 107 |
| Total payloads | 477+ |
| Kali sandbox tools | 55 |
| Backend core modules | 63 Python files |
| Backend core code | 37,546 lines |
| Autonomous agent | 7,592 lines |
| AI decision prompts | 100 (per-vuln-type) |
| Anti-hallucination prompts | 12 composable templates |
| Proof-of-execution rules | 100 (per-vuln-type) |
| Known CVE signatures | 400 |
| EOL version checks | 19 |
| WAF signatures | 16 |
| WAF bypass techniques | 12 |
| Exploit chain rules | 10+ |
| Frontend pages | 14 |
| API endpoints | 111+ |
| LLM providers supported | 6 |

Architecture

                      +---------------------+
                      |   React/TypeScript   |
                      |     Frontend (14p)   |
                      +----------+----------+
                                 |
                           WebSocket + REST
                                 |
                      +----------v----------+
                      |   FastAPI Backend    |
                      |   14 API routers     |
                      +----------+----------+
                                 |
              +---------+--------+--------+---------+
              |         |        |        |         |
         +----v---+ +---v----+ +v------+ +v------+ +v--------+
         | LLM    | | Vuln   | | Agent | | Kali  | | Report  |
         | Manager| | Engine | | Core  | |Sandbox| | Engine  |
         | 6 provs| | 100typ | |7592 ln| | 55 tl | | 2 fmts  |
         +--------+ +--------+ +-------+ +-------+ +---------+

Stack: Python 3.10+ / FastAPI / SQLAlchemy (async) / React 18 / TypeScript / Tailwind CSS / Vite / Docker


Core Engine: 100 Vulnerability Types

The vulnerability engine covers 100 distinct vulnerability types organized into 10 categories, each with dedicated testers, payloads, AI prompts, and proof-of-execution rules.

Categories & Types

| Category | Types | Examples |
|---|---|---|
| Injection | 12 | SQLi (error, union, blind, time-based), Command Injection, SSTI, NoSQL, LDAP, XPath, Expression Language, HTTP Parameter Pollution |
| XSS | 3 | Reflected, Stored (two-phase form+display), DOM-based |
| Authentication | 7 | Auth Bypass, JWT Manipulation, Session Fixation, Weak Password, Default Credentials, 2FA Bypass, OAuth Misconfig |
| Authorization | 5 | IDOR, BOLA, BFLA, Privilege Escalation, Mass Assignment, Forced Browsing |
| Client-Side | 9 | CORS, Clickjacking, Open Redirect, DOM Clobbering, PostMessage, WebSocket Hijack, Prototype Pollution, CSS Injection, Tabnabbing |
| File Access | 5 | LFI, RFI, Path Traversal, XXE, File Upload |
| Request Forgery | 3 | SSRF, SSRF Cloud (AWS/GCP/Azure metadata), CSRF |
| Infrastructure | 7 | Security Headers, SSL/TLS, HTTP Methods, Directory Listing, Debug Mode, Exposed Admin, Exposed API Docs, Insecure Cookies |
| Advanced | 9 | Race Condition, Business Logic, Rate Limit Bypass, Type Juggling, Timing Attack, Host Header Injection, HTTP Smuggling, Cache Poisoning, CRLF |
| Data Exposure | 6 | Sensitive Data, Information Disclosure, API Key Exposure, Source Code Disclosure, Backup Files, Version Disclosure |
| Cloud & Supply Chain | 6 | S3 Misconfig, Cloud Metadata, Subdomain Takeover, Vulnerable Dependency, Container Escape, Serverless Misconfig |

Injection Routing

Every vulnerability type is routed to the correct injection point:

  • Parameter injection (default): SQLi, XSS, IDOR, SSRF, etc.
  • Header injection: CRLF, Host Header, HTTP Smuggling
  • Body injection: XXE
  • Path injection: Path Traversal, LFI
  • Both (param + path): LFI, directory traversal variants
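A routing table like the above could be represented as a simple mapping with a parameter-injection default. The structure below is an illustration only; the real engine's data structures may differ.

```python
# Hypothetical injection-point routing table (illustrative, not the real code).
INJECTION_POINTS = {
    "sqli": ["param"], "xss_reflected": ["param"], "idor": ["param"], "ssrf": ["param"],
    "crlf": ["header"], "host_header": ["header"], "http_smuggling": ["header"],
    "xxe": ["body"],
    "path_traversal": ["path"],
    "lfi": ["param", "path"],  # tested in both locations
}

def injection_points(vuln_type: str) -> list[str]:
    # Parameter injection is the default for anything not explicitly routed
    return INJECTION_POINTS.get(vuln_type, ["param"])
```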

XSS Pipeline (Reflected)

The reflected XSS engine is a multi-stage pipeline:

  1. Canary probe — unique marker per endpoint+param to detect reflection
  2. Context analysis — 8 contexts: html_body, attribute_value, script_string, script_block, html_comment, url_context, style_context, event_handler
  3. Filter detection — batch probe to map allowed/blocked chars, tags, events
  4. AI payload generation — LLM generates context-aware bypass payloads
  5. Escalation payloads — WAF/encoding bypass variants
  6. Testing — up to 30 payloads per param with per-payload dedup
  7. Browser validation — Playwright popup/cookie/DOM/event verification (optional)
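Stages 1 and 6 of the pipeline can be sketched as follows, assuming a `send(endpoint, param, value) -> body` callable; all names here are illustrative, not the engine's real API.

```python
# Sketch of the canary probe (stage 1) and per-payload dedup (stage 6).
import secrets

def make_canary(endpoint: str, param: str) -> str:
    # Unique marker per endpoint+param so any reflection is attributable
    return f"nsx-{secrets.token_hex(4)}"

def probe_reflected_xss(send, endpoint, param, payloads, max_payloads=30):
    canary = make_canary(endpoint, param)
    if canary not in send(endpoint, param, canary):
        return []  # no reflection at all: skip this endpoint+param entirely
    hits, tried = [], set()
    for payload in payloads[:max_payloads]:
        if payload in tried:  # per-payload dedup
            continue
        tried.add(payload)
        if payload in send(endpoint, param, payload):
            hits.append(payload)
    return hits
```

In the full pipeline the payload list would come from context analysis and AI generation rather than being passed in directly.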

POST Form Support

  • HTML forms detected during recon with method, action, all input fields (including <select>, <textarea>, hidden fields)
  • POST form testing includes all form fields (CSRF tokens, hidden inputs) — not just the parameter under test
  • Redirect following for POST responses (search forms that redirect to results)
  • Full HTTP method support: GET, POST, PUT, DELETE, PATCH, OPTIONS, HEAD

Autonomous Agent Architecture

3-Stream Parallel Auto-Pentest

The agent runs 3 concurrent streams via asyncio.gather():

Stream 1: Recon          Stream 2: Junior Tester      Stream 3: Tool Runner
  - Crawl target           - Immediate target test       - Nuclei + Naabu
  - Extract forms           - Consume endpoint queue      - AI-selected tools
  - JS analysis             - 3 payloads/endpoint         - Dynamic install
  - Deep fingerprint        - AI-prioritized types        - Process findings
  - Push to queue           - Skip tested types           - Feed back to recon
        |                         |                             |
        +----------+--------------+-----------------------------+
                   |
            Deep Analysis (50-75%)
            Researcher AI (75%)    ← NEW
            Finalization (75-100%)
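The orchestration shape of the diagram above can be sketched with `asyncio.gather()` and a shared queue; the stream bodies below are placeholders standing in for the real crawl, payload, and tool logic.

```python
# Shape of the 3-stream parallel auto-pentest (placeholder stream bodies).
import asyncio

async def run_auto_pentest(target: str) -> dict:
    queue: asyncio.Queue = asyncio.Queue()  # recon pushes, junior tester consumes

    async def recon_stream():
        await queue.put(f"{target}/login")  # stand-in for crawl/forms/JS analysis
        await queue.put(None)               # sentinel: recon finished

    async def junior_tester_stream():
        tested = []
        while (endpoint := await queue.get()) is not None:
            tested.append(endpoint)         # stand-in for payload testing
        return tested

    async def tool_runner_stream():
        return ["nuclei", "naabu"]          # stand-in for tool execution

    _, tested, tools = await asyncio.gather(
        recon_stream(), junior_tester_stream(), tool_runner_stream()
    )
    return {"tested": tested, "tools": tools}
```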

Reasoning Engine (ReACT)

AI reasoning at strategic checkpoints (50%, 75%):

  • Think: analyze situation, available data, findings so far
  • Plan: recommend next actions, prioritize vuln types
  • Reflect: evaluate results, adjust strategy

Token budget tracking with graceful degradation:

  • 0-60% budget: full AI (reasoning + verification + enhancement)
  • 60-80%: reduced (skip enhancement)
  • 80-95%: minimal (verification only)
  • 95%+: technical only (no AI calls)
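The tiers above map naturally to a capability-set function. The thresholds come from the notes; the function itself is an illustrative assumption.

```python
# Budget-tiered AI degradation policy (thresholds per the release notes).
def ai_capabilities(budget_used_pct: float) -> set[str]:
    if budget_used_pct < 60:
        return {"reasoning", "verification", "enhancement"}  # full AI
    if budget_used_pct < 80:
        return {"reasoning", "verification"}                 # skip enhancement
    if budget_used_pct < 95:
        return {"verification"}                              # minimal
    return set()                                             # technical only
```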

Strategy Adaptation

  • Dead endpoint detection: skip after 5+ consecutive errors
  • Diminishing returns: reduce testing on low-yield endpoints
  • Priority recomputation: re-rank vuln types based on results
  • Pattern propagation: IDOR on /users/1 automatically queues /orders/1, /accounts/1
  • Checkpoint refinement: at 30%/60%/90% refine attack strategy
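Two of these adaptations — dead-endpoint detection and IDOR pattern propagation — could look like the sketch below; the helper names are hypothetical, not from the NeuroSploit codebase.

```python
# Illustrative strategy-adaptation helpers (names are hypothetical).
import re

DEAD_AFTER = 5  # consecutive errors before an endpoint is skipped

def is_dead(consecutive_errors: int) -> bool:
    return consecutive_errors >= DEAD_AFTER

def propagate_idor_pattern(found_url: str, known_paths: list[str]) -> list[str]:
    # IDOR on /users/1 queues sibling numeric-ID resources like /orders/1
    m = re.search(r"/(\w+)/(\d+)$", found_url)
    if not m:
        return []
    obj_id = m.group(2)
    return [p for p in known_paths
            if re.search(rf"/\w+/{obj_id}$", p) and p != found_url]
```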

Exploit Chaining

10+ chain rules for multi-step attack paths:

  • SSRF -> Internal service access -> Data extraction
  • SQLi -> Database-specific escalation (MySQL, PostgreSQL, MSSQL)
  • XSS -> Session hijacking -> Account takeover
  • LFI -> Source code disclosure -> Credential extraction
  • Auth bypass -> Privilege escalation -> Admin access

AI-driven chain discovery during finalization phase.


Validation & Anti-Hallucination Pipeline

4-Layer Verification

Every finding passes through 4 independent verification layers before confirmation:

Finding Signal
    |
    v
[1] Negative Controls  — Send benign/empty probes. Same response = false positive (-60 penalty)
    |
    v
[2] Proof of Execution — Per-vuln-type proof checks (25+ methods). XSS: context analyzer.
    |                      SSRF: metadata markers. SQLi: DB error patterns. Score 0-60.
    v
[3] AI Interpretation  — LLM analyzes with anti-hallucination system prompt + per-type
    |                      proof requirements. Speculative language rejected.
    v
[4] Confidence Scorer  — Numeric 0-100 score. >=90 confirmed, >=60 likely, <60 rejected.
    |
    v
ValidationJudge (sole authority for finding approval)
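A toy aggregation of the four layers might look like this. The penalty (-60), proof range (0-60), and thresholds (>=90 confirmed, >=60 likely) come from the notes above; how the scores actually combine inside ValidationJudge is an assumption.

```python
# Toy ValidationJudge-style scorer (aggregation is an assumption).
def judge(negative_control_same: bool, proof_score: int, ai_confidence: int) -> str:
    score = min(proof_score, 60) + min(ai_confidence, 40)
    if negative_control_same:
        score -= 60  # benign probe produced the same response: likely FP
    if score >= 90:
        return "confirmed"
    if score >= 60:
        return "likely"
    return "rejected"
```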

Anti-Hallucination System Prompts

12 composable anti-hallucination prompt templates injected into all 17 LLM call sites:

| Prompt | Purpose |
|---|---|
| anti_hallucination | Core: never claim vuln without concrete proof |
| anti_scanner | Don't behave like a scanner — reason like a pentester |
| negative_controls | Explain control test methodology |
| think_like_pentester | Manual testing mindset |
| proof_of_execution | What constitutes real proof per vuln type |
| frontend_backend_correlation | Don't confuse client-side vs server-side |
| multi_phase_tests | Two-phase testing (submit + verify) |
| final_judgment | Conservative final decision framework |
| confidence_score | Numeric scoring calibration |
| anti_severity_inflation | Don't inflate severity |
| operational_humility | Acknowledge uncertainty |
| access_control_intelligence | Data comparison, not status code diff |

100 per-vuln-type proof requirements (e.g., SSRF requires metadata content, not just status diff).

Cross-Validation

  • _cross_validate_ai_claim() — independent check for XSS, SQLi, SSRF, IDOR, open redirect, CRLF, XXE, NoSQL
  • _evidence_in_response() — verify AI claim matches actual HTTP response
  • Speculative language rejection ("might be", "could be", "possibly")
  • Default False — findings rejected unless positively proven
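The default-False gate plus speculative-language rejection can be sketched as below; the phrase list is an example, not the shipped one.

```python
# Sketch of the speculative-language filter with a default-False verdict.
import re

SPECULATIVE = re.compile(r"\b(might be|could be|possibly|appears to|may be)\b", re.I)

def accept_claim(ai_verdict: str, evidence_in_response: bool) -> bool:
    # Default False: a claim survives only with evidence and no hedging
    if SPECULATIVE.search(ai_verdict):
        return False
    return evidence_in_response
```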

Access Control Intelligence

  • BOLA/BFLA/IDOR use ...

NeuroSploitv2 - v1.2.0

14 Jan 19:23
5e73003

📘 Summary of Changes

The README has been updated with the following improvements and additions compared to the previous version (v2.2):

🆕 New or Expanded Sections

  • Adaptive AI Mode described with more detail in workflow and features.
  • 3 Execution Modes (CLI, Interactive, Experience/Wizard) clearly outlined with examples.
  • Consolidated Recon & Context-Based Analysis sections expanded, explaining how reconnaissance outputs are merged and reused without redundant tool runs.
  • LLM Providers & Profiles documentation expanded — listing support for multiple providers and how profiles are configured.
  • Agent Roles section expanded with examples of built-in roles and custom agent creation steps.

🛠 Improvements in Documentation

  • Installation instructions were clarified, with prerequisites, environment setup, and example commands.
  • Quick Start examples now include recommended workflows (Wizard, Two-Step Workflow, Interactive).
  • Detailed CLI Reference section was refined, showing flags, options, and usage patterns.
  • Reconnaissance & Tool Usage details improved with descriptions of included tools and execution.
  • Output Files & Reporting explained with output types (JSON, context, HTML), including report features like charts and summaries.

📜 Structural & Content Enhancements

  • Expanded Workflow Diagrams and Examples to guide users through typical recon → AI analysis → reporting flows.
  • Added Security Notice and responsible usage guidance in README to emphasize authorized testing only.
  • More comprehensive Architecture Overview listing directory structure and key components.

✨ Key Improvements

  • Improved adaptive intelligence descriptions to clarify how NeuroSploit decides when to run tools vs. AI analysis.
  • Documentation now includes more agent examples and explains how to customize capabilities via prompts.
  • Overall documentation flow has been made more user-friendly for both beginners and advanced users.

🐛 Bug & Docs Fixes

  • Fixed typos and improved consistency in command examples across sections.
  • Resolved ambiguities in installation steps and environment variable guidance.

NeuroSploitv2 - v1.1.0

12 Jan 12:05
866bb45

🚀 NeuroSploitv2 - v1.1.0

This release introduces NeuroSploitv2, an AI-powered penetration testing framework designed to automate and enhance offensive security operations using specialized agent roles and flexible large language model integration. The project focuses on combining structured automation, AI-assisted reasoning, and real-world security tooling while maintaining strong ethical guardrails and operational safety principles.

✨ Key Features

  • Modular AI agent roles for Red Team, Blue Team, Bug Bounty, Malware Analysis, and more
  • Support for multiple LLM providers (Gemini, Claude, GPT, Ollama, LM Studio) with per-agent profiles
  • Markdown-based prompt system enabling contextual and role-specific AI behavior
  • Hallucination mitigation strategies, guardrails, and safety checks
  • Tool chaining for complex reconnaissance and attack workflows

🧠 AI & Automation Capabilities

  • Granular LLM profiles with control over model, temperature, token limits, caching, and context
  • Agent-based permission system defining allowed tools per role
  • Interactive CLI mode and direct command-line execution
  • AI-assisted planning, analysis, and reporting

🛠️ Built-in Tooling

  • Reconnaissance modules (OSINT collection, subdomain discovery, DNS enumeration)
  • Lateral movement helpers (SMB and SSH)
  • Persistence modules for Linux (cron) and Windows (registry)
  • Secure execution of external tools such as Nmap, Metasploit, Subfinder, Nuclei, SQLMap, and others

📊 Output & Reporting

  • Structured JSON campaign results
  • Automatically generated, human-readable HTML reports
  • Detailed logging and error handling

NeuroSploitv2

03 Jan 03:39
411627a

🚀 NeuroSploitv2 - v1.0.0

This release introduces NeuroSploitv2, an AI-powered penetration testing framework designed to automate and enhance offensive security operations using specialized agent roles and flexible large language model integration. The project focuses on combining structured automation, AI-assisted reasoning, and real-world security tooling while maintaining strong ethical guardrails and operational safety principles.

✨ Key Features

  • Modular AI agent roles for Red Team, Blue Team, Bug Bounty, Malware Analysis, and more
  • Support for multiple LLM providers (Gemini, Claude, GPT, Ollama, LM Studio) with per-agent profiles
  • Markdown-based prompt system enabling contextual and role-specific AI behavior
  • Hallucination mitigation strategies, guardrails, and safety checks
  • Tool chaining for complex reconnaissance and attack workflows

🧠 AI & Automation Capabilities

  • Granular LLM profiles with control over model, temperature, token limits, caching, and context
  • Agent-based permission system defining allowed tools per role
  • Interactive CLI mode and direct command-line execution
  • AI-assisted planning, analysis, and reporting

🛠️ Built-in Tooling

  • Reconnaissance modules (OSINT collection, subdomain discovery, DNS enumeration)
  • Lateral movement helpers (SMB and SSH)
  • Persistence modules for Linux (cron) and Windows (registry)
  • Secure execution of external tools such as Nmap, Metasploit, Subfinder, Nuclei, SQLMap, and others

📊 Output & Reporting

  • Structured JSON campaign results
  • Automatically generated, human-readable HTML reports
  • Detailed logging and error handling