AI-Driven Vulnerability Scanner & Remediation System

Project Overview

This repository contains an AI-driven vulnerability scanning and remediation system. Given a user-provided IP address, the system performs:

Network Scanning: Uses Nmap to discover open TCP ports and service versions.
Vulnerability Matching: Utilizes SBERT embeddings to semantically match discovered services with relevant CVE entries (2023–2025) from the NVD.
Severity Classification: Applies a RandomForest classifier (trained on CVE descriptions and CVSS scores) to categorize each CVE into Low, Medium, High, or Critical.
Remediation Generation: Fine-tunes a T5 model to generate customized remediation steps for each CVE description.
Web Interface: Provides a browser-based UI (HTML/CSS/JavaScript) that allows real-time scanning and remediation lookups.

Features

Real-Time Scanning: Enter an IP address and instantly scan for open services using Nmap.
CVE Prioritization: For each service, retrieve top-3 matching CVEs based on semantic similarity.
Severity Ranking: Classify CVEs by severity to help prioritize remediation.
AI-Generated Fixes: Generate human-readable remediation instructions using a fine-tuned T5 model.
User-Friendly UI: Interactive web interface with progress spinners and result tables.

Architecture

Frontend (Browser):
- HTML/CSS for layout and styling.
- JavaScript (app.js) to call backend APIs and render results.
- Spinner to indicate scan progress.
Backend (FastAPI):
- /api/scan: Receives { ip }, runs Nmap via subprocess, parses output, computes SBERT similarity against CVE embeddings, returns list of services with top-3 CVEs.
- /api/fix: Receives { description }, computes SBERT similarity against CVE descriptions, retrieves top-5 CVEs, generates T5 remediation.
- Models and embeddings are loaded at startup (@app.on_event("startup")).
Data & Models:
- CVE Dataset: data/processed/cve_full_dataset.csv containing CVE_ID, description, CVSS_score, attack_vector, severity, remediation_steps (for training).
- SBERT Embeddings: Precomputed 384-dim vectors for 80k CVE descriptions.
- RandomForest Classifier: Trained on SBERT embeddings to predict severity.
- T5 Remediation Model: Fine-tuned T5-small with safetensors under models/checkpoint-53500.

Setup & Installation

Prerequisites

Operating System: Windows 10/11, macOS, or Linux
Python: 3.10.x
Nmap: Latest version (v7.95+), ensure nmap or nmap.exe is on your PATH.
Git: For cloning the repository.

Clone & Install Dependencies

# Clone the repository
git clone https://github.com/YourUserName/ai-vuln-scanner.git
cd ai-vuln-scanner

# Create a Python virtual environment (optional but recommended)
python -m venv venv
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install required Python packages
pip install -r requirements.txt

Contents of requirements.txt:

fastapi
uvicorn
pandas
joblib
torch
sentence-transformers
transformers
scikit-learn
python-nmap
safetensors

Environment Setup

No additional environment variables are required. Ensure Nmap is installed and accessible:

nmap --version

Usage

Running the Server

From the project root:

python src/api/app.py

You will see console logs indicating:

1) Reading CSV…
2) Loading classifier & SBERT…
3) Computing SBERT embeddings…
4) Loading T5 tokenizer & remediation model…
5) Initializing Nmap scanner…
✅ Startup complete; service is ready.
INFO: Uvicorn running on http://0.0.0.0:5000

Web Interface

Open your browser and navigate to:

http://127.0.0.1:5000

You should see the AI-Driven Vulnerability Scanner UI.

Scan Workflow

Enter IP Address: In the input box labeled “Enter IP to scan”, type an IPv4 address (e.g., scanme.nmap.org, 127.0.0.1, or a local network IP).
Click “Scan”: A circular spinner appears, showing progress.
View Results: After Nmap completes (first 1000 ports, version detection), a table appears:

Port Service Top CVEs

22 ssh OpenSSH_7.4 CVE-2023-1234 (HIGH)
CVE-2023-2345 (MEDIUM) …

80 http Apache/2.4.29 CVE-2023-3456 (CRITICAL)
…

Remediation Workflow

Paste a CVE Description: In the “Get CVE Fixes” textarea, paste any CVE description (e.g., “Apache HTTP Server versions 2.4.29 …”).
Click “Get Fixes”: The spinner appears briefly as the T5 model generates fixes.
View Fix: A list appears with up to 5 CVE IDs, their severity, and the AI-generated remediation text:
- CVE-2023-3456 (CRITICAL): “Update Apache to 2.4.41 or later …”

Project Structure

ai-vuln-scanner/
├─ data/
│  └─ processed/
│      └─ cve_full_dataset.csv        # CSV used for training & lookup
├─ models/
│  ├─ checkpoint-53500/               # Fine-tuned T5 checkpoint & tokenizer
│  └─ severity_classifier.pkl         # RandomForest classifier
├─ src/
│  ├─ api/
│  │   └─ app.py                      # FastAPI application
│  ├─ integration/
│  │   └─ scan_and_map.py             # (Optional) scanner/CVE-mapping logic
│  ├─ remediation/
│  │   └─ retrieve_remediations.py    # Scripts to build dataset (not used at runtime)
│  ├─ training/
│  │   └─ train_models.py             # Model training scripts
│  └─ …other modules…
├─ frontend/
│  ├─ index.html                      # Web UI
│  ├─ styles.css                      # Spinner & layout styling
│  └─ app.js                          # JavaScript for scan & fix
├─ .gitignore                         # Excludes models, caches, etc.
├─ README.md                          # ← You’re reading this
└─ requirements.txt                   # Python dependencies

Model Training

If you need to retrain or update the models:

Ensure cve_full_dataset.csv is present: Contains CVE_ID, Description, CVSS, Severity, Remediation_Steps, etc.
Run Training Script:
```
python src/training/train_models.py
```
- Trains a RandomForest severity classifier (models/severity_classifier.pkl).
- Fine-tunes T5-small for remediation, saving to models/checkpoint-53500/.

Troubleshooting

Nmap not found:
- Ensure you installed Nmap from https://nmap.org/download.html and added it to PATH.
- Running nmap --version should print version info.
Port Scan Returns Empty:
- Check network/firewall settings.
- Try scanning a known host (e.g., 127.0.0.1 or scanme.nmap.org).
Spinner Stuck at 0%:
- Confirm app.js loaded appears in your browser’s console.
- Use Opera GX’s DevTools (Ctrl+Shift+I) → Console, Network to inspect /api/scan requests.
Model Loading Errors:
- Ensure models/checkpoint-53500/ contains config.json, spiece.model, pytorch_model.bin or model.safetensors.
- If you see a pickle/safetensors error, update app.py to use from_pretrained(..., local_files_only=True).

Future Improvements

Precompute & Cache Service Embeddings: Speed up per-scan CVE lookup.
Expand CVE Data Sources: Include ExploitDB, GitHub Security Advisories.
Enhance UI: Add pagination, filters (by severity), and CSV export.
Dockerize: Provide a Dockerfile for containerized deployment.
Risk Scoring: Incorporate asset context and business criticality into prioritization.

References

National Vulnerability Database (NVD): CVE JSON Feeds – https://nvd.nist.gov/feeds/json/cve/1.1/
Reimers & Gurevych (2019): Sentence-BERT – https://arxiv.org/abs/1908.10084
Raffel et al. (2020): Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer – https://arxiv.org/abs/1910.10683
python-nmap: Python wrapper for Nmap – https://pypi.org/project/python-nmap/
Hugging Face Transformers – https://huggingface.co/
Nmap Reference Guide – https://nmap.org/book/man.html

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
data		data
models		models
reports		reports
scan_results		scan_results
src		src
templates		templates
.gitignore		.gitignore
README.md		README.md
app.py		app.py
check_dependencies.py		check_dependencies.py
integration.py		integration.py
network_scanner.py		network_scanner.py
report_generator.py		report_generator.py
requirements.txt		requirements.txt
run.bat		run.bat
setup_models.py		setup_models.py
test_direct_nmap.py		test_direct_nmap.py
test_nmap.py		test_nmap.py
vulnerability_scanner.py		vulnerability_scanner.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

AI-Driven Vulnerability Scanner & Remediation System

Table of Contents

Project Overview

Features

Architecture

Setup & Installation

Prerequisites

Clone & Install Dependencies

Environment Setup

Usage

Running the Server

Web Interface

Scan Workflow

Remediation Workflow

Project Structure

Model Training

Troubleshooting

Future Improvements

References

About

Uh oh!

Releases

Packages

Languages

Port	Service	Top CVEs
22	ssh OpenSSH_7.4	CVE-2023-1234 (HIGH) CVE-2023-2345 (MEDIUM) …
80	http Apache/2.4.29	CVE-2023-3456 (CRITICAL) …

addcontent/AI-Vulverability-Scanner

Folders and files

Latest commit

History

Repository files navigation

AI-Driven Vulnerability Scanner & Remediation System

Table of Contents

Project Overview

Features

Architecture

Setup & Installation

Prerequisites

Clone & Install Dependencies

Environment Setup

Usage

Running the Server

Web Interface

Scan Workflow

Remediation Workflow

Project Structure

Model Training

Troubleshooting

Future Improvements

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages