AI-Driven Defense and Monitoring Platform for VNC Data Exfiltration
SentinelVNC detects and contains data exfiltration attacks in VNC sessions through hybrid rule-based and ML detection, with blockchain-anchored forensic evidence.
SentinelVNC monitors VNC (Virtual Network Computing) sessions for:
- Clipboard Abuse: Large clipboard operations indicating data exfiltration
- Screenshot Scraping: Rapid screenshot capture patterns
- File Exfiltration: Unusual file transfer activities
The system uses a hybrid approach combining:
- Rule-based detection (3 core rules with low false-positive rates)
- ML-based anomaly detection (RandomForest with SHAP explainability)
- Blockchain anchoring (Merkle tree-based forensic evidence)
- Python 3.10+ (3.11 preferred)
- Linux/macOS (tested on macOS, should work on Linux)
- 2GB+ RAM
- Internet connection (for initial package installation)
# Clone or navigate to the repository
cd /path/to/SentinelVNC
# Create virtual environment
python3 -m venv venv # or python3.11 if available
# Activate virtual environment
source venv/bin/activate # On macOS/Linux
# OR
venv\Scripts\activate # On Windows
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Create necessary directories
mkdir -p data/synthetic models logs forensic anchors

# Make script executable
chmod +x run_demo.sh
# Run the demo (trains model, simulates attacks, detects, anchors, launches dashboard)
./run_demo.sh

The script will:
- Train the ML model (if not already trained)
- Clear old simulation data
- Generate synthetic attack events
- Run the detector to identify threats
- Create blockchain anchors from forensic evidence
- Launch the Streamlit dashboard
SentinelVNC/
├── attack_simulator.py # Generates synthetic VNC attack events
├── detector.py # Hybrid rule-based + ML detection engine
├── train_model.py # ML model training with SHAP
├── streamlit_app.py # Real-time monitoring dashboard
├── merkle_anchor.py # Blockchain anchoring (Merkle tree)
├── run_demo.sh # End-to-end demo orchestration
├── requirements.txt # Python dependencies
├── README.md # This file
├── DEMO_SCRIPT.md # Demo presentation script
├── SLIDES.md # 6-slide presentation outline
├── FAQ.md # FAQ for judges
├── DEVELOPMENT_PLAN.md # Development plan
├── data/
│ └── synthetic/ # Generated attack events
├── models/ # Trained ML models
├── logs/ # Detection alerts
├── forensic/ # Forensic JSON records
└── anchors/ # Blockchain anchor files
Generates synthetic VNC events to simulate attacks:
Scenarios:
- normal: Normal user activity
- clipboard_abuse: Large clipboard operations
- screenshot_scraping: Rapid screenshot capture
- file_exfiltration: Large file transfers
- mixed: Combination of all attacks
Usage:
python attack_simulator.py
Output: data/synthetic/vnc_events.jsonl (JSONL format, one event per line)
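Because the simulator writes one JSON object per line, its output can be inspected with a few lines of standard-library Python. The following is a minimal sketch, not the project's own loader: the helper names (`parse_jsonl`, `load_events`, `count_by_type`) and the `type` field are assumptions, since the event schema is not documented here.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one decoded event per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


def load_events(path: Path) -> Iterator[Dict]:
    """Stream events from a JSONL file without loading it all into memory."""
    with path.open("r", encoding="utf-8") as handle:
        yield from parse_jsonl(handle)


def count_by_type(events: Iterable[Dict]) -> Counter:
    # "type" is an assumed field name; adjust it to the simulator's real schema.
    return Counter(event.get("type", "unknown") for event in events)
```

For example, `count_by_type(load_events(Path("data/synthetic/vnc_events.jsonl")))` would give a quick per-type tally of a generated run, assuming the events carry a `type` field.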
Hybrid detection engine with 3 core rules:
Rule 1: Clipboard Size Threshold
- Alerts if clipboard operation > 200KB
- Reason: Large clipboard operations indicate bulk data exfiltration
Rule 2: Screenshot Burst
- Alerts if 5+ screenshots within 10 seconds
- Reason: Rapid screenshot capture suggests scraping
Rule 3: File Transfer
- Alerts if file > 50MB OR 2+ large files within 30 seconds
- Reason: Unusual file transfer patterns
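The three rules above can be sketched as a small stateful checker. This is an illustration under stated assumptions, not the code in detector.py: the `RuleEngine` class, the event fields (`type`, `size`, `timestamp`), and the reason labels are hypothetical, KB/MB are read as binary units, and the 10MB cut-off for a "large" file is a placeholder because the README does not define it.

```python
from collections import deque
from typing import Deque, Dict, List

CLIPBOARD_LIMIT_BYTES = 200 * 1024       # Rule 1: clipboard operation > 200KB
SCREENSHOT_BURST_COUNT = 5               # Rule 2: 5+ screenshots ...
SCREENSHOT_WINDOW_SECONDS = 10.0         # ... within 10 seconds
FILE_LIMIT_BYTES = 50 * 1024 * 1024      # Rule 3: single file > 50MB
FILE_BURST_COUNT = 2                     # Rule 3: or 2+ large files ...
FILE_WINDOW_SECONDS = 30.0               # ... within 30 seconds
LARGE_FILE_BYTES = 10 * 1024 * 1024      # ASSUMPTION: "large" is not defined in the README


class RuleEngine:
    """Keeps sliding windows of recent screenshots and large file transfers."""

    def __init__(self) -> None:
        self._screenshots: Deque[float] = deque()
        self._large_files: Deque[float] = deque()

    @staticmethod
    def _prune(times: Deque[float], now: float, window: float) -> None:
        # Drop timestamps that have fallen out of the sliding window.
        while times and now - times[0] > window:
            times.popleft()

    def check(self, event: Dict) -> List[str]:
        """Return the labels of every rule the event triggers (possibly none)."""
        kind = event["type"]
        size = event.get("size", 0)
        now = event["timestamp"]
        reasons: List[str] = []
        if kind == "clipboard" and size > CLIPBOARD_LIMIT_BYTES:
            reasons.append("clipboard_size")
        elif kind == "screenshot":
            self._screenshots.append(now)
            self._prune(self._screenshots, now, SCREENSHOT_WINDOW_SECONDS)
            if len(self._screenshots) >= SCREENSHOT_BURST_COUNT:
                reasons.append("screenshot_burst")
        elif kind == "file_transfer":
            if size > FILE_LIMIT_BYTES:
                reasons.append("file_size")
            if size >= LARGE_FILE_BYTES:
                self._large_files.append(now)
                self._prune(self._large_files, now, FILE_WINDOW_SECONDS)
                if len(self._large_files) >= FILE_BURST_COUNT:
                    reasons.append("file_burst")
        return reasons
```

Returning a list of reason labels rather than a boolean keeps each alert explainable: the caller can record exactly which rule fired.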
ML Detection:
- Uses trained RandomForest model
- Anomaly score threshold: 0.5
- Features: event type, sizes, temporal patterns, history
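One plausible way to combine the two detectors is to raise an alert when either one fires and to record which source was responsible. The sketch below assumes exactly that; the README does not state how detector.py merges rule hits with the ML score, whether the 0.5 threshold is inclusive, or what the result type looks like, so `Verdict` and `combine` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List

ANOMALY_THRESHOLD = 0.5  # threshold quoted in the README


@dataclass
class Verdict:
    alert: bool
    sources: List[str] = field(default_factory=list)


def combine(rule_reasons: List[str], anomaly_score: float,
            threshold: float = ANOMALY_THRESHOLD) -> Verdict:
    """Alert if any rule fired OR the ML score reaches the threshold.

    ASSUMPTION: logical OR and an inclusive threshold; the real
    detector may combine the two signals differently.
    """
    sources: List[str] = []
    if rule_reasons:
        sources.append("rule")
    if anomaly_score >= threshold:
        sources.append("ml")
    return Verdict(alert=bool(sources), sources=sources)
```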
Usage:
python detector.py
Output:
- logs/alerts.jsonl: All detected alerts
- forensic/*.json: Forensic records for each alert
Trains a lightweight RandomForest classifier:
Features:
- Event type encoding (clipboard/screenshot/file_transfer)
- Size features (normalized)
- Temporal features (time of day)
- History features (recent activity counts)
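The four feature groups above could be turned into a numeric vector along the following lines. This is a sketch under assumptions rather than the encoding used by train_model.py: the event fields, the log scaling used as "normalization", the UTC hour, and the 60-second history window are all illustrative choices.

```python
import math
from datetime import datetime, timezone
from typing import Dict, List, Sequence

EVENT_TYPES = ("clipboard", "screenshot", "file_transfer")
HISTORY_WINDOW_SECONDS = 60.0  # ASSUMPTION: window length is not given in the README


def build_features(event: Dict, recent_timestamps: Sequence[float]) -> List[float]:
    """Encode one event as [one-hot type (3), size, time of day, recent count]."""
    # Event type encoding: one-hot over the three known event types.
    one_hot = [1.0 if event["type"] == name else 0.0 for name in EVENT_TYPES]
    # Size feature: log scaling keeps multi-megabyte transfers from dominating.
    size_feature = math.log1p(event.get("size", 0))
    # Temporal feature: hour of day (UTC) scaled to [0, 1].
    hour = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc).hour
    time_of_day = hour / 23.0
    # History feature: number of earlier events inside the recent window.
    recent = sum(
        1 for t in recent_timestamps
        if 0 <= event["timestamp"] - t <= HISTORY_WINDOW_SECONDS
    )
    return one_hot + [size_feature, time_of_day, float(recent)]
```

A list of such vectors, paired with attack/normal labels, is the shape of input a scikit-learn classifier expects for training.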
Explainability:
- SHAP values for feature importance
- Feature importance rankings
- Saved to models/shap_data.json
Usage:
python train_model.py
Output:
- models/detection_model.pkl: Trained model
- models/model_metadata.json: Model metadata
- models/shap_data.json: SHAP explainability data
Real-time monitoring dashboard with:
- Live alerts feed
- Detection analysis (charts and statistics)
- Forensic timeline
- Blockchain anchors viewer
- Containment button (simulated)
Usage:
streamlit run streamlit_app.py
Access: Dashboard opens at http://localhost:8501
Creates Merkle tree from forensic events:
Process:
- Collects all forensic JSON files
- Computes SHA-256 hash of each file
- Builds Merkle tree
- Generates root hash
- Signs anchor with signature hash
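The hashing and tree-building steps above can be sketched with `hashlib` alone. This is a generic Merkle root computation, not necessarily the construction used in merkle_anchor.py: duplicating the last node on odd-sized levels and concatenating hex digests before hashing are assumptions, and the function names are hypothetical. The signing step is not covered.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def hash_files(directory: Path) -> List[str]:
    """SHA-256 of every forensic JSON file, in sorted order for determinism."""
    return [sha256_hex(path.read_bytes()) for path in sorted(directory.glob("*.json"))]


def merkle_root(leaf_hashes: List[str]) -> str:
    """Fold leaf hashes pairwise until a single root hash remains."""
    if not leaf_hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # ASSUMPTION: odd levels are padded by repeating the last node.
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("ascii"))
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Changing a single byte in any forensic file changes its leaf hash and therefore the root, which is what lets a stored anchor reveal later modification. Sorting the file list matters because the root depends on leaf order.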
Usage:
python merkle_anchor.py
Output: anchors/*.json (anchor metadata with Merkle root)
Verification:
from pathlib import Path

from merkle_anchor import ForensicAnchoring

anchorer = ForensicAnchoring()
anchorer.verify_anchor(Path("anchors/ANCHOR_123.json"))

python attack_simulator.py
# Check: data/synthetic/vnc_events.jsonl

# First generate events
python attack_simulator.py
# Then run detector
python detector.py
# Check: logs/alerts.jsonl, forensic/*.json

python train_model.py
# Check: models/detection_model.pkl

# First generate alerts (creates forensic files)
python attack_simulator.py
python detector.py
# Then create anchor
python merkle_anchor.py
# Check: anchors/*.json

- Setup (30 seconds)
  - Show project structure
  - Explain hybrid detection approach
- Attack Simulation (20 seconds)
  - Run attack_simulator.py with mixed scenario
  - Show generated events
- Detection (30 seconds)
  - Run detector.py
  - Show alerts with explainable reasons
  - Highlight rule-based + ML detection
- Forensic Anchoring (20 seconds)
  - Run merkle_anchor.py
  - Show Merkle root and verification
- Dashboard (30 seconds)
  - Launch Streamlit dashboard
  - Show live alerts, analysis, anchors
  - Demonstrate containment button
Total: ~2 minutes
- Simulated attacks only: All attack patterns are synthetic and benign
- No real VNC data: System works with simulated events
- Air-gapped compatible: No cloud dependencies, runs entirely locally
- Forensic integrity: The Merkle tree makes evidence tamper-evident; any change to a forensic record changes the root hash
Solution: Run python train_model.py first
Solution: Ensure events are generated: python attack_simulator.py
Solution: Run the full demo: ./run_demo.sh
Solution: Ensure virtual environment is activated and requirements installed
Solution: chmod +x run_demo.sh
- Model training: ~10-30 seconds (2000 samples)
- Detection latency: <100ms per event
- Dashboard refresh: 1-5 seconds (configurable)
- Memory usage: ~200-500MB
See:
- DEMO_SCRIPT.md: Step-by-step demo script
- FAQ.md: Answers to common questions
Priyanshu Mishra
- scikit-learn for ML capabilities
- Streamlit for dashboard framework
- SHAP for explainability