Name
LocalLLMAnalyzer (or LLMSummarizer)
Link
- Ollama: https://ollama.com (primary recommended backend – easy local setup, many models)
- LocalAI: https://localai.io (alternative for more backends)
- Optional: Hugging Face `transformers` pipelines for pure-Python local inference (no extra container)
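For the `transformers` route, a minimal sketch (the model name is just a placeholder for any locally cached instruction-tuned model):

```python
# Minimal sketch of pure-Python local inference via transformers; no
# separate LLM container needed. The model name is a placeholder.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
out = generator(
    "Summarize for a SOC analyst: 1.2.3.4 was flagged by GreyNoise as a scanner.",
    max_new_tokens=128,
)
print(out[0]["generated_text"])
```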
Type of analyzer
- Observable (primary – works on IPs, domains, URLs, hashes, etc.)
- File (future extension – e.g. summarize deobfuscated strings, script analysis, or multi-analyzer file reports)
- Docker (optional – if we bundle a lightweight Ollama container for easier deployment)
Why should we use it
IntelOwl already aggregates rich data from 100+ analyzers (VirusTotal, GreyNoise, Shodan, Abuse.ch, YARA, CAPA, Intezer, etc.), but analysts often spend significant time manually reading and correlating verbose/conflicting reports to understand the real threat context.
This analyzer would use a self-hosted / local Large Language Model (no data sent to external providers like OpenAI → full privacy for sensitive investigations) to:
- Generate concise, natural-language summaries of multi-analyzer results
- Classify threat type (phishing, ransomware, C2, credential-theft, etc.) and assign a normalized risk score
- Extract additional entities / IOCs (e.g. targeted brands, wallet addresses, TTPs)
- Provide actionable insights and suggested pivots / next steps
- Dramatically reduce analyst triage time while preserving the ability to drill down into raw data
Benefits:
- Privacy-first (critical for SOC/CTI teams)
- Enhances usability: turns raw intel into readable reports for tickets, alerts, executives
- Differentiates IntelOwl as a next-gen TI platform with applied AI
- Aligns perfectly with Honeynet/IntelOwl interest in applied AI
- Extensible: custom prompts, model switching, fine-tuning support later
- Complements existing analyzers without replacing them
This builds on patterns from analyzers like Intezer (behavioral insights) and recent additions (Phunter, etc.), but adds semantic understanding of the aggregated results.
Possible implementation
High-level architecture
- New Python class inheriting from `ObservableAnalyzer` (and later `FileAnalyzer`) – see the sketch after this list
- Connects to a local LLM backend via API (Ollama default: http://localhost:11434)
- Input: aggregated analyzer reports (from job results, easily accessible via `self.report`)
- Prompt engineering: structured system prompt + few-shot examples to reduce hallucinations and enforce output format
- Output: parsed into IntelOwl-standard fields:
  - `summary`: main text paragraph
  - `tags`: auto-added (e.g. `ai:malicious`, `ai:phishing`, severity levels)
  - `extra_data`: JSON with `risk_score` (0-100), `threat_categories`, `confidence`, `extracted_iocs`, `suggested_pivots`
- Runs async via Celery (like slow analyzers: VirusTotal, URLscan)
- Secrets: store LLM endpoint / API key (if any) via existing secrets management
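A minimal sketch of how the Ollama-backed observable path could look. The import path, the `run` hook, `self.observable_name`, and the `_collect_job_reports` helper are assumptions to be validated against the current IntelOwl codebase; the Ollama `/api/generate` call itself follows the documented REST API:

```python
# Minimal sketch, assuming IntelOwl's ObservableAnalyzer base class and an
# Ollama instance on localhost. Import path, hook names, and the way prior
# analyzer reports are gathered are assumptions to verify against the codebase.
import json

import requests

from api_app.analyzers_manager.classes import ObservableAnalyzer

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama REST default


class LocalLLMAnalyzer(ObservableAnalyzer):
    model: str = "llama3"  # would be configurable via the analyzer's params

    def run(self):
        # Hypothetical helper: collect the other analyzers' reports for this job.
        reports = self._collect_job_reports()
        prompt = (
            "You are a threat-intel assistant. Summarize the following "
            f"analyzer reports for {self.observable_name}. Answer ONLY with "
            'JSON: {"summary": str, "risk_score": int, '
            '"threat_categories": [str], "confidence": float}\n\n'
            + json.dumps(reports)[:8000]  # crude truncation to fit the context window
        )
        resp = requests.post(
            OLLAMA_URL,
            json={
                "model": self.model,
                "prompt": prompt,
                "stream": False,
                "format": "json",  # ask Ollama to constrain output to JSON
                "options": {"temperature": 0},  # deterministic-ish triage
            },
            timeout=120,
        )
        resp.raise_for_status()
        # Ollama wraps the generated text in a "response" string field.
        return json.loads(resp.json()["response"])
```

Since the model may still emit invalid JSON despite `format: "json"` and zero temperature, a parse-failure fallback (e.g. returning the raw text as `summary`) would be needed.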
Phased / Minimal Viable Implementation
- Support Ollama only (simplest: REST API, no auth by default)
- Focus on observable analysis
- Fixed prompt template:
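Illustrative sketch only (field names mirror the output schema above; exact wording to be refined):

```python
# Illustrative template only; braces in the schema are doubled so that
# str.format() leaves them literal.
PROMPT_TEMPLATE = """You are a SOC threat-intelligence assistant.
Given the JSON analyzer reports below for the observable {observable},
respond ONLY with JSON matching this schema:
{{"summary": str, "risk_score": int, "threat_categories": [str],
"confidence": float, "extracted_iocs": [str], "suggested_pivots": [str]}}

Analyzer reports:
{reports}
"""

# Hypothetical usage with placeholder values:
prompt = PROMPT_TEMPLATE.format(observable="evil.example.com", reports="{...}")
```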