Compare an LLM against BERT for phishing email detection, with a Flask web UI, batch analysis, and visualization tools.

| Feature | Description |
|---|---|
| Dual Model Support | Llama-Phishsense-1B (LLM) vs BERT-finetuned-phishing |
| Web GUI | Flask-based interface for real-time email analysis |
| Batch Analysis | Process JSONL datasets with full metrics |
| Visualization | ROC curves, confusion matrices, model comparisons |
| GPU Acceleration | CUDA support out of the box |

| Metric | BERT | Llama |
|---|---|---|
| Accuracy | 99.0% | TBD |
| Precision | 100% | TBD |
| Recall | 98.0% | TBD |
| F1-Score | 98.9% | TBD |
| ROC-AUC | 0.99 | TBD |

| Metric | BERT | Llama |
|---|---|---|
| Accuracy | 55.0% | 52.5% |
| Precision | 52.6% | 51.3% |
| Recall | 100% | 100% |
| F1-Score | 68.9% | 67.8% |

💡 Key Insight: Both models reach 100% recall on real-world data, so every phishing email is caught, but precision is only 51% to 53%, meaning roughly half of the flagged emails are legitimate. More diverse training data is likely needed.
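
To make the recall/precision trade-off above concrete, here is a minimal sketch of how the four reported metrics follow from confusion-matrix counts. The function name `classification_metrics` and the counts in the demo are hypothetical: they are chosen to resemble the pattern of perfect recall with many false positives, and are not taken from the repository's result files.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    "Positive" means phishing. Ratios with a zero denominator fall back to 0.0.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts (an assumption for illustration): 20 phishing and
    # 20 valid emails, with all phishing caught but 18 valid emails flagged.
    metrics = classification_metrics(tp=20, fp=18, tn=2, fn=0)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

With no false negatives, recall stays at 100% regardless of how many legitimate emails are flagged, so precision and accuracy are the figures that expose over-flagging.
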
    .
    ├── bert-finetuned-phishing/    # BERT model implementation
    │   ├── full_analysis_with_bert_model.py
    │   ├── webgui.py
    │   └── results_*/    # Evaluation results
    │
    ├── llama_phish_demo/    # Llama model implementation
    │   ├── full_analysis_with_llama_model.py
    │   ├── webgui.py
    │   └── results_*/    # Evaluation results
    │
    ├── examples/    # Test datasets (JSONL format)
    │   ├── phishing_mails.jsonl
    │   ├── valid_mails.jsonl
    │   └── kaggle_*.jsonl
    │
    ├── comparisons/    # Model comparison visualizations
    │   ├── kaggle/
    │   ├── realworld_data/
    │   └── ai_generated/
    │
    ├── comparison_visualizer.py    # Generate comparison charts
    ├── emailstojsonl.py    # CSV to JSONL converter
    └── requirements.txt
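
The tree lists `emailstojsonl.py` as a CSV to JSONL converter, and the datasets in this README use one JSON object per line with `email_type` and `content` fields. The following is a minimal sketch of that kind of conversion, not the actual script: the function name `csv_to_jsonl` and the CSV column names `label` and `text` are assumptions made for illustration.

```python
import csv
import io
import json


def csv_to_jsonl(csv_file, jsonl_file, label_col="label", text_col="text"):
    """Write one {"email_type", "content"} JSON object per CSV row.

    label_col and text_col are hypothetical column names; adjust them to
    match the header of the CSV being converted. Returns the row count.
    """
    reader = csv.DictReader(csv_file)
    count = 0
    for row in reader:
        record = {
            "email_type": row[label_col].strip().lower(),
            "content": row[text_col],
        }
        # ensure_ascii=False keeps non-ASCII email text readable in the output.
        jsonl_file.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory sample so the sketch runs without touching the filesystem.
    sample = io.StringIO(
        "label,text\n"
        'phishing,"Your account has been compromised..."\n'
        "valid,Meeting reminder for tomorrow at 3pm\n"
    )
    out = io.StringIO()
    rows = csv_to_jsonl(sample, out)
    print(f"wrote {rows} rows")
    print(out.getvalue(), end="")
```

For real files, open the input with `open(path, newline="", encoding="utf-8")` as the `csv` module documentation recommends, and pass an output file opened in text mode.
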
Clone the repository and install the dependencies:

    git clone https://github.com/YOUR_USERNAME/Ollama-Phishing-Framework.git
    cd Ollama-Phishing-Framework
    pip install -r requirements.txt

BERT Model (lighter, faster):

    cd bert-finetuned-phishing
    python webgui.py
    # Open http://localhost:5000

Llama Model (requires HuggingFace token):

    cd llama_phish_demo
    python webgui.py
    # Open http://localhost:5001

Batch analysis (run each from the repository root):

    # BERT analysis
    cd bert-finetuned-phishing
    python full_analysis_with_bert_model.py

    # Llama analysis
    cd llama_phish_demo
    python full_analysis_with_llama_model.py

Generate the model comparison charts:

    python comparison_visualizer.py
Datasets are JSONL, one object per line with two fields, `email_type` and `content`:

    {"email_type": "phishing", "content": "Your account has been compromised..."}
    {"email_type": "valid", "content": "Meeting reminder for tomorrow at 3pm"}

Convert CSV to JSONL:

    python emailstojsonl.py

PRs welcome!

- Add more models (GPT, Claude, etc.)
- Improve real-world accuracy
- Add email header analysis
- Docker support
- API endpoint mode

| Model | Source | Author |
|---|---|---|
| Llama-Phishsense-1B | HuggingFace | AcuteShrewdSecurity |
| bert-finetuned-phishing | HuggingFace | E. Alvarado |

Educational purposes only. Do not use for malicious activities.

License: MIT

⭐ Star this repo if you find it useful!


