This project aims to build a fully offline, privacy-first fraud detection framework capable of identifying and mitigating AI-generated fraud threats including phishing, spoofing, pharming, credential exposure, deepfake scams, and model-level attacks such as prompt injection and jailbreaking.
The system operates entirely within a controlled environment using locally hosted models to ensure zero external data leakage.
With the rapid advancement of Generative AI, fraud techniques have evolved to include:
- AI-generated phishing emails
- Credential harvesting & exposure
- Website phishing & spoofing
- Pharming via malicious attachments
- Cookie manipulation attacks
- Deepfake voice scams targeting banking customers
- Prompt Injection & Jailbreaking
- Data poisoning & model exploitation
- Agentic AI data-leak vulnerabilities
This project provides a unified defense system against these threats. Its objectives are to:
- Detect AI-written phishing emails
- Identify exposed credentials
- Verify attachments for pharming risks
- Evaluate websites for phishing & spoofing
- Detect deepfake voice scams using MFCC models
- Identify cookie manipulation
- Counter model-level attacks
- Provide a sandbox for testing external AI agents
- Reduce false positives
- Provide explainable, human-readable outputs
User Input (Email / Website / Audio / Attachment / AI Agent)
│
▼
Preprocessing & Sanitization
│
▼
Multi-Layer Detection Engine
├── AI Text Detection (LLMs)
├── Phishing & Spoofing Models
├── Credential Exposure Scanner
├── Attachment Analysis Engine
├── Website Analyzer
├── MFCC Voice Deepfake Detector
├── Prompt Injection Detector
└── AI Sandbox Environment
│
▼
Risk Scoring Engine
│
▼
Human-Readable Explanation Output

AI Text Detection:
- Fine-tuned transformer models
- Stylometric & semantic analysis
- Synthetic phishing dataset augmentation
- AI-text probability scoring
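As a rough illustration of the stylometric signals listed above, a minimal feature extractor might compute sentence-length uniformity and vocabulary richness. This is only a sketch of weak heuristic signals, not the project's fine-tuned transformer pipeline; function and feature names are illustrative.

```python
import re

def stylometric_features(text: str) -> dict:
    """Extract simple stylometric signals (illustrative heuristics only)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {"avg_sentence_len": 0.0, "type_token_ratio": 0.0}
    return {
        # AI-generated text often shows unusually uniform sentence lengths
        "avg_sentence_len": len(words) / len(sentences),
        # Vocabulary richness: unique words / total words
        "type_token_ratio": len(set(words)) / len(words),
    }
```

Features like these would feed into the AI-text probability score alongside the transformer model's output.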
Credential Exposure Scanner:
- Regex + ML-based scanning
- Entropy-based secret detection
- API key / token / password detection
- Feature anonymization
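The entropy-based secret detection above can be sketched with Shannon entropy over candidate tokens: random API keys and tokens have much higher per-character entropy than natural-language words. The threshold and token pattern below are illustrative assumptions, not the project's tuned values.

```python
import math
import re

def shannon_entropy(s: str) -> float:
    """Bits per character; random API keys score high, natural language low."""
    if not s:
        return 0.0
    probs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)

def find_candidate_secrets(text: str, min_len: int = 20, threshold: float = 4.5):
    """Flag long, high-entropy tokens as possible keys/tokens/passwords."""
    tokens = re.findall(r"[A-Za-z0-9+/_\-]{%d,}" % min_len, text)
    return [t for t in tokens if shannon_entropy(t) >= threshold]
```

In practice this heuristic is combined with the regex rules (known key prefixes, token formats) to keep false positives down.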
Website Analyzer:
- Domain similarity detection (typosquatting)
- SSL certificate inspection
- HTML structure anomaly detection
- Cookie tampering detection
- Script injection scanning
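Domain similarity detection for typosquatting can be sketched with a string-similarity ratio against an allow-list of trusted domains: a near-match that is not an exact match is suspicious. The trusted-domain list and threshold here are illustrative assumptions.

```python
from difflib import SequenceMatcher

TRUSTED_DOMAINS = ["paypal.com", "google.com", "mybank.com"]  # illustrative allow-list

def typosquat_score(domain: str, trusted=TRUSTED_DOMAINS):
    """Return the closest trusted domain and its similarity ratio (0-1)."""
    best = max(trusted, key=lambda t: SequenceMatcher(None, domain, t).ratio())
    return best, SequenceMatcher(None, domain, best).ratio()

def is_suspicious(domain: str, threshold: float = 0.8) -> bool:
    """High similarity to a trusted domain without an exact match -> likely typosquat."""
    match, ratio = typosquat_score(domain)
    return domain != match and ratio >= threshold
```

A production version would also normalize homoglyphs (e.g. `1` vs `l`, Cyrillic look-alikes) before comparing.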
Attachment Analysis Engine:
- File type validation
- Hash-based reputation scanning
- Static malware heuristics
- Sandboxed execution environment
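The file type validation step above amounts to checking a file's leading magic bytes against its claimed extension, so a renamed executable cannot masquerade as a document. The signature table below is a small illustrative subset.

```python
# Magic-byte signatures for a few common formats (illustrative subset)
MAGIC_SIGNATURES = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",           # also docx/xlsx containers
    b"\x89PNG\r\n\x1a\n": "png",
    b"MZ": "exe",                   # Windows executable
}

def detect_file_type(data: bytes) -> str:
    """Identify a file by its leading bytes rather than its extension."""
    for magic, ftype in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return ftype
    return "unknown"

def extension_mismatch(filename: str, data: bytes) -> bool:
    """Flag attachments whose claimed extension disagrees with their content,
    e.g. an .exe renamed to invoice.pdf."""
    claimed = filename.rsplit(".", 1)[-1].lower()
    actual = detect_file_type(data)
    return actual != "unknown" and actual != claimed
```

Mismatched files would then be escalated to hash-reputation lookup and the sandboxed execution environment.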
MFCC Voice Deepfake Detector:
- MFCC feature extraction
- Spectrogram analysis
- CNN / LSTM audio classification
- Authenticity confidence scoring
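The MFCC extraction step can be sketched end to end in plain NumPy: frame the audio, window it, take the power spectrum, pool through a mel filterbank, then apply a DCT to get cepstral coefficients. This is a compact sketch with assumed default parameters; the project lists Librosa, whose `librosa.feature.mfcc` would be used in practice.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Minimal MFCC pipeline (pure-NumPy sketch, not production code)."""
    # 1. Frame the signal and apply a Hamming window
    n_frames = 1 + max(0, (len(signal) - n_fft) // hop)
    frames = np.stack([signal[i*hop : i*hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hamming(n_fft)
    # 2. Power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # 3. Mel filterbank: triangular filters spaced evenly on the mel scale
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    inv_mel = lambda m: 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(mel(0), mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * inv_mel(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # 4. Log mel energies, then DCT-II to decorrelate -> cepstral coefficients
    log_mel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return log_mel @ dct.T  # shape: (n_frames, n_ceps)
```

The resulting per-frame coefficient matrix is what the CNN / LSTM classifier consumes to score authenticity.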
Prompt Injection Detector:
- Malicious instruction pattern detection
- Context boundary enforcement
- Policy rule engine
- Behavior anomaly monitoring
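The malicious-instruction pattern layer can be sketched as a rule engine of regexes over user input. The patterns below are a tiny illustrative sample; a real deployment would maintain a much larger, regularly updated rule set alongside the behavioral checks.

```python
import re

# Illustrative patterns; a production rule set would be far larger
INJECTION_PATTERNS = [
    r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions",
    r"you\s+are\s+now\s+(in\s+)?(developer|dan|jailbreak)\s*mode",
    r"reveal\s+(your\s+)?(system\s+prompt|hidden\s+instructions)",
    r"disregard\s+(your\s+)?(rules|guidelines|safety)",
]

def detect_injection(user_input: str):
    """Return the list of matched rule patterns (empty = no known injection)."""
    text = user_input.lower()
    return [p for p in INJECTION_PATTERNS if re.search(p, text)]
```

Pattern matching alone is easy to evade, which is why it is paired here with context boundary enforcement and anomaly monitoring.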
AI Sandbox Environment:
- Isolated runtime container
- Controlled network restrictions
- Data leak simulation tests
- Agent behavior logging
- Red-team simulation testing
Technology Stack:
- Python
- FastAPI / Flask
- Docker
- REST APIs
- PyTorch / TensorFlow
- Hugging Face Transformers
- Scikit-learn
- Librosa (MFCC extraction)
- Ollama
- HuggingFace local models
- Docker sandbox
- Static file analysis tools
- Secure logging (anonymized)
Datasets:
- Enron Email Dataset
- Public phishing datasets (Kaggle / Hugging Face)
- Synthetic phishing dataset generated locally
- Deepfake voice datasets (offline public datasets)
Privacy & Security:
- Fully offline operation
- No external API calls
- No raw sensitive data storage
- PII removed before training/logging
- Retain only anonymized features
- Encrypted local logging
- Secure model storage
Each analyzed input produces:
- Risk Tier: Low / Medium / High / Critical
- Confidence Score (0–100%)
- Threat Category
- Explanation Summary
- Recommended Action
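The risk scoring step that produces this report can be sketched as a weighted combination of per-detector confidences mapped onto the four tiers. The weights and tier cut-offs below are illustrative assumptions; the project's adaptive scoring would tune them offline.

```python
# Illustrative detector weights; real weights would be tuned offline
DETECTOR_WEIGHTS = {
    "ai_text": 0.2,
    "phishing": 0.3,
    "credentials": 0.2,
    "deepfake": 0.3,
}

def risk_score(detector_scores: dict) -> float:
    """Combine per-detector confidences (0-1) into a weighted 0-100 score,
    normalized over the detectors that actually ran."""
    total = sum(DETECTOR_WEIGHTS.get(k, 0) * v for k, v in detector_scores.items())
    weight = sum(DETECTOR_WEIGHTS.get(k, 0) for k in detector_scores)
    return 100 * total / weight if weight else 0.0

def risk_tier(score: float) -> str:
    """Map a 0-100 score onto the four report tiers."""
    if score >= 85:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"
```

Normalizing over the detectors that ran keeps a single-channel input (e.g. audio only) comparable to a multi-channel one.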
Feedback & Continuous Learning:
- Human-in-the-loop validation
- False positive review mechanism
- Fraud pattern updates
- Offline retraining pipeline
- Adaptive risk scoring
Testing & Validation:
- Cross-dataset validation
- Adversarial testing
- Prompt injection simulations
- Attachment sandbox stress testing
- False positive benchmarking
- Red-team attack simulation