5hriinidhi/hack-o-hire

🛡️ AI-Generated Fraud Detection & Mitigation System

📌 Overview

This project aims to build a fully offline, privacy-first fraud detection framework that identifies and mitigates AI-generated fraud threats, including phishing, spoofing, pharming, credential exposure, deepfake scams, and model-level attacks such as prompt injection and jailbreaking.

The system operates entirely within a controlled environment using locally hosted models to ensure zero external data leakage.


🚨 Problem Statement

With the rapid advancement of Generative AI, fraud techniques have evolved to include:

  • AI-generated phishing emails
  • Credential harvesting & exposure
  • Website phishing & spoofing
  • Pharming via malicious attachments
  • Cookie manipulation attacks
  • Deepfake voice scams targeting banking customers
  • Prompt Injection & Jailbreaking
  • Data poisoning & model exploitation
  • Agentic AI data-leak vulnerabilities

This project provides a unified defense system against these threats.


🎯 Objectives

  • Detect AI-written phishing emails
  • Identify exposed credentials
  • Verify attachments for pharming risks
  • Evaluate websites for phishing & spoofing
  • Detect deepfake voice scams using MFCC-based models
  • Identify cookie manipulation
  • Counter model-level attacks
  • Provide a sandbox for testing external AI agents
  • Reduce false positives
  • Provide explainable, human-readable outputs

🏗️ System Architecture

User Input (Email / Website / Audio / Attachment / AI Agent)
│
▼
Preprocessing & Sanitization
│
▼
Multi-Layer Detection Engine
├── AI Text Detection (LLMs)
├── Phishing & Spoofing Models
├── Credential Exposure Scanner
├── Attachment Analysis Engine
├── Website Analyzer
├── MFCC Voice Deepfake Detector
├── Prompt Injection Detector
└── AI Sandbox Environment
│
▼
Risk Scoring Engine
│
▼
Human-Readable Explanation Output
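
The flow above can be sketched as a small dispatcher: sanitize the input, run every detector registered for that input type, and hand the ranked results to the scoring stage. This is only an illustrative sketch. The names `sanitize`, `DETECTORS`, `analyze`, and the toy `keyword_phishing_detector` are assumptions made for the example, not part of the project's code.

```python
from typing import Callable, Dict, List, Tuple

# A detector takes sanitized text and returns (threat_category, score in [0, 1]).
Detector = Callable[[str], Tuple[str, float]]


def keyword_phishing_detector(text: str) -> Tuple[str, float]:
    """Toy stand-in for a real model: counts a few hypothetical phishing phrases."""
    keywords = ("verify your account", "urgent", "password")
    hits = sum(kw in text.lower() for kw in keywords)
    return ("phishing", min(1.0, hits / len(keywords)))


# Hypothetical registry: input type -> detectors to run for that type.
DETECTORS: Dict[str, List[Detector]] = {
    "email": [keyword_phishing_detector],
}


def sanitize(raw: str) -> str:
    """Minimal preprocessing: collapse all runs of whitespace to single spaces."""
    return " ".join(raw.split())


def analyze(input_type: str, raw: str) -> List[Tuple[str, float]]:
    """Run every registered detector and return results, highest score first."""
    text = sanitize(raw)
    results = [detector(text) for detector in DETECTORS.get(input_type, [])]
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Keeping detectors behind a common callable signature means a new module (for example, the voice or attachment analyzer) can be added to the registry without changing the dispatcher.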

🧠 Core Modules

1️⃣ AI Phishing Email Detection

  • Fine-tuned transformer models
  • Stylometric & semantic analysis
  • Synthetic phishing dataset augmentation
  • AI-text probability scoring

2️⃣ Credential Exposure Detection

  • Regex + ML-based scanning
  • Entropy-based secret detection
  • API keys / tokens / password detection
  • Feature anonymization
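
As a rough illustration of combining regex rules with entropy-based detection, the sketch below flags strings that match a known credential format or that look random enough to be a secret. The pattern set, the 20-character minimum token length, and the 4.0-bit threshold are assumptions chosen for the example; a real scanner would need tuning against actual data.

```python
import math
import re
from collections import Counter

# Illustrative patterns only (assumed for this sketch).
KNOWN_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}

# Candidate tokens for the entropy check: long runs of key-like characters.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def scan_for_secrets(text: str, entropy_threshold: float = 4.0):
    """Return a list of (finding_type, matched_text) tuples."""
    findings = []
    for name, pattern in KNOWN_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((name, match.group(0)))
    for match in TOKEN_RE.finditer(text):
        token = match.group(0)
        if shannon_entropy(token) >= entropy_threshold:
            findings.append(("high_entropy_string", token))
    return findings
```

The two checks are complementary: regex rules catch well-known formats with few false positives, while the entropy check catches unknown token formats at the cost of occasionally flagging hashes or encoded data.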

3️⃣ Website Phishing & Spoofing Analyzer

  • Domain similarity detection (typosquatting)
  • SSL certificate inspection
  • HTML structure anomaly detection
  • Cookie tampering detection
  • Script injection scanning
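
One simple way to approach domain similarity detection is edit distance against a list of trusted domains. The sketch below uses a plain Levenshtein distance; the trusted list, the `find_lookalike` helper, and the distance cutoff of 2 are hypothetical choices for illustration, and a production check would likely also consider homoglyphs and punycode.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical allow-list for the example.
TRUSTED_DOMAINS = ["examplebank.com", "example.org"]


def find_lookalike(domain, trusted=TRUSTED_DOMAINS, max_distance=2):
    """Return (trusted_domain, distance) if `domain` is a near miss, else None.

    An exact match to a trusted domain is not a lookalike and returns None.
    """
    domain = domain.lower().strip(".")
    best = None
    for candidate in trusted:
        distance = levenshtein(domain, candidate)
        if distance == 0:
            return None
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (candidate, distance)
    return best
```

For example, `find_lookalike("examp1ebank.com")` reports the trusted domain it imitates, while the genuine domain and unrelated domains return `None`.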

4️⃣ Attachment Verification (Pharming Protection)

  • File type validation
  • Hash-based reputation scanning
  • Static malware heuristics
  • Sandboxed execution environment
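
File type validation and hash-based reputation checks might look roughly like the following: compare the file's leading bytes with the signature expected for its extension, then look up its SHA-256 digest in a local blocklist. The signature table is deliberately tiny, and `check_attachment` and `known_bad_hashes` are names assumed for this sketch; in keeping with the offline design, the blocklist is a local set rather than a remote lookup.

```python
import hashlib
import os

# A few well-known magic-byte signatures, keyed by file extension.
MAGIC_SIGNATURES = {
    ".pdf": b"%PDF-",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".zip": b"PK\x03\x04",
}


def check_attachment(filename: str, data: bytes, known_bad_hashes=frozenset()):
    """Return a dict describing basic static checks on an attachment."""
    extension = os.path.splitext(filename)[1].lower()
    expected = MAGIC_SIGNATURES.get(extension)
    digest = hashlib.sha256(data).hexdigest()
    return {
        "sha256": digest,
        # True when the extension has a known signature and the content lacks it.
        "type_mismatch": expected is not None and not data.startswith(expected),
        "known_bad": digest in known_bad_hashes,
    }
```

A mismatch such as a `.pdf` whose content begins with an executable header is a cheap, high-signal indicator to surface before any heavier sandboxed analysis.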

5️⃣ Deepfake Voice Scam Detection

  • MFCC feature extraction
  • Spectrogram analysis
  • CNN / LSTM audio classification
  • Authenticity confidence scoring

6️⃣ Prompt Injection & Jailbreak Defense

  • Malicious instruction pattern detection
  • Context boundary enforcement
  • Policy rule engine
  • Behavior anomaly monitoring
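
Malicious instruction pattern detection can start as a small rule set applied before text reaches the model. The sketch below is a minimal, assumed example: the rule names and regular expressions are illustrative, and pattern matching alone is easy to evade, so it would sit alongside the boundary enforcement and anomaly monitoring listed above rather than replace them.

```python
import re

# Illustrative rules only (assumed for this sketch); real rule sets are larger.
INJECTION_PATTERNS = [
    ("override_instructions",
     re.compile(r"(?i)\bignore\s+(all\s+)?(the\s+)?(previous|prior|above)\s+instructions\b")),
    ("reveal_system_prompt",
     re.compile(r"(?i)\b(reveal|show|print)\s+(me\s+)?(your|the)\s+system\s+prompt\b")),
    ("role_reassignment",
     re.compile(r"(?i)\byou\s+are\s+now\s+(in\s+)?(dan|developer\s+mode)\b")),
]


def detect_injection(text: str):
    """Return the names of all rules that match, in rule order."""
    return [name for name, pattern in INJECTION_PATTERNS if pattern.search(text)]
```

Returning rule names rather than a bare boolean lets the explanation output state which pattern triggered the alert.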

7️⃣ AI Agent Sandbox Environment

  • Isolated runtime container
  • Controlled network restrictions
  • Data leak simulation tests
  • Agent behavior logging
  • Red-team simulation testing

⚙️ Technology Stack

Core Backend

  • Python
  • FastAPI / Flask
  • Docker
  • REST APIs

Machine Learning

  • PyTorch / TensorFlow
  • Hugging Face Transformers
  • Scikit-learn
  • Librosa (MFCC extraction)

Local LLM Hosting

  • Ollama
  • Hugging Face local models

Security & Isolation

  • Docker sandbox
  • Static file analysis tools
  • Secure logging (anonymized)

📊 Datasets

  • Enron Email Dataset
  • Public phishing datasets (Kaggle / Hugging Face)
  • Synthetic phishing dataset generated locally
  • Deepfake voice datasets (offline public datasets)

🔐 Privacy & Security Principles

  • Fully offline operation
  • No external API calls
  • No raw sensitive data storage
  • PII removed before training/logging
  • Retain only anonymized features
  • Encrypted local logging
  • Secure model storage
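
To illustrate removing PII before logging while keeping an anonymized feature, the sketch below redacts email addresses and card-like digit runs, and derives a keyed pseudonym that lets records be correlated without storing the raw value. The regular expressions are simplified, and `redact`, `pseudonymize`, and the key handling are assumptions for the example rather than the project's actual scheme.

```python
import hashlib
import hmac
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(text: str) -> str:
    """Replace emails and card-like numbers with fixed placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return CARD_RE.sub("[CARD]", text)


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash of a value: stable for a given key, not reversible without it."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
```

For instance, a log line would carry `redact(message)` plus `pseudonymize(sender, key)`, so analysts can see that two alerts share a sender without the address ever being written to disk.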

📈 Risk Scoring Framework

Each analyzed input produces:

  • Risk Tier: Low / Medium / High / Critical
  • Confidence Score (0–100%)
  • Threat Category
  • Explanation Summary
  • Recommended Action
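
The five output fields map naturally onto a small record type. In the sketch below, the tier thresholds, the action strings, and the idea of deriving the tier from a 0 to 1 risk score are all assumptions made for illustration; the source only specifies which fields the output contains.

```python
from dataclasses import dataclass


@dataclass
class RiskReport:
    risk_tier: str            # Low / Medium / High / Critical
    confidence: float         # 0-100 (%)
    threat_category: str
    explanation: str
    recommended_action: str


# Hypothetical cutoffs on a 0-1 risk score, checked from highest to lowest.
TIER_THRESHOLDS = [(0.9, "Critical"), (0.7, "High"), (0.4, "Medium"), (0.0, "Low")]

# Hypothetical default actions per tier.
ACTIONS = {
    "Low": "Allow and log",
    "Medium": "Flag for review",
    "High": "Quarantine and notify analyst",
    "Critical": "Block immediately and escalate",
}


def tier_for(score: float) -> str:
    """Map a risk score in [0, 1] to a tier name."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    for threshold, tier in TIER_THRESHOLDS:
        if score >= threshold:
            return tier
    return "Low"


def build_report(score, confidence, category, explanation) -> RiskReport:
    """Assemble the report; confidence is a percentage in [0, 100]."""
    if not 0.0 <= confidence <= 100.0:
        raise ValueError("confidence must be between 0 and 100")
    tier = tier_for(score)
    return RiskReport(tier, confidence, category, explanation, ACTIONS[tier])
```

Keeping the thresholds in one table makes them easy to adjust as false positive reviews come in.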

🔄 Continuous Learning

  • Human-in-the-loop validation
  • False positive review mechanism
  • Fraud pattern updates
  • Offline retraining pipeline
  • Adaptive risk scoring

🧪 Testing Strategy

  • Cross-dataset validation
  • Adversarial testing
  • Prompt injection simulations
  • Attachment sandbox stress testing
  • False positive benchmarking
  • Red-team attack simulation

About

Barclays hackathon
