Repository for shestorm - Vibe Coding Hackathon
# Problem Statement: Real-Time Audio Fraud Detection for Scam Prevention
Social Cause Track, Team SheStorm. In a world where every voice can be cloned, intent cannot hide.
Team Name: SheStorm
View app: https://shestorm-ai-fraud-defender-73291669658.us-west1.run.app
Demo link & PPT (PDF): https://drive.google.com/drive/folders/1y_DknpPaxDXdqYZMj07zlWCOonYMsOap?usp=sharing
| Name | Role |
|---|---|
| Yamini | Frontend & UX |
| Ishani Gupta | Backend & API Development |
| Madhu Tiwari | AI / Machine Learning |
| Khushi Verma | Research, Documentation & Testing |
This project was collaboratively researched, architected, and implemented by Team SheStorm.
Voice fraud has evolved from simple scam calls into AI-powered psychological attacks.
By 2026:
- Voices can be cloned in seconds
- AI agents conduct full persuasive conversations
- Phone numbers are trivially spoofed
- Victims are manipulated emotionally, not technically
Yet most systems still ask:
"Is this voice fake?"
This question is no longer sufficient.
We introduce a Real-Time Audio Fraud Detection System that focuses on detecting fraudulent intent and manipulative behavior during live conversations, before irreversible actions are taken.
- Call blockers rely on static number lists
- Voice authentication fails against human scammers
- Fraud detection is reactive
- Users are expected to "be careful"
Financial loss happens within seconds of answering a call.
Traditional systems focus on:
- Who is calling
- Whether the voice is real
- Whether the number is known
Our system focuses on:
- Why the caller is calling
- What they want the user to do
- How they manipulate emotions
Numbers can be spoofed. Voices can be cloned. Intent cannot hide.
Scam calls follow repeatable behavioral patterns:
- Authority impersonation (bank, police, government)
- Artificial urgency
- Isolation tactics ("don't hang up")
- Scripted conversation flow
- Forced financial actions (OTP, transfer)
Our system detects these patterns in real time, even on first-time calls.
Analyzes:
- Authority phrases (bank, officer)
- Urgency (now, immediately)
- Financial intent (OTP, PIN)
- Isolation (don't tell anyone)
Detection is based on combinations of intents, not on isolated keywords.
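
To illustrate why combinations matter more than single keywords, here is a minimal sketch. It is not the project's actual classifier: the phrase lists, the `detect_intents` and `combination_risk` names, and the risk rules are all assumptions made for this example. A real system would likely use the NLP / LLM classifier described later rather than regular expressions.

```python
from __future__ import annotations

import re

# Hypothetical phrase patterns, chosen only for illustration.
INTENT_PATTERNS = {
    "authority": r"\b(bank|officer|police|government)\b",
    "urgency": r"\b(now|immediately|right away|urgent)\b",
    "financial": r"\b(otp|pin|transfer)\b",
    "isolation": r"(don'?t tell anyone|don'?t hang up)",
}


def detect_intents(utterance: str) -> set[str]:
    """Return the intent categories whose pattern appears in the utterance."""
    text = utterance.lower()
    return {
        name
        for name, pattern in INTENT_PATTERNS.items()
        if re.search(pattern, text)
    }


def combination_risk(intents: set[str]) -> str:
    """Rate risk by which intents co-occur (assumed rules, not tuned values)."""
    if "financial" in intents and len(intents) >= 3:
        return "high"
    if "financial" in intents and len(intents) == 2:
        return "medium"
    return "low"
```

Under these assumed rules, "I will go to the bank now" contains two trigger words but no financial request, so it rates low, while a caller who claims authority, pushes urgency, and asks for an OTP rates high.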
Detects:
- Rapid scripted speech
- Repetition
- Interruptions
- Dominant tone escalation
Scammers follow scripts. Normal conversations adapt.
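
Two of the signals above, rapid speech and repetition, can be approximated from timestamped transcripts. The sketch below is an assumption-laden illustration: the `Utterance` type, the use of word trigrams, and both function names are hypothetical and not taken from the project code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    start: float  # seconds from call start
    end: float    # seconds from call start


def words_per_second(utterances: list) -> float:
    """Average speaking rate across the caller's turns."""
    total_words = sum(len(u.text.split()) for u in utterances)
    total_time = sum(u.end - u.start for u in utterances)
    return total_words / total_time if total_time > 0 else 0.0


def repetition_ratio(utterances: list, n: int = 3) -> float:
    """Fraction of word n-grams that occur more than once across all turns."""
    grams = []
    for u in utterances:
        words = u.text.lower().split()
        grams.extend(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    if not grams:
        return 0.0
    counts = Counter(grams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(grams)
```

A high speaking rate combined with a high repetition ratio would be one possible indicator of a scripted caller. Any thresholds would need to be calibrated on real data.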
Detects:
- Stress mismatch
- Fear induction
- Aggression inconsistent with role
A "bank agent" using threats → high-risk indicator
Traditional call blocking:
- Depends on blacklists
- Easily bypassed
- Fails on first contact
Our system:
- Ignores phone numbers
- Analyzes live dialogue
- Detects fraud while the call is still in progress
Fraud is revealed by conversation behavior, not caller ID.
| Competitor | Strength | Limitation | Our Advantage |
|---|---|---|---|
| Pindrop | Voice liveness | Misses intent | Intent + behavior |
| BioCatch | User behavior | Post-event detection | Real-time manipulation detection |
| Nuance | IVR security | High latency | Edge-ready system |
| Call Blockers | Number lists | Easily spoofed | Conversation intelligence |
Live Audio Stream → Acoustic Feature Extraction → Real-Time Transcription → Intent Analysis (NLP / LLM) → Behavior & Emotion Analysis → Risk Scoring Engine → User Alert & Prevention
- MFCC & spectrogram features
- Vocoder artifact detection
- Noise-robust preprocessing
- Streaming ASR
- Lightweight LLM / NLP classifier
- Intent categorization
- Speech cadence analysis
- Command repetition detection
- Multi-signal aggregation
- Continuous risk scoring
- Threshold-based alerts
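
The last three items (multi-signal aggregation, continuous risk scoring, threshold-based alerts) can be sketched as a weighted sum smoothed over time. This is a minimal illustration only: the weights, the smoothing factor, the alert threshold, and the `RiskScorer` name are assumptions, not values from the project.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical weights per signal; each signal is expected in [0, 1].
WEIGHTS = {"acoustic": 0.2, "intent": 0.5, "behavior": 0.3}
ALERT_THRESHOLD = 0.7  # assumed value
SMOOTHING = 0.5        # assumed exponential moving average factor


@dataclass
class RiskScorer:
    score: float = 0.0
    alerted: bool = False

    def update(self, signals: dict[str, float]) -> float:
        """Fold one window of signals into the running risk score."""
        instant = sum(
            weight * min(max(signals.get(name, 0.0), 0.0), 1.0)
            for name, weight in WEIGHTS.items()
        )
        # The moving average keeps a single noisy window from firing an alert.
        self.score = SMOOTHING * instant + (1 - SMOOTHING) * self.score
        if self.score >= ALERT_THRESHOLD:
            self.alerted = True
        return self.score
```

With these assumed values, one fully suspicious window raises the score to 0.5 and a second consecutive one raises it to 0.75, which crosses the threshold. The smoothing trades a short delay for fewer false alarms.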
Tech Stack
- Python (FastAPI)
- WebSockets for live streams
- REST APIs
Responsibilities
- Audio chunk ingestion
- Model inference
- Risk aggregation
- Alert triggering
- Event logging
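
For audio chunk ingestion, a live stream typically delivers chunks of arbitrary size, while feature extraction and ASR usually expect fixed-size frames. The sketch below shows one way to re-frame incoming bytes. The `FrameBuffer` class is hypothetical and deliberately framework-free, and the frame size would depend on the audio format (for example, 640 bytes for 20 ms of 16 kHz 16-bit mono PCM, which is an assumption here).

```python
from __future__ import annotations


class FrameBuffer:
    """Re-frames arbitrarily sized byte chunks into fixed-size frames."""

    def __init__(self, frame_bytes: int) -> None:
        if frame_bytes <= 0:
            raise ValueError("frame_bytes must be positive")
        self.frame_bytes = frame_bytes
        self._pending = bytearray()

    def push(self, chunk: bytes) -> list[bytes]:
        """Add a chunk and return every complete frame now available."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.frame_bytes:
            frames.append(bytes(self._pending[: self.frame_bytes]))
            del self._pending[: self.frame_bytes]
        return frames
```

A WebSocket receive loop could call `push` on each incoming message and pass the returned frames on to model inference. Leftover bytes stay buffered until the next chunk arrives.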
Web Dashboard
- Live transcript
- Highlighted risky phrases
- Dynamic risk meter
- Alert notifications
Mobile UI (Concept)
- Floating warnings
- Vibration alerts
- Voice alerts
- Elder-friendly design
Database
- PostgreSQL / SQLite
Stores
- Call metadata
- Risk events
- Transcript snapshots
- Analytics logs
Optional:
- Vector DB for scam phrase embeddings
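
As a rough illustration of the storage layer, the sketch below creates two SQLite tables for call metadata and risk events with transcript snapshots. The table names, column names, and the `open_store` and `log_risk_event` helpers are assumptions made for this example and do not reflect the project's actual schema.

```python
import sqlite3

# Hypothetical schema covering call metadata, risk events, and snapshots.
SCHEMA = """
CREATE TABLE IF NOT EXISTS calls (
    id INTEGER PRIMARY KEY,
    started_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS risk_events (
    id INTEGER PRIMARY KEY,
    call_id INTEGER NOT NULL REFERENCES calls(id),
    occurred_at TEXT NOT NULL,
    risk_score REAL NOT NULL,
    transcript_snapshot TEXT
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_risk_event(conn, call_id, occurred_at, risk_score, transcript_snapshot):
    """Store one risk event together with the transcript text that caused it."""
    conn.execute(
        "INSERT INTO risk_events "
        "(call_id, occurred_at, risk_score, transcript_snapshot) "
        "VALUES (?, ?, ?, ?)",
        (call_id, occurred_at, risk_score, transcript_snapshot),
    )
    conn.commit()
```

The same statements should work largely unchanged on PostgreSQL apart from the primary key declaration and the placeholder syntax. Given the privacy constraints noted in this README, storing short snapshots rather than full transcripts may be preferable.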
- Privacy constraints
- Scarcity of real scam calls
- Controlled experimentation
- Scam & normal scripts
- Emotional variations
- Noise profiles
- Multi-language samples
Labels:
- 0 → Safe
- 1 → Scam
- Real-time detection
- No prior enrollment
- Works on first-time calls
- Detects human & AI scams
- Noise & accent tolerant
- Conversation-based intelligence
- Cross-platform fraud detection
- Telecom-level deployment
- AI watermark detection
- Multilingual support
- Regulatory compliance (GDPR/CCPA)
This repository contains everything you need to run the app locally.
Prerequisites: Node.js
- Install dependencies: npm install
- Set the GEMINI_API_KEY in .env.local to your Gemini API key
- Run the app: npm run dev
Voice fraud is no longer an audio problem. It is a human manipulation problem.
This system acts as a Real-Time Conversation Firewall, protecting users before trust is exploited.