MailShieldAI

AI-Powered Email Security Platform
Detect and neutralize phishing threats in real time using a multi-agent architecture

Tech stack: FastAPI, Next.js, React, LangGraph, PostgreSQL, Redis, Gemini


Overview

MailShieldAI is an email security platform that runs each incoming email through a pipeline
of five specialized AI workers. Every email is analyzed for phishing, malware, social engineering,
and other threats, and a Gmail label is applied automatically based on the resulting risk score.

Key Capabilities

Feature                Description
Multi-Agent Pipeline   5 specialized workers: Ingest → Intent → Sandbox → Aggregator → Action
LangGraph AI Analysis  Intent classification across 16 categories via Gemini AI
Real-Time Processing   Gmail Pub/Sub integration for instant email analysis
Automated Labeling     Auto-applies MailShield/SAFE, MailShield/CAUTIOUS, or MailShield/MALICIOUS labels
Risk Scoring           0-100 scoring with confidence-weighted adjustments
Secure by Design       OAuth 2.0, CORS validation, PII masking, rate limiting

Architecture

System Overview

                                    ┌─────────────────────────────────────┐
                                    │           USER INTERFACE            │
                                    │  ┌───────────────────────────────┐  │
                                    │  │   Next.js Dashboard (:3000)   │  │
                                    │  │   • Email monitoring          │  │
                                    │  │   • Threat visualization      │  │
                                    │  │   • Risk analytics            │  │
                                    │  └───────────────┬───────────────┘  │
                                    └─────────────────│─────────────────┘
                                                      │
                                                      ▼
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                                    API LAYER                                            │
│  ┌───────────────────────────────────────────────────────────────────────────────────┐  │
│  │                         FastAPI Backend (:8000)                                   │  │
│  │         OAuth • REST Endpoints • Email Ingestion • Statistics                     │  │
│  └───────────────────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────────────────┘
                                              │
              ┌───────────────────────────────┼───────────────────────────────┐
              │                               │                               │
              ▼                               ▼                               ▼
┌─────────────────────────┐    ┌─────────────────────────┐    ┌─────────────────────────┐
│      DATA LAYER         │    │     MESSAGE BROKER      │    │    EXTERNAL SERVICES    │
│  ┌───────────────────┐  │    │  ┌───────────────────┐  │    │  ┌───────────────────┐  │
│  │    PostgreSQL     │  │    │  │   Redis Streams   │  │    │  │    Gmail API      │  │
│  │  • Email records  │  │    │  │  • Control Queue  │  │    │  │  • Pub/Sub events │  │
│  │  • User accounts  │  │    │  │  • Intent Done    │  │    │  │  • Label mgmt     │  │
│  │  • Analysis logs  │  │    │  │  • Analysis Done  │  │    │  │  • Email fetch    │  │
│  └───────────────────┘  │    │  │  • Final Report   │  │    │  └───────────────────┘  │
└─────────────────────────┘    │  └───────────────────┘  │    │  ┌───────────────────┐  │
                               └────────────┬────────────┘    │  │    Gemini AI      │  │
                                            │                 │  │  • Intent analysis│  │
                    ┌───────────────────────┼─────────────────│──│  • URL scanning   │  │
                    │                       │                 │  └───────────────────┘  │
                    ▼                       ▼                 └─────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                               AI WORKER PIPELINE                                        │
│                                                                                         │
│    ┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐      ┌────────┐  │
│    │  INGEST  │ ───▶ │  INTENT  │ ───▶ │ SANDBOX  │ ───▶ │AGGREGATOR│ ───▶ │ ACTION │  │
│    │  :8001   │      │  :8002   │      │  :8004   │      │  :8005   │      │ :8003  │  │
│    └──────────┘      └──────────┘      └──────────┘      └──────────┘      └────────┘  │
│         │                 │                 │                  │                │      │
│    Pub/Sub ───▶    LangGraph ───▶    URL/File ───▶     Combine  ───▶    Gmail   │      │
│    Handler        Classification      Analysis         Results        Labeling  │      │
│                                                                                         │
└─────────────────────────────────────────────────────────────────────────────────────────┘
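
As a rough illustration of the hand-off between workers, the sketch below chains five stand-in stage functions in the order shown in the diagram. It uses in-memory deques in place of Redis Streams. The stage bodies, the event field names, and the mapping of workers to streams are all assumptions made for illustration, not the repository's actual code.

```python
from collections import deque

# In-memory stand-ins for the four Redis Streams named in the diagram.
# Which worker reads from which stream is an assumption made for this sketch.
streams = {
    "control_queue": deque(),
    "intent_done": deque(),
    "analysis_done": deque(),
    "final_report": deque(),
}


def ingest(message_id: str, body: str) -> None:
    """Ingest: accept a new email event and queue it for analysis."""
    streams["control_queue"].append({"message_id": message_id, "body": body})


def intent_worker() -> None:
    """Intent: placeholder keyword check standing in for the LangGraph classifier."""
    event = streams["control_queue"].popleft()
    suspicious = "verify your password" in event["body"].lower()
    event["intent"], event["intent_score"] = ("PHISHING", 95) if suspicious else ("PERSONAL", 10)
    streams["intent_done"].append(event)


def sandbox_worker() -> None:
    """Sandbox: placeholder link check standing in for URL/file analysis."""
    event = streams["intent_done"].popleft()
    event["has_link"] = "http://" in event["body"] or "https://" in event["body"]
    streams["analysis_done"].append(event)


def aggregator_worker() -> None:
    """Aggregator: combine results; here it only carries the intent score forward."""
    event = streams["analysis_done"].popleft()
    event["risk_score"] = event["intent_score"]
    streams["final_report"].append(event)


def action_worker() -> dict:
    """Action: map the final score to a label using this README's tier boundaries."""
    event = streams["final_report"].popleft()
    score = event["risk_score"]
    tier = "MALICIOUS" if score >= 80 else "CAUTIOUS" if score >= 30 else "SAFE"
    event["label"] = f"MailShield/{tier}"
    return event


def run_pipeline(message_id: str, body: str) -> dict:
    """Push one email through all five stages in order."""
    ingest(message_id, body)
    intent_worker()
    sandbox_worker()
    aggregator_worker()
    return action_worker()
```

Calling run_pipeline("msg-1", "Please verify your password at http://example.test") returns an event labeled MailShield/MALICIOUS. In the real system each stage would run as its own service and block on its input stream rather than being called in sequence.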

Processing Flow

 EMAIL ARRIVES          ANALYSIS PIPELINE              VERDICT                ACTION
      │                       │                          │                      │
      ▼                       ▼                          ▼                      ▼
┌───────────┐  ──▶  ┌───────────────────┐  ──▶  ┌────────────────┐  ──▶  ┌───────────┐
│  Gmail    │       │  Intent + Sandbox │       │  Risk Scoring  │       │  Apply    │
│  Pub/Sub  │       │  AI Analysis      │       │  0-100 Score   │       │  Labels   │
└───────────┘       └───────────────────┘       └────────────────┘       └───────────┘
                                                        │
                          ┌─────────────────────────────┼─────────────────────────────┐
                          │                             │                             │
                          ▼                             ▼                             ▼
                    ┌───────────┐               ┌───────────┐               ┌───────────┐
                    │   SAFE    │               │ CAUTIOUS  │               │  THREAT   │
                    │   0-29    │               │   30-79   │               │  80-100   │
                    └───────────┘               └───────────┘               └───────────┘

Worker Pipeline Details

Service     Port  Technology            Function
API         8000  FastAPI, SQLModel     REST endpoints, OAuth, orchestration
Dashboard   3000  Next.js 16, React 19  Real-time monitoring UI
Ingest      8001  FastAPI, httpx        Pub/Sub webhook, email fetching
Intent      8002  LangGraph, Gemini     AI intent classification
Action      8003  Gmail API             Label application, spam handling
Sandbox     8004  LangChain, OpenAI     URL/attachment threat analysis
Aggregator  8005  Redis, asyncpg        Result consolidation

AI-Powered Intent Classification

The Intent Worker uses LangGraph with Gemini AI to classify emails into 16 distinct categories. The tables below list 15 of them, grouped by risk level, with the risk score assigned to each.

Threat Categories (High Risk: 60-98)

Intent              Risk Score  Description
MALWARE             98          Malicious attachment/download links
PHISHING            95          Credential harvesting attempts
BEC_FRAUD           95          Business Email Compromise scams
SOCIAL_ENGINEERING  90          Manipulation/impersonation tactics
RECONNAISSANCE      75          Information gathering probes
SPAM                60          Unsolicited bulk messages

Business Categories (Medium Risk: 30-50)

Intent              Risk Score  Description
PAYMENT             45          Payment requests/confirmations
INVOICE             40          Invoice-related communications
SALES               30          Sales/marketing outreach

Legitimate Categories (Low Risk: 10-25)

Intent              Risk Score  Description
NEWSLETTER          25          Subscribed newsletters
SUPPORT             20          Customer support threads
MEETING_REQUEST     15          Calendar invitations
TASK_REQUEST        15          Work assignments
PERSONAL            10          Personal correspondence
FOLLOW_UP           10          Thread continuations
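
The 15 intents listed in the tables above can be captured as a simple lookup. The sketch below copies the scores from those tables; the group names and the confidence_weighted function are hypothetical. This README mentions confidence-weighted adjustments but does not give the formula, so the linear blend toward a neutral score shown here is only one plausible interpretation.

```python
# Intent -> (group, risk score), copied from the tables above.
# The group names are labels chosen for this sketch.
TAXONOMY = {
    "MALWARE": ("threat", 98),
    "PHISHING": ("threat", 95),
    "BEC_FRAUD": ("threat", 95),
    "SOCIAL_ENGINEERING": ("threat", 90),
    "RECONNAISSANCE": ("threat", 75),
    "SPAM": ("threat", 60),
    "PAYMENT": ("business", 45),
    "INVOICE": ("business", 40),
    "SALES": ("business", 30),
    "NEWSLETTER": ("legitimate", 25),
    "SUPPORT": ("legitimate", 20),
    "MEETING_REQUEST": ("legitimate", 15),
    "TASK_REQUEST": ("legitimate", 15),
    "PERSONAL": ("legitimate", 10),
    "FOLLOW_UP": ("legitimate", 10),
}


def risk_score(intent: str) -> int:
    """Return the table score for an intent; raises KeyError if unknown."""
    return TAXONOMY[intent][1]


def confidence_weighted(intent: str, confidence: float, neutral: int = 50) -> int:
    """Hypothetical adjustment: blend the table score toward a neutral value.

    confidence=1.0 returns the table score unchanged; confidence=0.0 returns
    `neutral`. The actual formula used by the project is not documented here.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    base = risk_score(intent)
    return round(neutral + confidence * (base - neutral))
```

Under this assumed formula, a PHISHING classification at full confidence keeps its score of 95, while a low-confidence one drifts toward the middle of the scale and is more likely to land in the manual-review tier.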

Risk Classification Tiers

Score Range  Tier      Gmail Label           Action
0-29         SAFE      MailShield/SAFE       No action needed
30-79        CAUTIOUS  MailShield/CAUTIOUS   Manual review suggested
80-100       THREAT    MailShield/MALICIOUS  Auto-move to spam (optional)
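
The tier boundaries above translate directly into a small mapping function. This is a minimal sketch; the function name classify is hypothetical and the project's own implementation may differ.

```python
def classify(score: int) -> tuple[str, str]:
    """Map a 0-100 risk score to (tier, Gmail label) using the table above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "THREAT", "MailShield/MALICIOUS"
    if score >= 30:
        return "CAUTIOUS", "MailShield/CAUTIOUS"
    return "SAFE", "MailShield/SAFE"
```

For example, classify(79) yields the CAUTIOUS tier and classify(80) yields THREAT, which matches the boundary between the second and third rows.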

Quick Start

Prerequisites

Python 3.12+ with uv package manager
Node.js 20.9+ with pnpm (Next.js 16 does not support Node.js 18)
PostgreSQL 15+ and Redis 7+
Google Cloud project with Gmail API & OAuth configured

1. Clone & Install

# Clone repository
git clone https://github.com/atf-inc/dec25_intern_B_security.git
cd dec25_intern_B_security

# Install uv (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install pnpm
npm install -g pnpm

# Install all dependencies
uv sync                    # Python dependencies
npm run install:all        # Node dependencies

2. Configure Environment

cp example.env .env

Edit .env with your credentials:

# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/mailshieldai

# Redis
REDIS_URL=redis://localhost:6379

# Google OAuth (from Google Cloud Console)
AUTH_GOOGLE_ID=your-client-id.apps.googleusercontent.com
AUTH_GOOGLE_SECRET=your-client-secret

# NextAuth
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=generate-with-openssl-rand-base64-32

# AI
GEMINI_API_KEY=your-gemini-api-key

# CORS
CORS_ALLOW_ORIGINS=http://localhost:3000

3. Start Services

# Terminal 1: Start PostgreSQL & Redis (via Docker)
docker run --name mailshield-db -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=mailshieldai -p 5432:5432 -d postgres:16
docker run --name mailshield-redis -p 6379:6379 -d redis:7-alpine

# Terminal 2: Initialize database
npm run db:init

# Terminal 3: Start all services
npm run dev:all

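4. Access the Application

Service    URL
Dashboard  http://localhost:3000
API        http://localhost:8000
API Docs   http://localhost:8000/docs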

API Reference

Authentication

Endpoint                     Method  Description
/api/auth/me                 GET     Get current user info

Email Operations

Endpoint                     Method  Description
/api/emails                  GET     List analyzed emails with pagination
/api/emails/ingest           POST    Manual email ingestion trigger
/api/emails/sync/background  POST    Pub/Sub webhook endpoint

Statistics

Endpoint                     Method  Description
/api/stats                   GET     Email statistics & threat counts

System

Endpoint                     Method  Description
/health                      GET     API health check
/                            GET     Worker health (ports 8001-8005)

Full API documentation: http://localhost:8000/docs


Development

Available Scripts

# Development
npm run dev              # API + Dashboard
npm run dev:all          # All 7 services
npm run dev:api          # FastAPI only
npm run dev:web          # Next.js only
npm run dev:intent       # Intent worker only
npm run dev:action       # Action worker only

# Database
npm run db:init          # Initialize/seed database
npm run db:seed          # Re-seed with sample data

# Production
npm run build:web        # Build Next.js
npm run start:all        # Start all services (PM2 compatible)

# Code Quality
npm run lint:web         # ESLint for frontend

Development Mode

Enable DEV_MODE in .env to bypass strict authentication:

DEV_MODE=true

Use dev_anytoken as the bearer token for API testing.
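
A minimal sketch of calling the API with the development token, using only the Python standard library. The helper names build_request and fetch_json are hypothetical, and the shape of the JSON response is not documented here, so the sketch simply returns whatever the endpoint sends back.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # default API address used in this README


def build_request(path: str, token: str = "dev_anytoken") -> urllib.request.Request:
    """Build a GET request carrying the bearer token in the Authorization header."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_json(path: str, token: str = "dev_anytoken"):
    """Send the request and decode the JSON body (requires the API to be running)."""
    with urllib.request.urlopen(build_request(path, token), timeout=10) as response:
        return json.load(response)
```

With the API running and DEV_MODE=true, fetch_json("/api/stats") should return the statistics payload described in the API Reference.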


Project Structure

MailShieldAI/
├── apps/
│   ├── api/                       # FastAPI Backend
│   │   ├── main.py               # App entry, CORS config
│   │   ├── routers/              # API route handlers
│   │   │   ├── auth.py           # Google OAuth endpoints
│   │   │   ├── emails.py         # Email CRUD operations
│   │   │   └── stats.py          # Dashboard statistics
│   │   └── services/             # Business logic layer
│   │
│   ├── web/                       # Next.js Dashboard
│   │   ├── app/                  # App Router pages
│   │   ├── components/           # 28 React components
│   │   ├── lib/                  # Utilities & API client
│   │   └── auth.ts               # NextAuth configuration
│   │
│   └── worker/                    # Microservices
│       ├── ingest/               # Pub/Sub message handler
│       ├── intent/               # LangGraph AI classifier
│       │   ├── graph.py          # LangGraph workflow
│       │   ├── taxonomy.py       # 16 intent categories
│       │   └── prompts.py        # Gemini prompts
│       ├── action/               # Gmail labeler
│       │   ├── main.py           # Worker entry
│       │   └── gmail_labels.py   # Label management
│       ├── analyses/             # Sandbox analyzer
│       └── aggregator/           # Result consolidator
│
├── packages/
│   └── shared/                    # Shared Python Modules
│       ├── database.py           # Async PostgreSQL
│       ├── models.py             # SQLModel schemas
│       ├── queue.py              # Redis Streams client
│       ├── types.py              # Pydantic types
│       ├── logger.py             # Structured logging
│       └── constants.py          # Enums & constants
│
├── scripts/
│   └── seed_db.py                # Database seeding
│
├── .github/workflows/
│   └── main.yml                  # CI/CD deployment
│
├── pyproject.toml                # Python dependencies (uv)
├── package.json                  # NPM scripts & Node deps
└── example.env                   # Environment template

Security Features

Feature               Implementation
Multi-Layer Analysis  Intent + Sandbox + Aggregation pipeline
OAuth 2.0             Google authentication with token refresh
CORS Protection       Strict origin validation (no wildcards with credentials)
Rate Limiting         Gmail API semaphore (5 concurrent requests)
PII Masking           Email addresses anonymized in logs
Idempotency           In-memory deduplication for processed messages
Email Auth Checks     SPF, DKIM, DMARC validation
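
Three of the rows above (rate limiting, PII masking, idempotency) lend themselves to short sketches. The code below shows one way each could look, assuming asyncio for the semaphore, a regular expression for masking, and a plain set for deduplication. All names are hypothetical, the masking format is an assumption, and fake_gmail_call stands in for a real Gmail API request.

```python
import asyncio
import re

GMAIL_CONCURRENCY = 5  # limit stated in the table above

# Keep the first character of the local part and the full domain (assumed format).
_EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+)")


def mask_email(text: str) -> str:
    """Mask every email address found in a log line."""
    return _EMAIL_RE.sub(r"\1***@\2", text)


class SeenMessages:
    """In-memory deduplication: remembers message IDs already processed."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def first_time(self, message_id: str) -> bool:
        """Return True the first time an ID is seen, False on repeats."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


async def limited(semaphore: asyncio.Semaphore, func, *args):
    """Run an async call while holding one of the semaphore's slots."""
    async with semaphore:
        return await func(*args)


async def demo(n_calls: int = 20) -> tuple[int, list[int]]:
    """Fire n_calls fake requests and report the peak number running at once."""
    semaphore = asyncio.Semaphore(GMAIL_CONCURRENCY)
    active = 0
    peak = 0

    async def fake_gmail_call(i: int) -> int:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # simulate network latency
        active -= 1
        return i

    calls = (limited(semaphore, fake_gmail_call, i) for i in range(n_calls))
    results = await asyncio.gather(*calls)
    return peak, list(results)
```

Running asyncio.run(demo()) shows that no more than five fake calls are in flight at any moment. Note that a set held in process memory is lost on restart, so this style of deduplication only covers a single worker's lifetime.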

Environment Variables

Required Variables

Variable            Description
DATABASE_URL        PostgreSQL connection string
REDIS_URL           Redis connection string
AUTH_GOOGLE_ID      Google OAuth Client ID
AUTH_GOOGLE_SECRET  Google OAuth Client Secret
NEXTAUTH_SECRET     NextAuth.js session secret
GEMINI_API_KEY      Google Gemini API key
CORS_ALLOW_ORIGINS  Allowed frontend origins
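
A startup check for the required variables can catch a misconfigured .env early. The sketch below is a hypothetical helper, not part of the repository. It assumes CORS_ALLOW_ORIGINS may hold a comma-separated list (this README only shows a single origin) and rejects a wildcard entry, in line with the CORS rule in Security Features.

```python
import os

REQUIRED_VARS = (
    "DATABASE_URL",
    "REDIS_URL",
    "AUTH_GOOGLE_ID",
    "AUTH_GOOGLE_SECRET",
    "NEXTAUTH_SECRET",
    "GEMINI_API_KEY",
    "CORS_ALLOW_ORIGINS",
)


def missing_required(env=None) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def parse_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list (assumed format) and reject wildcards."""
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    if "*" in origins:
        raise ValueError("wildcard origins are not allowed with credentials")
    return origins
```

Calling missing_required() with no argument inspects the current process environment; passing a dict makes the check easy to exercise in tests.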

Optional Variables

Variable                Default                Description
DEV_MODE                false                  Enable development mode
POLL_INTERVAL_SECONDS   5                      Worker polling interval
MOVE_MALICIOUS_TO_SPAM  true                   Auto-move threats to spam
NEXT_PUBLIC_API_URL     http://localhost:8000  API URL for frontend
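
The backend-side optional variables and their defaults could be loaded as shown below. This is a sketch under stated assumptions: the OptionalSettings class is hypothetical, and the set of strings treated as true is a guess, since this README only shows the values true and false. NEXT_PUBLIC_API_URL is left out because it is described as a frontend setting.

```python
import os
from dataclasses import dataclass

_TRUTHY = {"1", "true", "yes", "on"}  # assumed accepted spellings


def _as_bool(value: str) -> bool:
    """Interpret an environment string as a boolean."""
    return value.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class OptionalSettings:
    dev_mode: bool = False
    poll_interval_seconds: int = 5
    move_malicious_to_spam: bool = True

    @classmethod
    def from_env(cls, env=None) -> "OptionalSettings":
        """Read each variable, falling back to the documented default."""
        env = os.environ if env is None else env
        return cls(
            dev_mode=_as_bool(env.get("DEV_MODE", "false")),
            poll_interval_seconds=int(env.get("POLL_INTERVAL_SECONDS", "5")),
            move_malicious_to_spam=_as_bool(env.get("MOVE_MALICIOUS_TO_SPAM", "true")),
        )
```

OptionalSettings.from_env({}) returns the defaults from the table, so a deployment only needs to set the variables it wants to override.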

Deployment

GitHub Actions CI/CD

The repository includes automated deployment via GitHub Actions:

# .github/workflows/main.yml
# Deploys to VM on push to main branch
# Uses PM2 for process management

Required Secrets:

Secret           Description
SSH_PRIVATE_KEY  Deployment key
SSH_HOST         Target VM hostname
SSH_USER         SSH username

Production Checklist

  • Set DEV_MODE=false
  • Configure production DATABASE_URL with SSL
  • Use Cloud Redis (e.g., Memorystore)
  • Set up Google Cloud Pub/Sub for real-time sync
  • Configure proper CORS_ALLOW_ORIGINS
  • Enable Cloud Logging integration

Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit changes (git commit -m 'Add amazing feature')
4. Push to branch (git push origin feature/amazing-feature)
5. Open a Pull Request


Support

For questions or support, please open an issue on GitHub.


License

This project is proprietary. No open-source license is currently applied.


Built by the MailShieldAI Team
