A robust email processing system that can handle, analyze, and respond to emails with advanced attachment processing capabilities.
- Email Summarization: Generate concise, structured summaries of email content including key points and action items.
- Smart Reply Generation: Create context-aware email replies based on the email's purpose and content.
- Advanced Attachment Processing: Analyze various types of attachments including:
- Documents: PDF, DOCX, XLSX, PPTX, TXT, HTML
- Images: JPG, PNG, GIF with Azure Vision-powered image captioning
- Media: Various media file types with basic metadata extraction
- Deep Research: Optional integration with research APIs to provide deeper insights on email topics.
- Multiple Processing Modes: Process emails in different modes depending on your needs:
  - summary: Just generate a summary
  - reply: Just generate a reply
  - research: Perform research based on the email content
  - full: Complete processing (summary, reply, and research)
- Rich Text Formatting: Supports both HTML and plain text email responses with proper formatting
- Attachment Analysis: Provides detailed summaries of attachment contents in the email response
- Error Resilience: Graceful handling of processing errors with fallback responses
- Asynchronous Processing: Uses Dramatiq for reliable background task processing
- Scalable Architecture: Multiple workers can process emails concurrently
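The processing modes are selected by the recipient handle (summarize@, reply@, research@, ask@, as shown in the routing diagram later in this document). A hypothetical sketch of that dispatch — the actual logic lives in the email agent, and the mapping here is illustrative:

```python
# Illustrative only: map a recipient handle to a processing mode,
# following the routing shown in the flow diagram
# (summarize@ -> summary, reply@ -> reply, research@ -> research, ask@ -> full).
HANDLE_TO_MODE = {
    "summarize": "summary",
    "reply": "reply",
    "research": "research",
    "ask": "full",
}

def resolve_mode(recipient: str) -> str:
    """Pick a processing mode from the local part of the recipient address."""
    handle = recipient.split("@", 1)[0].lower()
    return HANDLE_TO_MODE.get(handle, "full")  # default to full processing

print(resolve_mode("summarize@mxtoai.com"))  # -> summary
```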
mxtoai/
├── agents/ # Agent implementations for different tasks
│ └── email_agent.py # Main email processing agent implementation
├── tools/ # Individual tool implementations
│ ├── attachment_processing_tool.py # Attachment handling
│ ├── email_reply_tool.py # Email reply generation
│ └── deep_research_tool.py # Research capabilities
├── scripts/ # Utility scripts and helpers
│ ├── visual_qa.py # Azure Vision integration for images
│ ├── citation_tools.py # Citation and reference handling
│ ├── text_web_browser.py # Web content retrieval
│ └── report_formatter.py # Email response formatting
├── attachments/ # Temporary storage for attachments
└── ai.py # AI model configurations and utilities
The system uses a message queue architecture with Dramatiq for reliable email processing:
- API Layer: Receives email requests and queues them for processing
- Message Queue: Uses RabbitMQ as the message broker (Redis handles caching)
- Worker Processes: Multiple Dramatiq workers process emails concurrently
- Error Handling: Built-in retry mechanism for failed tasks
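The enqueue/consume/retry flow above can be sketched with just the standard library. Dramatiq provides all of this (plus the RabbitMQ transport) out of the box; this is only an illustration of the shape of the architecture, not the project's actual code:

```python
# Minimal stand-in for the Dramatiq pipeline: the API layer enqueues jobs,
# worker threads consume them, and failed jobs are retried a bounded
# number of times.
import queue
import threading

MAX_RETRIES = 2
jobs: queue.Queue = queue.Queue()
results: list[str] = []

def process_email(email_id: str, attempt: int) -> None:
    # Simulate a transient failure on the first attempt for one email.
    if email_id == "flaky" and attempt == 0:
        raise RuntimeError("transient failure")
    results.append(email_id)

def worker() -> None:
    while True:
        item = jobs.get()
        if item is None:  # sentinel: shut this worker down
            jobs.task_done()
            break
        email_id, attempt = item
        try:
            process_email(email_id, attempt)
        except RuntimeError:
            if attempt < MAX_RETRIES:
                jobs.put((email_id, attempt + 1))  # re-queue for retry
        finally:
            jobs.task_done()

workers = [threading.Thread(target=worker) for _ in range(2)]
for t in workers:
    t.start()

for email_id in ["a", "flaky", "b"]:
    jobs.put((email_id, 0))
jobs.join()          # wait until every job (including retries) is done
for _ in workers:
    jobs.put(None)   # stop the workers
for t in workers:
    t.join()

print(sorted(results))  # -> ['a', 'b', 'flaky']
```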
- Python 3.12+
- RabbitMQ server (message broker) and Redis server (caching)
- Azure OpenAI API access
- Azure Vision API access (for image processing)
The project uses Poetry for dependency management. Here's how to set it up:
- First, install Poetry if you haven't already:
# On macOS/Linux/WSL
curl -sSL https://install.python-poetry.org | python3 -
# On Windows (PowerShell)
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
- Clone and set up the project:
# Clone the repository
git clone https://github.com/satwikkansal/mxtoai.git
cd mxtoai
# Install dependencies using Poetry
poetry install
# Activate the virtual environment
poetry shell
- Start RabbitMQ server:
brew services restart rabbitmq
- Start the API server:
poetry run python run_api.py
- Start the workers:
For local development, a single process with a couple of threads is enough:
poetry run dramatiq mxtoai.tasks --processes 1 --threads 2 --watch .
The project can also be run using Docker Compose, which provides an isolated environment with all required services.
- Ensure you have Docker and Docker Compose installed on your system.
- Build and start all services:
docker compose up -d
- Access the services:
- API Server: http://localhost:8000
- RabbitMQ Management: http://localhost:15672 (credentials: guest/guest)
- Redis: localhost:6379
- Ollama: localhost:11434 (optional)
- API Server: FastAPI application running on port 8000
- Worker: Background task processor using Dramatiq
- Redis: Used for caching and session management
- RabbitMQ: Message broker for task queue
- Ollama: Optional LLM service (disabled by default)
To include the Ollama service (required for local LLM processing):
docker compose --profile ollama up -d
# Stop all services
docker compose down
# Stop and remove all data volumes (this will delete all data)
docker compose down -v
- The Docker setup includes all required services (Redis, RabbitMQ) automatically
- Model configuration file (model.config.toml) should be placed in the credentials/ directory
- All services are configured to restart automatically unless stopped manually
- Data persistence is enabled for Redis, RabbitMQ, and Ollama through Docker volumes
Copy the .env.example file to .env and update with your specific configuration:
LITELLM_CONFIG_PATH=model.config.toml
# Redis configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
REDIS_PASSWORD=
# rabbitmq config
RABBITMQ_HOST=localhost
RABBITMQ_PORT=5672
RABBITMQ_USER=guest
RABBITMQ_PASSWORD=guest
RABBITMQ_VHOST=/
RABBITMQ_HEARTBEAT=60 # Default heartbeat interval in seconds
# server config
PORT=8000
HOST=0.0.0.0
LOG_LEVEL=INFO
IS_PROD=false
X_API_KEY=your_api_key
# supabase
SUPABASE_URL=your_supabase_url
SUPABASE_KEY=your_supabase_key
SUPABASE_SERVICE_ROLE_KEY=your_supabase_service_role_key
WHITELIST_SIGNUP_URL=your_whitelist_signup_url # e.g., https://yourdomain.com/
# open ai api key
AZURE_OPENAI_API_KEY=your_api_key_here
# Hugging Face Token
HF_TOKEN=your_huggingface_token
# AWS SES Configuration
AWS_REGION=your_aws_region # e.g., ap-south-1
AWS_ACCESS_KEY_ID=your_aws_access_key_id
AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key
SENDER_EMAIL=your_sender_email@domain.com
# External services
JINA_API_KEY="YOUR_JINA_API_KEY" # Leave blank if not using deep research
BRAVE_SEARCH_API_KEY=""
RAPIDAPI_KEY=""
# For image processing
AZURE_VISION_ENDPOINT=your-azure-vision-endpoint
AZURE_VISION_KEY=your-azure-vision-key
# For web search functionality
SERPAPI_API_KEY=your-serpapi-api-key
This project supports load balancing and routing across multiple models, so you can define as many models as you'd like. Copy credentials/model.config.example.toml to a new TOML file in the same directory and update it with your preferred configuration. Then set the path to that TOML file, relative to the project root, in .env.
A sample configuration looks like this:
[[model]]
model_name = "gpt-4"
[model.litellm_params]
model = "azure/gpt-4"
base_url = "https://your-endpoint.openai.azure.com"
api_key = "your-key"
api_version = "2023-05-15"
weight = 5
It is also recommended to set the router configuration; if not set, it defaults to the config below.
[router_config]
routing_strategy = "simple-shuffle"
[[router_config.fallbacks]]
gpt-4 = ["gpt-4-reasoning"]
[router_config.default_litellm_params]
drop_params = true
POST /process-email
{
"message": "Email received and queued for processing",
"email_id": "1743315736--3926735152910876943",
"attachments_saved": 1,
"status": "processing"
}
The email will be processed asynchronously by the workers. You can implement a status check endpoint to monitor the processing status.
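A client call might look like the following sketch. The payload field names and the X-API-Key header are assumptions (the header name is inferred from the X_API_KEY env var); consult the actual request schema. The network call is shown commented out so the snippet runs offline against the sample response above:

```python
# Illustrative client for POST /process-email. Field names and the
# X-API-Key header are assumptions, not the documented schema.
import json

payload = {
    "from": "alice@example.com",
    "to": "summarize@mxtoai.com",
    "subject": "Q3 planning notes",
    "body": "Please summarize the attached notes.",
}
body = json.dumps(payload).encode()

# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/process-email",
#     data=body,
#     headers={"Content-Type": "application/json", "X-API-Key": "your_api_key"},
# )
# raw = urllib.request.urlopen(req).read()

# Sample response from the docs above:
raw = b'{"message": "Email received and queued for processing", "email_id": "1743315736--3926735152910876943", "attachments_saved": 1, "status": "processing"}'
resp = json.loads(raw)
print(resp["status"])  # -> processing
```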
graph TD
A[Incoming Email] --> B[Email Routing]
B --> C{Determine Mode}
C -->|summarize@| D[Summary Mode]
C -->|reply@| E[Reply Mode]
C -->|research@| F[Research Mode]
C -->|ask@| G[Full Mode]
%% Attachment Processing
A --> H[Attachment Detection]
H --> I{File Type}
I -->|Images| J[Azure Vision Analysis]
I -->|Documents| K[Document Processing]
I -->|Other| L[Metadata Extraction]
J --> M[Generate Captions]
K --> N[Extract Content]
L --> O[Basic Info]
M & N & O --> P[Attachment Summary]
%% Mode Processing
D & E & F & G --> Q[Process Request]
P --> Q
Q --> R[Format Response]
R --> S[Generate HTML]
R --> T[Generate Text]
%% Error Handling
Q --> U{Errors?}
U -->|Yes| V[Fallback Response]
U -->|No| R
%% Final Response
S & T --> W[Final Response]
V --> W
%% Styling
classDef email fill:#f9f,stroke:#333,stroke-width:2px
classDef process fill:#bbf,stroke:#333,stroke-width:2px
classDef error fill:#fbb,stroke:#333,stroke-width:2px
class A,B email
class D,E,F,G,Q,R process
class U,V error
# Start the FastAPI server
uvicorn api:app --reload
The system now supports:
- Automatic content extraction from documents
- Azure Vision-powered image analysis and captioning
- Fallback processing for unsupported file types
- Size-aware content summarization
- Error resilient processing
- Rich text formatting with markdown support
- Both HTML and plain text versions
- Automatic signature handling
- Attachment content integration
- Professional formatting with sections and highlights
- Graceful degradation on processing failures
- Detailed error tracking and reporting
- Fallback responses for partial failures
- Comprehensive error logging
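Graceful degradation usually boils down to catching per-stage failures, logging them, and substituting a fallback so the email still gets a partial response. A hypothetical sketch of that pattern (not the project's actual code):

```python
# Hypothetical illustration of fallback handling: each processing stage may
# fail independently; failures are logged and replaced with a fallback
# value instead of aborting the whole response.
import logging

logger = logging.getLogger("mxtoai.example")

def summarize(text: str) -> str:
    raise RuntimeError("model timeout")  # simulate a stage failure

def safe_stage(stage, text: str, fallback: str) -> str:
    try:
        return stage(text)
    except Exception:
        logger.exception("stage %s failed, using fallback", stage.__name__)
        return fallback

result = safe_stage(summarize, "long email body...", fallback="Summary unavailable.")
print(result)  # -> Summary unavailable.
```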
The project uses Locust for load testing various email processing scenarios.
- Install and setup:
pip install locust
mkdir -p test_files
# Add 2-3 PDF files to the test_files/ directory
- Run tests:
# Interactive mode (Recommended)
locust --host=http://localhost:8000
# Or headless mode
poetry run locust --host=http://localhost:9192 --users 10 --spawn-rate 2 --run-time 1m --headless
- Simple queries (50%): Complex questions to summarise@mxtoai.com
- Translation requests (20%): Technical content to translate@mxtoai.com
- Document analysis (30%): PDF attachments to ask@mxtoai.com
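The 50/20/30 scenario mix above is weighted task selection; Locust expresses it through task weights in locustfile.py. The resulting traffic distribution can be sketched in plain Python:

```python
# Plain-Python sketch of the weighted scenario mix used in the load test.
# Locust achieves the same effect via task weights; this just demonstrates
# the resulting traffic distribution.
import random
from collections import Counter

random.seed(42)  # deterministic for the example
scenarios = ["simple_query", "translation", "document_analysis"]
weights = [50, 20, 30]

picks = random.choices(scenarios, weights=weights, k=10_000)
counts = Counter(picks)
share = {name: counts[name] / len(picks) for name in scenarios}
print(share)  # roughly {'simple_query': 0.5, 'translation': 0.2, 'document_analysis': 0.3}
```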
- Real-time stats in Web UI (http://localhost:8089)
- System metrics in results/system_stats.csv
- HTML report and logs in results/ directory
For detailed configuration, check locustfile.py.
Contributions are welcome! Please feel free to submit a Pull Request.