ChatX - Privacy-Focused Chat Analysis

ChatX is a privacy-focused, local-first CLI tool for forensic chat analysis. It extracts, transforms, and analyzes chat data from multiple platforms while maintaining strict privacy controls.

Features

Multi-Platform Support: iMessage and Instagram DM today; WhatsApp and plain-text files are planned
Privacy-First: Local processing by default; optional cloud LLM integration runs only after redaction
Differential Privacy: (ε, δ)-differentially private statistical aggregation, intended to make aggregate insights safe to share with cloud services
Lossless Extraction: Preserves original data fidelity via source metadata
Schema-Driven: Well-defined JSON schemas for all data formats
Extensible Architecture: Plugin-based extractors for new platforms
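
To make the differential privacy feature more concrete, here is a minimal sketch of the classic Gaussian mechanism applied to a message count. It is not ChatX's implementation: the names `gaussian_sigma` and `dp_count` are hypothetical, and the noise calibration σ = Δ·sqrt(2·ln(1.25/δ))/ε is the textbook one, which only holds for 0 < ε < 1.

```python
import math
import random
from typing import Optional


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float = 1.0) -> float:
    """Noise scale for the classic Gaussian mechanism (valid for 0 < epsilon < 1)."""
    if not 0 < epsilon < 1:
        raise ValueError("epsilon must be in (0, 1) for this calibration")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


def dp_count(
    true_count: int,
    epsilon: float,
    delta: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a noised count.

    Assumption: adding or removing one message changes the count by at most 1,
    so the sensitivity is 1.
    """
    rng = rng or random.Random()
    sigma = gaussian_sigma(epsilon, delta, sensitivity=1.0)
    return true_count + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    # Hypothetical example: release a noised message count.
    print(dp_count(1200, epsilon=0.5, delta=1e-5))
```

Smaller ε or δ gives a larger σ, so the released count is noisier but reveals less about any single message.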

Quick Start

Installation

# Clone the repository
git clone https://github.com/hue/Dopemux-ChatRipperXX.git
cd Dopemux-ChatRipperXX

# Install in development mode
pip install -e ".[dev]"

Basic Usage

# Extract iMessage for a contact
chatx imessage pull --contact "+15551234567" --db ~/Library/Messages/chat.db --out ./out

# Extract iMessage from an iPhone backup (Finder/iTunes MobileSync)
chatx imessage pull --contact "+15551234567" --from-backup "$HOME/Library/Application Support/MobileSync/Backup/<UDID>" --out ./out

# Extract Instagram DMs for a single user (--user is required)
chatx instagram pull --zip ./instagram.zip --user "Your Name" --out ./out

# Audit iMessage DB for missing local attachments (report-only)
chatx imessage audit --db ~/Library/Messages/chat.db --out ./out

# Ingest PDF conversation export (text-first, OCR fallback)
chatx imessage pdf --pdf ./conversation.pdf --me "Your Name" --out ./out

# Run full pipeline with privacy redaction
chatx pipeline ~/Library/Messages/chat.db ./output --provider local

# Get help
chatx --help

Architecture

ChatX follows a pipeline architecture:

Extract: Platform-specific extractors convert native formats to canonical JSON
Transform: Data transformation pipeline normalizes and validates messages
Redact: Policy Shield removes sensitive information before any cloud processing
Enrich: Optional LLM processing adds semantic metadata
Analyze: Generate insights and reports from enriched data
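
The five stages above can be pictured as plain functions composed in order. The sketch below is illustrative only: the message fields, the function names, and the phone-number regex are assumptions, not ChatX internals.

```python
import re
from typing import Callable, Dict, List, Optional

Message = Dict[str, object]

# Hypothetical pattern; a real redaction policy would cover many more categories.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract(raw_rows: List[Dict[str, str]]) -> List[Message]:
    """Extract: map native rows to a canonical message shape."""
    return [{"sender": r["handle"], "text": r["body"]} for r in raw_rows]


def transform(messages: List[Message]) -> List[Message]:
    """Transform: normalize whitespace and drop empty messages."""
    out: List[Message] = []
    for m in messages:
        text = " ".join(str(m["text"]).split())
        if text:
            out.append({**m, "text": text})
    return out


def redact(messages: List[Message]) -> List[Message]:
    """Redact: mask phone numbers before any later stage sees the text."""
    return [{**m, "text": PHONE_RE.sub("[PHONE]", str(m["text"]))} for m in messages]


def enrich(
    messages: List[Message],
    labeler: Optional[Callable[[str], str]] = None,
) -> List[Message]:
    """Enrich: optional; skipped entirely when no labeler is supplied."""
    if labeler is None:
        return messages
    return [{**m, "label": labeler(str(m["text"]))} for m in messages]


def analyze(messages: List[Message]) -> Dict[str, int]:
    """Analyze: a trivial per-sender message count."""
    counts: Dict[str, int] = {}
    for m in messages:
        sender = str(m["sender"])
        counts[sender] = counts.get(sender, 0) + 1
    return counts


def run_pipeline(
    raw_rows: List[Dict[str, str]],
    labeler: Optional[Callable[[str], str]] = None,
) -> Dict[str, int]:
    """Run all five stages; the labeler only ever receives redacted text."""
    return analyze(enrich(redact(transform(extract(raw_rows))), labeler))
```

Placing `redact` before `enrich` in the composition is what keeps unredacted text away from the optional LLM step in this sketch.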

Privacy Principles

Local-First Processing: All sensitive operations happen locally
Explicit Consent: Cloud LLM integration requires explicit user consent
Redaction Transparency: Detailed reports show what data was removed
Source Preservation: Original data preserved in source_meta fields
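
The consent and transparency principles above could be enforced in code roughly as follows. This is a sketch under stated assumptions: the patterns, `RedactionResult`, `redact_with_report`, and `prepare_cloud_payload` are hypothetical names and do not describe the actual Policy Shield. The report records how many items of each category were removed, without echoing the removed values.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical patterns; a real policy would cover far more categories.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


@dataclass
class RedactionResult:
    text: str  # redacted text
    report: Dict[str, int] = field(default_factory=dict)  # category -> removals


def redact_with_report(text: str) -> RedactionResult:
    """Replace each match with a category tag and count what was removed."""
    report: Dict[str, int] = {}
    for category, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{category.upper()}]", text)
        if n:
            report[category] = n
    return RedactionResult(text=text, report=report)


def prepare_cloud_payload(result: RedactionResult, user_consented: bool) -> str:
    """Refuse to produce a cloud payload unless the user has explicitly opted in."""
    if not user_consented:
        raise PermissionError("cloud processing requires explicit consent")
    return result.text
```

Because `prepare_cloud_payload` accepts only a `RedactionResult`, a caller cannot hand it raw text by accident, and the consent flag has to be passed explicitly on every call.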

Supported Platforms

Platform	Status	Data Source
iMessage	✅ Complete	macOS chat.db SQLite database, iPhone backup, or PDF export
Instagram DM	✅ Initial extractor	Official data ZIP export
WhatsApp	🚧 Planned	Text export files
Generic Text	🚧 Planned	Plain text conversation files
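
Since two platforms are still planned and the extractors are described as plugin-based, a new platform might be added along the following lines. This is a hypothetical sketch, not the project's plugin API: the registry, the `register` decorator, and the one-message-per-line `Sender: text` input format are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List

Extractor = Callable[[Iterable[str]], List[dict]]

# Hypothetical in-process registry mapping a platform name to its extractor.
_REGISTRY: Dict[str, Extractor] = {}


def register(platform: str) -> Callable[[Extractor], Extractor]:
    """Decorator that registers an extractor under a platform name."""
    def decorator(fn: Extractor) -> Extractor:
        if platform in _REGISTRY:
            raise ValueError(f"extractor already registered for {platform!r}")
        _REGISTRY[platform] = fn
        return fn
    return decorator


@register("generic-text")
def extract_generic_text(lines: Iterable[str]) -> List[dict]:
    """Assumes one message per line in the form 'Sender: text'."""
    messages: List[dict] = []
    for line_no, line in enumerate(lines, start=1):
        sender, sep, text = line.partition(":")
        if not sep or not text.strip():
            continue  # skip lines that do not look like messages
        messages.append({
            "platform": "generic-text",
            "sender": sender.strip(),
            "text": text.strip(),
            # Keep the untouched input alongside the canonical fields.
            "source_meta": {"line_no": line_no, "raw": line},
        })
    return messages


def get_extractor(platform: str) -> Extractor:
    """Look up a registered extractor or fail with a clear error."""
    try:
        return _REGISTRY[platform]
    except KeyError:
        raise LookupError(f"no extractor registered for {platform!r}") from None
```

For example, `get_extractor("generic-text")(["Alice: hi"])` returns one canonical message whose `source_meta` still holds the raw line.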

Development

This project follows modern Python packaging standards with:

Package Management: pyproject.toml with optional dependencies
Code Quality: ruff for linting, mypy for type checking
Testing: pytest with coverage reporting
Documentation: mkdocs with material theme
CI/CD: GitHub Actions for testing and docs deployment

Project Structure

src/chatx/           # Main package
├── cli/            # Command-line interface
├── extractors/     # Platform-specific extractors  
├── schemas/        # Pydantic data models
├── transformers/   # Data transformation pipeline
├── redaction/      # Privacy redaction system
├── enrichment/     # LLM integration
└── utils/          # Common utilities

tests/              # Test suite
├── unit/           # Unit tests
├── integration/    # Integration tests  
└── fixtures/       # Test data

docs/               # Documentation
schemas/            # JSON Schema definitions
config/             # Configuration files

Running Tests

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=src/chatx --cov-report=html

Documentation

Documentation is built with MkDocs and deployed automatically by CI. To preview or build it locally:

# Serve docs locally  
mkdocs serve

# Build docs
mkdocs build

Contributing

Read the Contributing Guide
Review the Code of Conduct
Check existing Issues
Follow the development process in CLAUDE.md

License

MIT License - see LICENSE for details.

Security

This tool handles sensitive personal data. Please review the Security Threat Model, see our Security & Vulnerability Reporting guidance, and follow security best practices.
