I wanted full-text search for my claude.ai chats, so I made this.
- Multi-provider: currently Claude and ChatGPT.
- Multi-account per provider.
- Smart search ranking
- Hyperlinks in the results: easy resume when you've found that chat
- JSON output supported when searching
- Local view: copy chats to Markdown or HTML, open in
$EDITOR - Non-destructive sync: preserves a chat even if you deleted it on the website. Export/sync the last 30 days only and it'll preserve your older chats.
- Export backup: automatic archive of your data export zipfiles
- UUID tracking: Correctly handles conversation renames
- Simple: just a folder of python scripts. Works with system python.
- Completely offline
git clone git@github.com:marisawallace/clauding-at-home.git
cd clauding-at-home
chmod +x sync_local_chats_archive.py
chmod +x full_text_search_chats_archive.py
chmod +x view_conversation.py
# EDIT THIS IF YOU DON'T WANT TO SEARCH IN ~/Downloads
cat > .env << 'EOF'
# Where to search for export zip files (optional, defaults to current directory)
ZIP_SEARCH_DIR=~/Downloads
EOF
I highly recommend adding aliases to your .bashrc or equivalent.
alias ccs="cd $CODE_HOME/clauding-at-home/"
alias cs="python3 $CODE_HOME/clauding-at-home/full_text_search_chats_archive.py"
alias csscl="python3 $CODE_HOME/clauding-at-home/sync_local_chats_archive.py --claude"
alias cssch="python3 $CODE_HOME/clauding-at-home/sync_local_chats_archive.py --chatgpt"
alias csv="python3 $CODE_HOME/clauding-at-home/view_conversation.py"
alias csvh="python3 $CODE_HOME/clauding-at-home/view_conversation.py --format html"
Make sure you have $EDITOR set.
export VISUAL=subl
export EDITOR="$VISUAL"
- https://claude.ai/settings/data-privacy-controls
- Click "Export data"
- Download the .zip file
your-alias,csscl, orpython3 sync_local_chats_archive.py --claude
- https://chatgpt.com/#settings/DataControls
- Click "Export data"
- Download the .zip file
your-alias,cssch, orpython3 sync_local_chats_archive.py --chatgpt
The sync script will:
- Find all export zip files matching the provider's pattern
- Extract and organize conversations/projects by provider and user email
- Update existing conversations (matched by UUID)
- Preserve locally archived chats that were deleted from the provider
- Handle duplicate filenames with numeric suffixes
- Move processed zip files to
data/archived_exports/{provider}/
The sync script includes multiple safety mechanisms:
- Dual UUID verification: Matches both conversation UUID and account UUID before updates
- Cross-account protection: Won't delete files if account UUIDs don't match
- Collision detection: Logs warnings if UUID conflicts are detected across accounts
- Non-destructive by design: Preserves files that don't match current export
- Validation checks: Verifies export format before processing
Then everything should just work!
# Search
cs "hi claude"
# View (copy/paste the UUID from the search results)
# This will open that conversation in a persistent markdown file.
# You can edit this file without destroying any of the original
# export data.
csv UUID
# Directly open the top 3 results for "books" in your `$EDITOR`
cs books -o 3
# JSON output
cs books -j > results.json
clauding-at-home/
├── data/ # Sync this entire folder (e.g. with MEGA)
│ ├── llm_data/ # Organized chat archives
│ │ ├── claude/
│ │ │ └── user@example.com/
│ │ │ ├── conversations/
│ │ │ │ └── YYYY-MM-DD_Title.json
│ │ │ ├── projects/
│ │ │ │ └── YYYY-MM-DD_Project.json
│ │ │ └── user.json
│ │ └── chatgpt/
│ │ └── user@example.com/
│ │ ├── conversations/
│ │ │ └── YYYY-MM-DD_Title.json
│ │ └── user.json
│ ├── archived_exports/ # Processed export zip files
│ │ ├── claude/
│ │ │ └── data-YYYY-MM-DD-*.zip
│ │ └── chatgpt/
│ │ └── [hex]-YYYY-MM-DD-*.zip
│ └── local_views/ # Generated Markdown/HTML views
│ ├── claude/
│ │ ├── {uuid}.md
│ │ └── {uuid}.html
│ └── chatgpt/
│ ├── {uuid}.md
│ └── {uuid}.html
├── migrations/ # One-time data migration scripts
│ └── 001_consolidate_data_dirs.py
├── sync_local_chats_archive.py # Import and sync exports
├── full_text_search_chats_archive.py # Search conversations
└── view_conversation.py # View conversations as MD/HTML
The official Claude.ai data export does not fully preserve forked conversations. Specifically:
- Human messages from all branches are included in the export (as consecutive same-sender entries in
chat_messages). - Assistant responses from non-selected branches are missing. Only the response from the branch you last had selected is exported.
This means search results may not include text from assistant responses in branches you didn't select. There is no workaround within this tool since the data simply isn't present in the export.
Workarounds:
- Before exporting, revisit conversations with important forks and switch to each branch you care about (the export appears to capture whichever branch is active).
- Python: 3.7 or higher
- Dependencies: None (uses standard library only)
To run the test suite:
# Option 1: Virtual environment (works on all platforms)
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements-test.txt
pytest
# Option 2: System package manager
# Debian/Ubuntu: sudo apt install python3-pytest
# Fedora: sudo dnf install python3-pytest
# Arch: sudo pacman -S python-pytest
# macOS: brew install pytest
# Run all tests
pytest
# Run with verbose output
pytest -v
# Run specific test file
pytest tests/integration/test_sync_workflow.pySee tests/README.md for detailed testing documentation, including test structure, fixtures, and debugging tips.
This tool is designed to be extensible. To add support for a new provider:
- Create a new
Providersubclass insync_local_chats_archive.py - Implement the required methods (
name(),find_exports(),extract_data(),validate()) - Add provider-specific URL generation to
SearchResult.get_provider_url() - Update documentation with export format details
