rmichelena/x_to_raindrop

X Bookmarks to Raindrop.io Converter

📋 Project Overview

This project converts Twitter/X bookmarks exported as JSON into a CSV format compatible with Raindrop.io, using OpenAI's GPT-3.5-turbo for intelligent title generation and folder/tag reclassification.

Supports two export formats:

  1. Twitter Bookmark Exporter (Chrome extension) - July 2025
  2. BirdBear (app/extension) - August 2025

🎯 Goal

Transform JSON files containing Twitter bookmarks into a clean, organized CSV that can be imported into Raindrop.io with:

  • Intelligent title generation (concise one-liners)
  • Smart folder and tag reclassification
  • Proper nested folder structure under "X/"
  • Clean tag taxonomy (removing low-usage tags)

📁 Project Structure

X Bookmarks to Raindrop/
├── README.md                           # This file
├── sources/                            # Source JSON export files
│   ├── twitter_exporter/               # Twitter Bookmark Exporter format
│   │   ├── twitter_bookmarks.json              # Original export
│   │   └── twitter_bookmarks_tagged_full.json  # Manually enhanced (1,778 bookmarks)
│   └── birdbear/                       # BirdBear format
│       └── birdbear-export-2025-08-16T21_20_47.259Z.json
├── scripts/                            # Python conversion scripts
│   ├── shared/                         # Scripts that work with both formats
│   │   ├── reclassify_with_openai.py  # Main conversion script (supports both formats)
│   │   ├── analyze_folders_tags.py    # Extract folders/tags from JSON
│   │   ├── analyze_tags.py            # Analyze tag usage in CSV
│   │   ├── clean_tags.py              # Remove low-usage tags
│   │   └── setup_openai.py            # OpenAI API key setup utility
│   └── twitter_exporter/               # Scripts specific to Twitter Exporter format
│       ├── json_to_raindrop_csv.py    # Basic conversion with OpenAI titles
│       ├── create_clean_csv.py        # Clean CSV generation
│       └── create_raindrop_json.py    # JSON format generation
├── outputs/                            # Generated CSV/JSON files
│   ├── twitter_exporter/               # Outputs from Twitter Exporter format
│   │   ├── raindrop_format.csv        # Final output (all bookmarks)
│   │   ├── raindrop_cleaned.csv        # Cleaned output (tags with ≥5 uses)
│   │   └── [other CSV variations]
│   ├── birdbear/                       # Outputs from BirdBear format
│   │   └── birdbear_reclassified.csv
│   └── test/                           # Test/sample outputs
│       └── test_*.csv
└── references/                         # Reference files and configuration
    ├── APIKEY.txt                      # OpenAI API key (user-provided)
    ├── folders_list.txt                # Extracted folder list for AI prompts
    └── tags_list.txt                   # Extracted tag list for AI prompts

🔧 Dependencies

pip install openai

Environment Variables

  • OPENAI_API_KEY: Your OpenAI API key (set via scripts/shared/setup_openai.py or manually)
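
As a rough sketch of how a script might locate the key, the helper below checks OPENAI_API_KEY first and then falls back to references/APIKEY.txt. The function name load_api_key and the lookup order are assumptions for illustration; the actual scripts may resolve the key differently.

```python
import os
from pathlib import Path


def load_api_key(key_file: str = "references/APIKEY.txt") -> str:
    """Return the API key from OPENAI_API_KEY, else from a key file.

    Hypothetical helper: the environment-first order is an assumption.
    """
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if key:
        return key
    path = Path(key_file)
    if path.is_file():
        key = path.read_text(encoding="utf-8").strip()
        if key:
            return key
    raise RuntimeError(f"No OpenAI API key found in OPENAI_API_KEY or {key_file}")
```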

📊 Supported Export Formats

1. Twitter Bookmark Exporter Format

  • Source: Chrome extension "Twitter Bookmark Exporter"
  • Format: Array of bookmark objects with text, author, timestamp, link, id
  • Files: sources/twitter_exporter/twitter_bookmarks.json
  • Scripts: scripts/twitter_exporter/*.py or scripts/shared/reclassify_with_openai.py

2. BirdBear Format

  • Source: BirdBear app/extension
  • Format: JSON object with version, exported_at, tweets array containing full tweet metadata
  • Files: sources/birdbear/birdbear-export-*.json
  • Scripts: scripts/shared/reclassify_with_openai.py (auto-detects format)
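
The two shapes described above are different enough that detection can be based on the top-level JSON type. The sketch below is one plausible way to do it; detect_format and iter_bookmarks are hypothetical names, and the real auto-detection in reclassify_with_openai.py may check other fields.

```python
def detect_format(data):
    """Guess the export format from the top-level JSON shape (assumed rule)."""
    # BirdBear: an object with a "tweets" array.
    if isinstance(data, dict) and isinstance(data.get("tweets"), list):
        return "birdbear"
    # Twitter Bookmark Exporter: a bare array of bookmark objects.
    if isinstance(data, list):
        return "twitter_exporter"
    raise ValueError("Unrecognized bookmark export format")


def iter_bookmarks(data):
    """Yield raw bookmark dicts regardless of which format was detected."""
    items = data["tweets"] if detect_format(data) == "birdbear" else data
    for item in items:
        yield item
```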

🚀 Quick Start

For Twitter Bookmark Exporter Format

1. Set Up OpenAI API Key

python3 scripts/shared/setup_openai.py

Or manually set: export OPENAI_API_KEY="your-key-here"

2. (Optional) Deduplicate New Export

If you've exported bookmarks again and there's overlap with a previous export:

# Remove duplicates from newer export
python3 scripts/twitter_exporter/deduplicate_exports.py \
  sources/twitter_exporter/older_export.json \
  sources/twitter_exporter/newer_export.json \
  -o sources/twitter_exporter/new_bookmarks_only.json

3. Prepare Enhanced JSON (Manual Step - Optional)

  • Start with sources/twitter_exporter/twitter_bookmarks.json (original export)
  • Manually add "folder" and "tags" fields to each bookmark
  • Save as sources/twitter_exporter/twitter_bookmarks_tagged_full.json

4. Extract Folder/Tag Lists (if using tagged JSON)

python3 scripts/shared/analyze_folders_tags.py

5. Generate Raindrop CSV

# Using the universal script (recommended)
python3 scripts/shared/reclassify_with_openai.py sources/twitter_exporter/twitter_bookmarks.json -o outputs/twitter_exporter/raindrop_reclassified.csv

# Or using format-specific script
python3 scripts/twitter_exporter/json_to_raindrop_csv.py

6. Clean Tags (Recommended)

python3 scripts/shared/clean_tags.py outputs/twitter_exporter/raindrop_format.csv outputs/twitter_exporter/raindrop_cleaned.csv

For BirdBear Format

1. Set Up OpenAI API Key

python3 scripts/shared/setup_openai.py

2. Generate Raindrop CSV

# Auto-detects BirdBear format
python3 scripts/shared/reclassify_with_openai.py sources/birdbear/birdbear-export-*.json -o outputs/birdbear/birdbear_reclassified.csv

# Test with first 10 bookmarks
python3 scripts/shared/reclassify_with_openai.py sources/birdbear/birdbear-export-*.json -o outputs/test/test_birdbear.csv -t 10

3. Clean Tags (Recommended)

python3 scripts/shared/clean_tags.py outputs/birdbear/birdbear_reclassified.csv outputs/birdbear/birdbear_cleaned.csv

4. Import to Raindrop.io

Upload the cleaned CSV file from outputs/ to Raindrop.io

📋 Scripts Documentation

Core Scripts

scripts/shared/reclassify_with_openai.py 🎯 RECOMMENDED

Purpose: Universal conversion script that supports both export formats

  • Auto-detects format (Twitter Exporter or BirdBear)
  • Generates concise titles (max 100 chars) using GPT-3.5-turbo
  • Reclassifies folders and tags based on content analysis
  • Outputs CSV in exact Raindrop.io export format
  • Single OpenAI API call per bookmark (cost-optimized)

Usage:

python3 scripts/shared/reclassify_with_openai.py <input.json> -o <output.csv> [-t N]

Key Features:

  • Format auto-detection
  • Single OpenAI API call per bookmark
  • Text cleaning for CSV compatibility
  • ISO 8601 timestamp format
  • Nested folders under "X/"
  • Adds "twitter" tag to all entries
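
For orientation, the command line shown in the usage above could be parsed roughly as follows. This is only a sketch: the long option names (--output, --test), making -o required, and the limit helper are assumptions, not the script's confirmed interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Convert X bookmarks JSON to a Raindrop.io CSV")
    parser.add_argument("input", help="Path to the exported JSON file")
    # Long option names below are illustrative assumptions.
    parser.add_argument("-o", "--output", required=True,
                        help="Path of the CSV to write")
    parser.add_argument("-t", "--test", type=int, default=None, metavar="N",
                        help="Only process the first N bookmarks")
    return parser


def limit(bookmarks, n):
    """Return the first n bookmarks, or all of them when n is None."""
    return bookmarks if n is None else bookmarks[:n]
```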

scripts/shared/clean_tags.py 🧹

Purpose: Removes tags with fewer than 5 uses to create a cleaner taxonomy

  • Reduces the number of unique tags significantly (about 84% in the example run, from 1,247 to 203)
  • Maintains meaningful tags only
  • Provides detailed analysis of tag usage

Usage:

python3 scripts/shared/clean_tags.py <input.csv> <output.csv> [min_uses=5]
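
The core of the cleanup can be sketched as a two-pass filter: count every tag, then keep only those that reach the threshold. The version below works on the raw contents of the tags column; the function names and the ", " separator are assumptions rather than the script's actual code.

```python
from collections import Counter


def split_tags(cell: str) -> list[str]:
    """Split one comma-separated tags cell into trimmed tag names."""
    return [t.strip() for t in cell.split(",") if t.strip()]


def clean_tags(tag_cells: list[str], min_uses: int = 5) -> list[str]:
    """Drop tags used fewer than min_uses times across all rows."""
    counts = Counter(tag for cell in tag_cells for tag in split_tags(cell))
    keep = {tag for tag, n in counts.items() if n >= min_uses}
    return [", ".join(t for t in split_tags(cell) if t in keep)
            for cell in tag_cells]
```

Because every row carries the "twitter" tag, that tag always survives as long as the file has at least min_uses rows.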

Twitter Exporter Specific Scripts

scripts/twitter_exporter/deduplicate_exports.py 🔄 NEW

Purpose: Remove duplicates from a newer export by comparing it with an older export

  • Compares two Twitter Bookmark Exporter JSON files
  • Uses the tweet ID (id field or extracted from URL) to identify duplicates
  • Outputs only the new bookmarks that don't exist in the older export
  • Useful when the Chrome extension exports all current bookmarks (not just new ones)

Usage:

python3 scripts/twitter_exporter/deduplicate_exports.py older.json newer.json -o new_only.json [-v]

Key Features:

  • Automatic ID extraction from id field or URL
  • Handles bookmarks without IDs gracefully
  • Detailed statistics and verbose mode
  • Preserves all bookmark data structure
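
A minimal sketch of the ID-based comparison follows. It assumes the tweet URL lives in the link field and contains /status/<digits>, and it keeps bookmarks that have no recoverable ID; both are assumptions, and tweet_id and new_only are hypothetical names.

```python
import re

STATUS_RE = re.compile(r"/status/(\d+)")


def tweet_id(bookmark: dict):
    """Return the tweet ID from the id field, else from the link URL, else None."""
    if bookmark.get("id"):
        return str(bookmark["id"])
    match = STATUS_RE.search(bookmark.get("link") or "")
    return match.group(1) if match else None


def new_only(older: list, newer: list) -> list:
    """Return bookmarks from newer whose ID does not appear in older."""
    seen = {tid for tid in map(tweet_id, older) if tid is not None}
    # Bookmarks without an ID cannot be matched, so they are kept (assumption).
    return [b for b in newer if tweet_id(b) is None or tweet_id(b) not in seen]
```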

scripts/twitter_exporter/json_to_raindrop_csv.py

Purpose: Basic conversion with OpenAI title generation

  • Works with twitter_bookmarks_tagged_full.json
  • Generates titles using OpenAI
  • Simple CSV output

scripts/twitter_exporter/create_clean_csv.py

Purpose: Clean CSV generation with folder/tag reclassification

  • Uses folders_list.txt and tags_list.txt
  • Single API call for title + classification

scripts/twitter_exporter/create_raindrop_json.py

Purpose: Generate JSON format compatible with Raindrop.io

  • Creates structured JSON output
  • Includes folder and tag classification

Utility Scripts

scripts/shared/setup_openai.py 🔑

Purpose: Secure OpenAI API key setup

  • Prompts for API key securely (no echo)
  • Sets environment variable
  • Validates key format

scripts/shared/analyze_folders_tags.py 📊

Purpose: Extract unique folders and tags from source JSON

  • Creates references/folders_list.txt and references/tags_list.txt
  • Used as reference lists for OpenAI reclassification
  • Provides usage statistics

Note: Update the hardcoded script paths if running from a different directory:

analyze_bookmarks("../sources/twitter_exporter/twitter_bookmarks_tagged_full.json")

scripts/shared/analyze_tags.py 📈

Purpose: Analyze tag distribution in generated CSV

  • Counts total vs unique tags
  • Shows most popular tags
  • Helps understand Raindrop.io import statistics

Note: Update the hardcoded script paths if running from a different directory:

analyze_tags("../outputs/twitter_exporter/raindrop_format.csv")
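
The distinction between total tag uses and unique tags can be illustrated with a short counter over the tags column. This is a sketch, not the script itself: tag_stats is a hypothetical name, and it reads CSV text from a string to stay self-contained.

```python
import csv
import io
from collections import Counter


def tag_stats(csv_text: str, top: int = 5) -> dict:
    """Count total tag uses, unique tags, and the most common tags."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = row.get("tags") or ""
        counts.update(t.strip() for t in cell.split(",") if t.strip())
    return {
        "total_uses": sum(counts.values()),
        "unique": len(counts),
        "top": counts.most_common(top),
    }
```

Raindrop.io's import summary corresponds to total_uses here, which is why it can be much larger than the number of distinct tags.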

🔍 Key Features

OpenAI Integration

  • Model: GPT-3.5-turbo
  • Single API Call: Combines title generation + classification
  • Rate Limiting: 0.2s delay between calls
  • Cost Optimization: ~$5-10 for 1,778 bookmarks
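
The call pattern described above (one request per bookmark, with a fixed pause between requests) can be sketched as a simple loop. Here classify is a hypothetical callable standing in for the single OpenAI request that returns title, folder, and tags together; it is injected so the loop can be exercised without network access.

```python
import time


def process_all(bookmarks, classify, delay: float = 0.2):
    """Call classify once per bookmark, sleeping delay seconds between calls."""
    results = []
    for i, bookmark in enumerate(bookmarks):
        if i:  # no need to wait before the first request
            time.sleep(delay)
        results.append(classify(bookmark))
    return results
```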

Raindrop.io Compatibility

  • Exact Format Match: Based on actual Raindrop export
  • Required Fields: id, title, note, excerpt, url, folder, tags, created, cover, highlights, favorite
  • Folder Structure: All nested under "X/" (e.g., "X/ai", "X/devtools")
  • Tag Format: Comma-separated, includes "twitter" tag
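
To make the column layout concrete, the sketch below writes rows with the field names listed above using Python's csv module. The empty values for id, note, cover, and highlights, the "false" default for favorite, and the ", " tag separator are assumptions; to_row and write_csv are illustrative names.

```python
import csv
import io

FIELDS = ["id", "title", "note", "excerpt", "url", "folder", "tags",
          "created", "cover", "highlights", "favorite"]


def to_row(title, text, url, folder, tags, created):
    """Build one CSV row, nesting the folder under X/ and adding "twitter"."""
    all_tags = list(dict.fromkeys([*tags, "twitter"]))  # dedupe, keep order
    return {
        "id": "", "title": title, "note": "", "excerpt": text, "url": url,
        "folder": f"X/{folder}", "tags": ", ".join(all_tags),
        "created": created, "cover": "", "highlights": "", "favorite": "false",
    }


def write_csv(rows, handle):
    """Write the header and rows in the field order above."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to disk, opening the file with newline="" and encoding="utf-8" lets the csv module control line endings.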

Data Processing

  • Text Cleaning: Removes problematic newlines and quotes
  • Timestamp Conversion: ISO 8601 format (2025-07-19T18:34:39.000Z)
  • Tag Optimization: Removes tags with <5 uses
  • Content Preservation: Full text in excerpt field
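
The text cleaning and timestamp steps might look like the following. This is a sketch under assumptions: replacing double quotes with single quotes is one way to handle "problematic quotes", and the timestamp helper expects an already-parsed, timezone-aware datetime because the source timestamp format is not specified here.

```python
import re
from datetime import datetime, timezone


def clean_text(text: str) -> str:
    """Collapse whitespace (including newlines) and swap double quotes for single."""
    return re.sub(r"\s+", " ", text).replace('"', "'").strip()


def to_iso8601(dt: datetime) -> str:
    """Format an aware datetime as UTC with milliseconds and a Z suffix."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
```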

📊 Results

Final Statistics (Twitter Exporter Format)

  • Bookmarks: 1,778
  • Folders: 17 (nested under "X/")
  • Tags (before cleaning): 1,247 unique, 2,669 total uses
  • Tags (after cleaning): 203 unique, 7,393 total uses
  • Average tags per bookmark: 4.2

Top Tags (After Cleaning)

  1. twitter: 1,778 uses
  2. devtools: 840 uses
  3. ai: 588 uses
  4. opensource: 328 uses
  5. GPT: 207 uses

🔧 OpenAI Prompt Strategy

The script sends the model a single prompt that:

  • Analyzes full text content and URL
  • References predefined folder and tag lists (from references/)
  • Suggests most appropriate folder from existing options
  • Selects relevant tags from existing list + new important ones
  • Generates concise, descriptive titles
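
The strategy above could be realized along these lines. The prompt wording, the JSON reply shape, and the names build_prompt and parse_reply are all assumptions for illustration; the actual prompt used by the scripts is not reproduced here.

```python
import json


def build_prompt(text, url, folders, tags):
    """Assemble one prompt asking for title, folder, and tags together."""
    return (
        "Classify this bookmarked tweet.\n"
        f"Text: {text}\nURL: {url}\n"
        f"Allowed folders: {', '.join(folders)}\n"
        f"Known tags: {', '.join(tags)}\n"
        'Reply with JSON only: {"title": "...", "folder": "...", "tags": ["..."]}\n'
        "The title must be a single line of at most 100 characters."
    )


def parse_reply(reply, folders, fallback_folder):
    """Parse the JSON reply, rejecting folders outside the allowed list."""
    data = json.loads(reply)
    folder = data.get("folder")
    if folder not in folders:
        folder = fallback_folder
    return data.get("title", "")[:100], folder, list(data.get("tags", []))
```

Validating the folder against the reference list keeps the model from inventing new folders, while tags are allowed to include new values as described above.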

🚨 Troubleshooting

Common Issues

Script Path Issues

  • Solution: Run scripts from project root directory
  • Alternative: Update hardcoded paths in scripts to use relative paths from script location

Raindrop.io Import Problems

  • Solution: Use cleaned CSV files from outputs/ directory
  • Cause: CSV formatting, newlines in fields, wrong headers

OpenAI API Issues

  • Rate Limits: Script includes 0.2s delays
  • Invalid Key: Use scripts/shared/setup_openai.py or check references/APIKEY.txt
  • Cost Control: Test with small subset first using -t flag

Tag Count Confusion

  • Raindrop.io reports total tag uses, not unique tags
  • Use scripts/shared/analyze_tags.py to understand the breakdown

File Issues

  • Large Files: JSON files can be 700KB-2MB, CSV files 500KB-800KB
  • Encoding: All files use UTF-8
  • Line Endings: Handled by Python CSV writer

💡 Lessons Learned

  1. Single API Call: Combining title + classification saves ~50% on API costs
  2. Exact Format Matching: Raindrop.io is strict about CSV format
  3. Tag Cleanup: Essential for usable taxonomy (1,247 → 203 tags)
  4. Text Cleaning: Critical for CSV compatibility
  5. Rate Limiting: Prevents API throttling
  6. Format Detection: BirdBear format has richer metadata but needs conversion

🔄 Process Evolution

  1. Initial: Started with Twitter Bookmark Exporter format
  2. Manual Enhancement: User added folder and tags fields
  3. V1: Basic conversion with timestamp fixes
  4. V2: Added OpenAI title generation
  5. V3: Added folder/tag reclassification (separate API calls)
  6. V4: Optimized to single API call per bookmark
  7. V5: Multiple CSV format attempts to fix Raindrop.io compatibility
  8. V6: Final format matching Raindrop export structure
  9. V7: Added BirdBear format support with auto-detection
  10. Final: Tag cleanup for better taxonomy

📝 Manual Steps Required

Twitter Bookmark Exporter Format

  1. Export bookmarks using Twitter Bookmark Exporter → sources/twitter_exporter/twitter_bookmarks.json
  2. (Optional) Manually add "folder" and "tags" fields → sources/twitter_exporter/twitter_bookmarks_tagged_full.json
  3. Obtain OpenAI API key
  4. Place key in references/APIKEY.txt or set environment variable
  5. Run conversion script
  6. Clean tags (optional but recommended)
  7. Import final CSV to Raindrop.io

BirdBear Format

  1. Export bookmarks using BirdBear → sources/birdbear/birdbear-export-*.json
  2. Obtain OpenAI API key
  3. Place key in references/APIKEY.txt or set environment variable
  4. Run conversion script (auto-detects format)
  5. Clean tags (optional but recommended)
  6. Import final CSV to Raindrop.io

🎯 Future Improvements

  • Batch API calls for better efficiency
  • Support for other bookmark sources
  • Custom tag taxonomy rules
  • Automated import via Raindrop.io API
  • Progress bars for long operations
  • Update all scripts to use relative paths from project root

📞 Notes for Future Self/Agents

  • The user prefers invoking python3 rather than python
  • OpenAI API key was provided in references/APIKEY.txt due to terminal paste issues
  • Tag cleaning with min 5 uses was crucial for usability
  • Raindrop.io is very strict about CSV format - use exact export structure
  • User values cost optimization (single API call approach)
  • All folders should be nested under "X/"
  • Always add "twitter" tag to all entries
  • Two export formats exist: Twitter Bookmark Exporter (older, July) and BirdBear (newer, August)
  • Use reclassify_with_openai.py for both formats - it auto-detects

Last Updated: November 2025
Total Processing Time: ~15 minutes for 1,778 bookmarks
Estimated API Cost: $5-10 USD

About

for syncing X bookmarks to raindrop.io
