BEST8OY/Auto-Image-Occlusion-Anki-Addon

Auto Image Occlusion - Anki Addon

Automatically detect and occlude text regions in images using Tesseract OCR

Automatically detect text regions in images and create Image Occlusion shapes with a single click. Works seamlessly with Anki's native Image Occlusion feature (Anki 25.09+).

Inspired by: logseq-anki-sync


✨ Features

  • 🪄 One-Click Detection: Auto-detect text regions with a single button click
  • 🎨 Native Integration: Seamlessly integrates with Anki's Image Occlusion toolbar
  • ⌨️ Keyboard Shortcut: Quick access via Ctrl+Shift+A
  • 🧠 Smart Detection: Line-based detection with PSM 12 (sparse text with OSD)
  • 🎯 Collision Detection: Automatically skips existing occlusions
  • 📏 Text Length Filtering: Intelligently filters based on average line length
  • πŸ”§ Configurable: Adjust confidence, size thresholds, and filters
  • 🚀 Persistent UI: Button automatically reappears when selecting new images
  • 🐍 Python Backend: Uses pytesseract for reliable, fast OCR processing

📦 Installation

Prerequisites

1. Anki 25.09 or Later

Ensure you have Anki 25.09+ which includes native Image Occlusion support.

2. Tesseract OCR

Install Tesseract OCR on your system:

Linux:

sudo apt-get install tesseract-ocr

macOS:

brew install tesseract

Windows:

  1. Download installer from GitHub releases
  2. Run installer and note the installation path
  3. Add to PATH: C:\Program Files\Tesseract-OCR

2.1. Additional Language Data (Optional)

Tesseract supports 100+ languages. To use non-English languages, you need to download additional language data files.

Download Language Data:

  • Visit tessdata repository or tessdata_best repository
  • Download .traineddata files for your language(s)
  • Common examples: spa.traineddata (Spanish), fra.traineddata (French), deu.traineddata (German), chi_sim.traineddata (Chinese Simplified)

Install Language Data:

Windows:

Copy .traineddata files to: C:\Program Files\Tesseract-OCR\tessdata\

Linux (Package Install):

# Option 1: Install via package manager
sudo apt-get install tesseract-ocr-spa  # Spanish
sudo apt-get install tesseract-ocr-fra  # French

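# Option 2: Manual install (the tessdata directory depends on the distribution and Tesseract version)
sudo cp *.traineddata /usr/share/tesseract-ocr/tessdata/
# or: /usr/share/tesseract-ocr/<version>/tessdata/ (for example 4.00 or 5), or /usr/share/tessdata/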

macOS (Homebrew):

# Option 1: Install via brew
brew install tesseract-lang  # All languages

# Option 2: Manual install
cp *.traineddata /usr/local/share/tessdata/
# or: /opt/homebrew/share/tessdata/ (Apple Silicon)

Verify Installation:

tesseract --list-langs

Example Config for Spanish:

{
    "tesseract_lang": "spa"
}

Example Config for Multiple Languages:

{
    "tesseract_lang": "eng+spa+fra"
}
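A `tesseract_lang` value is a `+`-separated list of codes, and each code needs a matching entry in the output of `tesseract --list-langs`. The following is a minimal sketch of that check, not code from the addon: `missing_languages` is a hypothetical helper, and the list of installed codes is assumed to have been collected already.

```python
def missing_languages(tesseract_lang, installed):
    """Return the codes in a '+'-separated language string that are not installed."""
    requested = [code.strip() for code in tesseract_lang.split("+") if code.strip()]
    available = set(installed)
    return [code for code in requested if code not in available]


# 'installed' would normally be copied from the output of `tesseract --list-langs`.
installed = ["eng", "osd", "spa"]
print(missing_languages("eng+spa+fra", installed))  # ['fra']
```

An empty result means every requested language has its `.traineddata` file in place.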

Install Addon

Method 1: AnkiWeb (Recommended)

  1. Go to Tools → Add-ons
  2. Click Get Add-ons...
  3. Enter code: 1414192727
  4. Restart Anki

Method 2: Manual Installation

  1. Download or clone this repository
  2. Copy the entire folder to your Anki addons directory:
    • Windows: %APPDATA%\Anki2\addons21\auto_image_occlusion
    • macOS: ~/Library/Application Support/Anki2/addons21/auto_image_occlusion
    • Linux: ~/.local/share/Anki2/addons21/auto_image_occlusion
  3. Restart Anki

🚀 Quick Start

  1. Open Add Cards: In Anki, click Add or press A
  2. Select Image Occlusion: Choose the Image Occlusion note type
  3. Load Your Image: Click the image icon and select your image
  4. Auto-Detect: Click the magic wand button (🪄) or press Ctrl+Shift+A
  5. Wait: OCR processing typically takes 2-10 seconds, depending on image size
  6. Review: Automatically created occlusion boxes appear on text regions
  7. Adjust: Move, resize, or delete boxes as needed
  8. Add: Click "Add" to create your cards

Visual Guide

[Demo video: outputfile.mp4]

⚙️ Configuration

Access Config

Tools → Add-ons → Auto Image Occlusion Detection → Config

Default Configuration

{
    "tesseract_lang": "eng",
    "min_confidence": 48,
    "min_width": 4,
    "min_height": 4,
    "min_area_percent": 0.0001,
    "button_shortcut": "Ctrl+Shift+A",
    "vertical_merge_factor": 0.65
}
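The configuration examples in this README list only the keys being changed. Whether the addon fills in the remaining keys itself is not stated here; one way to get that behaviour is to overlay the user's values on the defaults, as in this sketch. `effective_config` is a hypothetical helper, not part of the addon.

```python
import json

# Defaults copied from the "Default Configuration" block above.
DEFAULTS = {
    "tesseract_lang": "eng",
    "min_confidence": 48,
    "min_width": 4,
    "min_height": 4,
    "min_area_percent": 0.0001,
    "button_shortcut": "Ctrl+Shift+A",
    "vertical_merge_factor": 0.65,
}


def effective_config(user_config):
    """Overlay user values on the defaults, ignoring unknown keys."""
    merged = dict(DEFAULTS)
    merged.update({k: v for k, v in (user_config or {}).items() if k in DEFAULTS})
    return merged


user = json.loads('{"tesseract_lang": "spa", "min_confidence": 60}')
config = effective_config(user)
print(config["tesseract_lang"], config["min_confidence"], config["min_width"])  # spa 60 4
```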

Configuration Options

| Option | Default | Description |
|---|---|---|
| tesseract_lang | "eng" | OCR language code(s). Use "eng+fra" for multiple languages |
| min_confidence | 48 | Minimum OCR confidence (0-100). Lower = more detections |
| min_width | 4 | Minimum box width in pixels |
| min_height | 4 | Minimum box height in pixels |
| min_area_percent | 0.0001 | Minimum box area as a fraction of the image area (0.01 = 1%, so the default is 0.01%) |
| button_shortcut | "Ctrl+Shift+A" | Keyboard shortcut for auto-detection |
| vertical_merge_factor | 0.65 | Merge lines within this factor × average line height (handles multi-line labels). 0 disables merging |
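To make the thresholds concrete, here is a rough sketch of how the four filter options could be applied to one detected box. It is an illustration, not the addon's code: `passes_filters` and the box layout are hypothetical, and `min_area_percent` is treated as a fraction of the image area, following the "0.01 = 1%" note.

```python
def passes_filters(box, image_width, image_height, config):
    """box: dict with 'width', 'height' (pixels) and 'conf' (0-100)."""
    area_fraction = (box["width"] * box["height"]) / float(image_width * image_height)
    return (
        box["conf"] >= config["min_confidence"]
        and box["width"] >= config["min_width"]
        and box["height"] >= config["min_height"]
        and area_fraction >= config["min_area_percent"]
    )


config = {"min_confidence": 48, "min_width": 4, "min_height": 4, "min_area_percent": 0.0001}
boxes = [
    {"width": 120, "height": 20, "conf": 91},  # clear label: kept
    {"width": 3, "height": 3, "conf": 80},     # below min_width and min_height
    {"width": 120, "height": 20, "conf": 30},  # below min_confidence
]
kept = [b for b in boxes if passes_filters(b, 1000, 1000, config)]
print(len(kept))  # 1
```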

Configuration Examples

For Higher Quality (fewer false positives):

{
    "min_confidence": 60,
    "min_area_percent": 0.001
}

For More Detections (catch more text):

{
    "min_confidence": 35,
    "min_area_percent": 0.00005
}

For Non-English Text (e.g., Spanish):

{
    "tesseract_lang": "spa",
    "min_confidence": 48
}

For Mixed Languages (e.g., English + Chinese):

{
    "tesseract_lang": "eng+chi_sim",
    "min_confidence": 40
}

Disable multi-line merging (treat each line separately):

{
    "vertical_merge_factor": 0
}

More aggressive multi-line merging:

{
    "vertical_merge_factor": 2.5
}
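The effect of `vertical_merge_factor` can be pictured with a small sketch. It assumes one possible reading of the option: two lines are merged when the vertical gap between them is at most `factor` times the average line height and they overlap horizontally. `merge_lines` is a hypothetical helper, and the addon's actual rule may differ.

```python
def merge_lines(lines, factor):
    """lines: list of (left, top, right, bottom) boxes. Returns merged boxes."""
    if not lines:
        return []
    avg_height = sum(bottom - top for _, top, _, bottom in lines) / len(lines)
    max_gap = factor * avg_height
    merged = []
    for left, top, right, bottom in sorted(lines, key=lambda box: box[1]):
        if merged and factor > 0:
            m_left, m_top, m_right, m_bottom = merged[-1]
            gap = top - m_bottom
            overlaps_horizontally = left < m_right and right > m_left
            if gap <= max_gap and overlaps_horizontally:
                # Grow the previous box so it covers both lines.
                merged[-1] = (min(m_left, left), m_top,
                              max(m_right, right), max(m_bottom, bottom))
                continue
        merged.append((left, top, right, bottom))
    return merged


lines = [(10, 10, 110, 30), (10, 36, 100, 56), (10, 200, 90, 220)]
print(merge_lines(lines, 0.65))  # [(10, 10, 110, 56), (10, 200, 90, 220)]
print(len(merge_lines(lines, 0)))  # 3
```

For brevity the sketch only compares each line with the most recently merged box, which is enough for a single column of labels.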

🧠 How It Works

Uses Tesseract's PSM 12 (sparse text with OSD) with line-based grouping for reliable detection.

Best for:

  • ✅ Scattered text elements (diagrams, labels)
  • ✅ Dense text documents
  • ✅ Books and articles
  • ✅ Mixed layouts with varied text positioning
  • ✅ Anatomy diagrams
  • ✅ Flowcharts and infographics

Detection Process:

  1. Detects text with Tesseract's PSM 12 (sparse text with OSD)
  2. Groups words by text line for granular detection
  3. Merges vertically adjacent lines
  4. Calculates average text length for intelligent filtering
  5. Filters by confidence threshold (min_confidence, default 48)
  6. Filters by text length (ignores lines shorter than half the average length or 3 characters)
  7. Detects collisions with existing occlusions (backend & frontend)
  8. Creates individual occlusions per text block

Technical Details:

  • Uses PSM 12 (sparse text with OSD) - optimized for finding scattered text
  • Line-based grouping provides reliable granularity
  • Vertical merging handles multi-line labels (e.g., anatomy diagrams)
  • Each text block (single or multi-line) becomes a separate occlusion
  • Collision detection prevents duplicate occlusions

🏗️ Architecture

Module Structure

anki addon/
├── __init__.py                 # Package initialization
├── addon.py                    # Main entry point, registers hooks
├── editor_integration.py       # JavaScript injection logic
├── js_builder.py               # JavaScript code generator
├── message_handler.py          # Python ↔ JavaScript communication
├── ocr_engine.py               # Tesseract OCR wrapper (PSM 12, line-based)
├── config.json                 # Default configuration
├── config.md                   # Configuration documentation
├── manifest.json               # Addon metadata
└── README.md                   # This file

Data Flow

1. User opens IO note
   ↓
2. editor_did_load_note hook fires
   ↓
3. Python injects JavaScript (100ms delay)
   ↓
4. JavaScript initializes:
   - Create window.AutoIOAddon namespace
   - Intercept resetIOImageLoaded()
   - Wait for IO editor (MutationObserver)
   - Add button to toolbar
   ↓
5. User clicks button or presses Ctrl+Shift+A
   ↓
6. JavaScript:
   - Capture image element
   - Convert to base64 DataURL
   - Send via pycmd('autoDetectOCR:...')
   ↓
7. Python:
   - Decode image
   - Run Tesseract OCR (PSM 12 - sparse text with OSD)
   - Group text by lines
   - Calculate average text length
   - Filter by confidence, size, and text length
   - Detect collisions with existing shapes
   - Return non-colliding JSON results
   ↓
8. JavaScript:
   - Transform coordinates (image → canvas)
   - Double-check overlapping regions (safety measure)
   - Create Rectangle shapes
   - Add to maskEditor
   - Redraw canvas
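Steps 6 to 8 hinge on two small pieces: unpacking the base64 data URL on the Python side and scaling boxes from image pixels to canvas units. The sketch below shows both in Python for consistency, although step 8 runs in JavaScript in the addon. It assumes the text after the `autoDetectOCR:` prefix is just the data URL (the README elides the payload) and that the transform is a pure scale with no offset; both function names are hypothetical.

```python
import base64

PREFIX = "autoDetectOCR:"  # command prefix shown in the data flow above


def decode_image_message(message):
    """Return (mime_type, image_bytes), or None if the message has another prefix."""
    if not message.startswith(PREFIX):
        return None
    data_url = message[len(PREFIX):]
    header, _, encoded = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(encoded)


def image_to_canvas(box, image_size, canvas_size):
    """Scale (left, top, width, height) from image pixels to canvas units."""
    scale_x = canvas_size[0] / image_size[0]
    scale_y = canvas_size[1] / image_size[1]
    left, top, width, height = box
    return (left * scale_x, top * scale_y, width * scale_x, height * scale_y)


payload = base64.b64encode(b"\x89PNG").decode("ascii")
print(decode_image_message(PREFIX + "data:image/png;base64," + payload))  # ('image/png', b'\x89PNG')
print(image_to_canvas((100, 50, 200, 40), (2000, 1000), (1000, 500)))  # (50.0, 25.0, 100.0, 20.0)
```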

🔧 Troubleshooting

"Auto-detection failed: OCR timeout"

Symptoms: Error message after clicking button

Solutions:

  1. ✅ Reduce the image size if it is very large (about 1920 px wide is a reasonable target)
  2. ✅ On a slow system, increase the timeout in the JavaScript config
  3. ✅ Verify that Tesseract is installed correctly
  4. ✅ Check the Anki debug console for Python errors
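For the first suggestion, the target size can be computed while keeping the aspect ratio. A small sketch, with `downscaled_size` as a hypothetical helper and 1920 taken from the hint above:

```python
def downscaled_size(width, height, max_width=1920):
    """Return (width, height) scaled down to max_width, preserving aspect ratio."""
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, max(1, round(height * scale))


print(downscaled_size(3840, 2160))  # (1920, 1080)
print(downscaled_size(800, 600))    # (800, 600)
```

The result can then be used in any image editor or passed to an image library's resize call before loading the image into Anki.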

Tesseract Not Found

Symptoms: pytesseract.TesseractNotFoundError

Solutions:

  1. Verify Installation:
    tesseract --version
  2. Add to PATH (Windows):
    • System Properties → Environment Variables
    • Add C:\Program Files\Tesseract-OCR to PATH
    • Restart Anki
  3. Reinstall Tesseract and confirm the installation path shown by the installer
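A quick way to see whether the executable is reachable from Python is to search PATH and then a few known install locations. This is a diagnostic sketch, not part of the addon; `find_tesseract` is a hypothetical helper and the Windows path is the default mentioned in this README.

```python
import os
import shutil


def find_tesseract(extra_candidates=()):
    """Return the path of the tesseract executable, or None if it cannot be found."""
    found = shutil.which("tesseract")  # searches the directories listed in PATH
    if found:
        return found
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


path = find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
print(path or "Tesseract not found; check PATH")
```

If the executable exists but is not on PATH, pytesseract also accepts an explicit location through `pytesseract.pytesseract.tesseract_cmd`.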

No Text Detected

Symptoms: "No text regions detected" message

Solutions:

  1. ✅ Lower min_confidence (try 30-40)
  2. ✅ Lower min_area_percent (try 0.00005)
  3. ✅ Ensure image has clear, readable text
  4. ✅ Check if correct language is set (tesseract_lang)
  5. ✅ Improve image quality/contrast

Poor Detection Accuracy

Symptoms: Too many false positives or missing text

Solutions:

Too many false positives:

  • Increase min_confidence (try 55-65)
  • Increase min_area_percent (try 0.001)

Missing text:

  • Decrease min_confidence (try 35-40)
  • Decrease min_area_percent (try 0.00001)
  • Improve image quality/contrast

🤝 Contributing

Contributions are welcome! To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Test thoroughly in Anki
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

🙏 Credits

Inspiration

  • logseq-anki-sync

Dependencies

  • Tesseract OCR
  • pytesseract

📄 License

GNU AGPL v3+ - Same as Anki's license

This addon is free and open-source. See Anki's license for full details.


Made with ❤️ for the Anki community
