Automatically detect and occlude text regions in images using Tesseract OCR
Automatically detect text regions in images and create Image Occlusion shapes with a single click. Works seamlessly with Anki's native Image Occlusion feature (Anki 25.09+).
Inspired by: logseq-anki-sync
- One-Click Detection: Auto-detect text regions with a single button click
- Native Integration: Seamlessly integrates with Anki's Image Occlusion toolbar
- Keyboard Shortcut: Quick access via Ctrl+Shift+A
- Smart Detection: Line-based detection with PSM 12 (sparse text with OSD)
- Collision Detection: Automatically skips existing occlusions
- Text Length Filtering: Intelligently filters based on average line length
- Configurable: Adjust confidence, size thresholds, and filters
- Persistent UI: Button automatically reappears when selecting new images
- Python Backend: Uses pytesseract for reliable, fast OCR processing
Ensure you have Anki 25.09+, which includes native Image Occlusion support.
Install Tesseract OCR on your system:
Linux:
sudo apt-get install tesseract-ocr
macOS:
brew install tesseract
Windows:
- Download the installer from GitHub releases
- Run the installer and note the installation path
- Add the installation directory to PATH: C:\Program Files\Tesseract-OCR
Tesseract supports 100+ languages. To use non-English languages, you need to download additional language data files.
Download Language Data:
- Visit the tessdata repository or the tessdata_best repository
- Download the .traineddata files for your language(s)
- Common examples: spa.traineddata (Spanish), fra.traineddata (French), deu.traineddata (German), chi_sim.traineddata (Chinese Simplified)
Install Language Data:
Windows:
Copy .traineddata files to: C:\Program Files\Tesseract-OCR\tessdata\
Linux (Package Install):
# Option 1: Install via package manager
sudo apt-get install tesseract-ocr-spa # Spanish
sudo apt-get install tesseract-ocr-fra # French
# Option 2: Manual install
sudo cp *.traineddata /usr/share/tesseract-ocr/tessdata/
# or: /usr/share/tessdata/
macOS (Homebrew):
# Option 1: Install via brew
brew install tesseract-lang # All languages
# Option 2: Manual install
cp *.traineddata /usr/local/share/tessdata/
# or: /opt/homebrew/share/tessdata/ (Apple Silicon)
Verify Installation:
tesseract --list-langs
Example Config for Spanish:
{
  "tesseract_lang": "spa"
}
Example Config for Multiple Languages:
{
  "tesseract_lang": "eng+spa+fra"
}
Method 1: AnkiWeb (Recommended)
- Go to Tools → Add-ons
- Click Get Add-ons...
- Enter code: 1414192727
- Restart Anki
Method 2: Manual Installation
- Download or clone this repository
- Copy the entire folder to your Anki addons directory:
  - Windows: %APPDATA%\Anki2\addons21\auto_image_occlusion
  - macOS: ~/Library/Application Support/Anki2/addons21/auto_image_occlusion
  - Linux: ~/.local/share/Anki2/addons21/auto_image_occlusion
- Restart Anki
- Open Add Cards: In Anki, click Add or press A
- Select Image Occlusion: Choose the Image Occlusion note type
- Load Your Image: Click the image icon and select your image
- Auto-Detect: Click the magic wand button or press Ctrl+Shift+A
- Wait: OCR processing takes 2-10 seconds depending on image size
- Review: Automatically created occlusion boxes appear on text regions
- Adjust: Move, resize, or delete boxes as needed
- Add: Click "Add" to create your cards
Tools → Add-ons → Auto Image Occlusion Detection → Config
{
  "tesseract_lang": "eng",
  "min_confidence": 48,
  "min_width": 4,
  "min_height": 4,
  "min_area_percent": 0.0001,
  "button_shortcut": "Ctrl+Shift+A",
  "vertical_merge_factor": 0.65
}
| Option | Default | Description |
|---|---|---|
| tesseract_lang | "eng" | OCR language code(s). Use "eng+fra" for multiple languages |
| min_confidence | 48 | Minimum OCR confidence (0-100). Lower = more detections |
| min_width | 4 | Minimum box width in pixels |
| min_height | 4 | Minimum box height in pixels |
| min_area_percent | 0.0001 | Minimum box area as a fraction of the image area (0.01 = 1%) |
| button_shortcut | "Ctrl+Shift+A" | Keyboard shortcut for auto-detection |
| vertical_merge_factor | 0.65 | Merge vertically adjacent lines within this multiple of the average line height (default 0.65x; handles multi-line labels) |
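As a rough illustration of how these options fit together, the sketch below merges user overrides onto the documented defaults and checks the ranges the descriptions imply. The `load_config` helper and its validation rules are hypothetical; the add-on itself may handle configuration differently (Anki add-ons typically read their config through `mw.addonManager.getConfig(__name__)`).

```python
import json

# Defaults copied from the options documented above.
DEFAULT_CONFIG = {
    "tesseract_lang": "eng",
    "min_confidence": 48,
    "min_width": 4,
    "min_height": 4,
    "min_area_percent": 0.0001,
    "button_shortcut": "Ctrl+Shift+A",
    "vertical_merge_factor": 0.65,
}


def load_config(user_json: str) -> dict:
    """Merge user overrides onto the defaults and sanity-check ranges.

    Hypothetical helper for illustration; not the add-on's actual code.
    """
    user = json.loads(user_json) if user_json.strip() else {}
    unknown = set(user) - set(DEFAULT_CONFIG)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    config = {**DEFAULT_CONFIG, **user}
    if not 0 <= config["min_confidence"] <= 100:
        raise ValueError("min_confidence must be between 0 and 100")
    if not 0 <= config["min_area_percent"] <= 1:
        raise ValueError("min_area_percent is a fraction between 0 and 1")
    if config["vertical_merge_factor"] < 0:
        raise ValueError("vertical_merge_factor must not be negative")
    return config
```

Calling `load_config('{"min_confidence": 60}')` would return the defaults with only `min_confidence` changed, which mirrors how the partial config snippets in this README are meant to be read.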
For Higher Quality (fewer false positives):
{
  "min_confidence": 60,
  "min_area_percent": 0.001
}
For More Detections (catch more text):
{
  "min_confidence": 35,
  "min_area_percent": 0.00005
}
For Non-English Text (e.g., Spanish):
{
  "tesseract_lang": "spa",
  "min_confidence": 48
}
For Mixed Languages (e.g., English + Chinese):
{
  "tesseract_lang": "eng+chi_sim",
  "min_confidence": 40
}
Disable multi-line merging (treat each line separately):
{
  "vertical_merge_factor": 0
}
More aggressive multi-line merging:
{
  "vertical_merge_factor": 2.5
}
Uses Tesseract's PSM 12 (sparse text with OSD) with line-based grouping for reliable detection.
Best for:
- Scattered text elements (diagrams, labels)
- Dense text documents
- Books and articles
- Mixed layouts with varied text positioning
- Anatomy diagrams
- Flowcharts and infographics
Detection Process:
- Detects text using sparse text detection (PSM 12, sparse text with OSD)
- Groups words by text line for granular detection
- Merges vertically adjacent lines
- Calculates average text length for intelligent filtering
- Filters by confidence threshold (min_confidence, default 48)
- Filters by text length (ignores lines shorter than avg/2 or 3 chars)
- Detects collisions with existing occlusions (backend & frontend)
- Creates individual occlusions per text block
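The grouping and filtering steps can be sketched in plain Python. This is a minimal illustration, not the add-on's `ocr_engine.py`: the function names are hypothetical, the per-line confidence is assumed to be the mean of the word confidences, and the length rule is one reading of "shorter than avg/2 or 3 chars". The input is assumed to have the shape that `pytesseract.image_to_data` returns with `output_type=Output.DICT` (parallel lists keyed by column name).

```python
from collections import defaultdict


def group_words_by_line(data: dict) -> list[dict]:
    """Group word-level OCR rows into one bounding box per text line."""
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue  # structural rows (page/block/line) carry no text
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append(i)

    results = []
    for indices in lines.values():
        left = min(data["left"][i] for i in indices)
        top = min(data["top"][i] for i in indices)
        right = max(data["left"][i] + data["width"][i] for i in indices)
        bottom = max(data["top"][i] + data["height"][i] for i in indices)
        confs = [float(data["conf"][i]) for i in indices]
        results.append({
            "text": " ".join(data["text"][i] for i in indices),
            "left": left,
            "top": top,
            "width": right - left,
            "height": bottom - top,
            "conf": sum(confs) / len(confs),  # assumption: mean word confidence
        })
    return results


def filter_lines(lines: list[dict], min_confidence: float = 48) -> list[dict]:
    """Drop low-confidence lines and lines short relative to the average."""
    if not lines:
        return []
    avg_len = sum(len(line["text"]) for line in lines) / len(lines)
    min_len = max(avg_len / 2, 3)  # assumption: both limits apply
    return [
        line for line in lines
        if line["conf"] >= min_confidence and len(line["text"]) >= min_len
    ]
```

With pytesseract installed, the input dictionary could come from a call such as `pytesseract.image_to_data(image, lang="eng", config="--psm 12", output_type=pytesseract.Output.DICT)`.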
Technical Details:
- Uses PSM 12 (sparse text with OSD) - optimized for finding scattered text
- Line-based grouping provides reliable granularity
- Vertical merging handles multi-line labels (e.g., anatomy diagrams)
- Each text block (single or multi-line) becomes a separate occlusion
- Collision detection prevents duplicate occlusions
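Vertical merging and collision detection both reduce to simple rectangle arithmetic. The sketch below is one plausible way to do it, under stated assumptions: two boxes are merged when they overlap horizontally and the vertical gap between them is at most `vertical_merge_factor` times the average box height, and a collision is any axis-aligned overlap. The function names and the exact merge rule are illustrative, not taken from the add-on's source.

```python
def merge_vertically(boxes: list[dict], factor: float = 0.65) -> list[dict]:
    """Greedily merge boxes that overlap horizontally and sit close vertically.

    Each box is a dict with left/top/width/height in pixels.
    A factor of 0 disables merging, matching the config description.
    """
    if factor <= 0 or not boxes:
        return [dict(b) for b in boxes]
    avg_height = sum(b["height"] for b in boxes) / len(boxes)
    max_gap = factor * avg_height
    merged = []
    for box in sorted(boxes, key=lambda b: b["top"]):
        for m in merged:
            gap = box["top"] - (m["top"] + m["height"])
            overlaps_x = (box["left"] < m["left"] + m["width"]
                          and m["left"] < box["left"] + box["width"])
            if overlaps_x and gap <= max_gap:
                # Grow the existing box to cover both rectangles.
                right = max(m["left"] + m["width"], box["left"] + box["width"])
                bottom = max(m["top"] + m["height"], box["top"] + box["height"])
                m["left"] = min(m["left"], box["left"])
                m["top"] = min(m["top"], box["top"])
                m["width"] = right - m["left"]
                m["height"] = bottom - m["top"]
                break
        else:
            merged.append(dict(box))
    return merged


def collides(a: dict, b: dict) -> bool:
    """Axis-aligned rectangle overlap test."""
    return (a["left"] < b["left"] + b["width"]
            and b["left"] < a["left"] + a["width"]
            and a["top"] < b["top"] + b["height"]
            and b["top"] < a["top"] + a["height"])


def skip_existing(candidates: list[dict], existing: list[dict]) -> list[dict]:
    """Keep only candidates that do not overlap any existing occlusion."""
    return [c for c in candidates if not any(collides(c, e) for e in existing)]
```

A two-line label whose lines are 5 px apart with 20 px line height would merge at the default factor (5 <= 0.65 * 20), while a label 145 px further down would remain a separate occlusion.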
anki addon/
├── __init__.py              # Package initialization
├── addon.py                 # Main entry point, registers hooks
├── editor_integration.py    # JavaScript injection logic
├── js_builder.py            # JavaScript code generator
├── message_handler.py       # Python ↔ JavaScript communication
├── ocr_engine.py            # Tesseract OCR wrapper (PSM 12, line-based)
├── config.json              # Default configuration
├── config.md                # Configuration documentation
├── manifest.json            # Addon metadata
└── README.md                # This file
1. User opens IO note
   ↓
2. editor_did_load_note hook fires
   ↓
3. Python injects JavaScript (100ms delay)
   ↓
4. JavaScript initializes:
   - Create window.AutoIOAddon namespace
   - Intercept resetIOImageLoaded()
   - Wait for IO editor (MutationObserver)
   - Add button to toolbar
   ↓
5. User clicks button or presses Ctrl+Shift+A
   ↓
6. JavaScript:
   - Capture image element
   - Convert to base64 DataURL
   - Send via pycmd('autoDetectOCR:...')
   ↓
7. Python:
   - Decode image
   - Run Tesseract OCR (PSM 12 - sparse text with OSD)
   - Group text by lines
   - Calculate average text length
   - Filter by confidence, size, and text length
   - Detect collisions with existing shapes
   - Return non-colliding JSON results
   ↓
8. JavaScript:
   - Transform coordinates (image → canvas)
   - Double-check overlapping regions (safety measure)
   - Create Rectangle shapes
   - Add to maskEditor
   - Redraw canvas
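The Python side of steps 6 and 7 starts by recognizing the `autoDetectOCR:` message and decoding the image. The sketch below shows that part only, assuming the payload after the prefix is a plain base64 data URL (the README elides the exact payload format). The handler is shaped like a callback for Anki's `webview_did_receive_js_message` hook, but its body is a placeholder rather than the add-on's real logic.

```python
import base64


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


def parse_message(message: str, prefix: str = "autoDetectOCR:"):
    """Return the decoded image if the pycmd message is ours, else None."""
    if not message.startswith(prefix):
        return None
    return decode_data_url(message[len(prefix):])


def on_js_message(handled, message, context):
    """Callback in the shape of Anki's webview_did_receive_js_message hook."""
    image = parse_message(message)
    if image is None:
        return handled  # not ours; let other handlers see it
    mime_type, raw = image
    # Placeholder: a real handler would run OCR on `raw` here.
    return (True, {"mime": mime_type, "bytes": len(raw)})
```

Inside Anki, such a handler would be registered with `gui_hooks.webview_did_receive_js_message.append(on_js_message)`, and the raw bytes could be opened with Pillow via `Image.open(io.BytesIO(raw))` before being passed to pytesseract.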
Symptoms: Error message after clicking button
Solutions:
- Image is too large: reduce it to ~1920px width
- System is slow: increase the timeout in the JavaScript config
- Tesseract is not installed properly: verify with tesseract --version
- Check the Anki debug console for Python errors
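For the "image is too large" case, the target dimensions can be computed before resizing. This is a small sketch with a hypothetical helper name; the 1920px limit is the value suggested above, not a hard limit of the add-on.

```python
def fit_width(width: int, height: int, max_width: int = 1920) -> tuple[int, int]:
    """Return dimensions scaled down to max_width, preserving aspect ratio."""
    if width <= max_width:
        return width, height  # already small enough; never upscale
    scale = max_width / width
    return max_width, max(1, round(height * scale))
```

With Pillow, the result could be applied as `image.resize(fit_width(*image.size))` before saving the image and loading it into the Image Occlusion editor.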
Symptoms: pytesseract.TesseractNotFoundError
Solutions:
- Verify Installation:
tesseract --version
- Add to PATH (Windows):
  - System Properties → Environment Variables
  - Add C:\Program Files\Tesseract-OCR to PATH
  - Restart Anki
- Reinstall Tesseract and verify during installation
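To check from Python whether the executable can be found at all, a short script like the following can help. It is a sketch: the fallback paths are common install locations assumed for illustration, and `find_tesseract` is a hypothetical helper, not part of the add-on.

```python
import os
import shutil

# Common install locations (assumed, not exhaustive); adjust for your system.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
    "/usr/local/bin/tesseract",
    "/opt/homebrew/bin/tesseract",
]


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Return the path of the tesseract executable, or None if not found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path  # PATH is configured correctly
    for path in candidates:
        if os.path.isfile(path):
            return path  # installed, but not on PATH
    return None
```

If the function returns a path that is not on PATH, pytesseract can be pointed at it directly by setting `pytesseract.pytesseract.tesseract_cmd` to that path. A `None` result suggests Tesseract is not installed in any of the checked locations.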
Symptoms: "No text regions detected" message
Solutions:
- Lower min_confidence (try 30-40)
- Lower min_area_percent (try 0.00005)
- Ensure image has clear, readable text
- Check if correct language is set (tesseract_lang)
- Improve image quality/contrast
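One quick way to rule out a language problem is to compare the `tesseract_lang` setting against what `tesseract --list-langs` reports. The helpers below are hypothetical and only sketch that comparison; the parser assumes the usual output format of a header line followed by one language code per line.

```python
import subprocess


def parse_list_langs(output: str) -> set[str]:
    """Parse the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, e.g. 'List of available languages ... (3):'.
    return {line for line in lines if not line.lower().startswith("list of")}


def missing_languages(tesseract_lang: str, installed: set[str]) -> list[str]:
    """Return the codes from a setting like 'eng+spa' that are not installed."""
    return [code for code in tesseract_lang.split("+")
            if code and code not in installed]


def installed_languages() -> set[str]:
    """Ask the tesseract CLI which languages it has (requires tesseract on PATH)."""
    proc = subprocess.run(["tesseract", "--list-langs"],
                          capture_output=True, text=True, check=False)
    # Some versions print the list to stderr, so both streams are read.
    return parse_list_langs(proc.stdout + proc.stderr)
```

For example, `missing_languages("eng+spa+fra", installed_languages())` returning `["fra"]` would mean the French language data still needs to be installed.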
Symptoms: Too many false positives or missing text
Solutions:
Too many false positives:
- Increase min_confidence (try 55-65)
- Increase min_area_percent (try 0.001)
Missing text:
- Decrease min_confidence (try 35-40)
- Decrease min_area_percent (try 0.00001)
- Improve image quality/contrast
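To get a feel for what a given min_area_percent means in pixels, the size filter can be written out explicitly. This sketch assumes the value is a fraction of the image area (as the "0.01 = 1%" note in the option description suggests) and that the comparisons are inclusive; the function name is hypothetical.

```python
def passes_size_filter(box: dict, image_width: int, image_height: int,
                       min_width: int = 4, min_height: int = 4,
                       min_area_percent: float = 0.0001) -> bool:
    """True if the box is large enough to keep.

    The minimum area in pixels is min_area_percent * image_width * image_height.
    """
    min_area = min_area_percent * image_width * image_height
    return (box["width"] >= min_width
            and box["height"] >= min_height
            and box["width"] * box["height"] >= min_area)
```

On a 1920x1080 image the default of 0.0001 corresponds to roughly 207 square pixels, so a 20x10 box (200 square pixels) would be dropped, while lowering the value to 0.00005 (about 104 square pixels) would keep it.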
Contributions are welcome! To contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Test thoroughly in Anki
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- logseq-anki-sync - Original auto-detection concept
- Tesseract OCR - Text detection engine
- pytesseract - Python wrapper for Tesseract
- Pillow - Image processing library
- Magic wand icon from Material Design Icons (mdiAutoFix)
GNU AGPL v3+ - Same as Anki's license
This addon is free and open-source. See Anki's license for full details.
Made with ❤️ for the Anki community