Zotero 7 plugin that converts selected PDF/image attachments to Markdown using the Mistral OCR API and exports .md files to a chosen directory.
- One-click OCR to Markdown for PDFs and images
- Batch processing with progress and cancel
- Page selection (all, first N, range, list)
- Filename templating with Better BibTeX citekey fallback
- Optional image extraction placeholders
- Local cache for repeated runs
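
The four page selection modes listed above can be pictured with a small Python sketch. This is not the plugin's code: the function name `select_pages`, the mode strings, and the choice of zero-based indices (matching the `--pages "[0,1,2]"` form used by the standalone script) are assumptions made for illustration.

```python
def select_pages(mode, total_pages, value=None):
    """Return sorted, zero-based page indices for a selection mode.

    Hypothetical helper; the mode names below are illustrative only.
    """
    if mode == "all":
        pages = range(total_pages)
    elif mode == "first":
        # value is N: keep the first N pages, capped at the document length
        pages = range(min(int(value), total_pages))
    elif mode == "range":
        # value is an inclusive (start, end) pair
        start, end = value
        pages = range(start, end + 1)
    elif mode == "list":
        # value is an explicit iterable of page indices
        pages = value
    else:
        raise ValueError(f"unknown page selection mode: {mode!r}")
    # Drop duplicates and indices outside the document
    return sorted({p for p in pages if 0 <= p < total_pages})
```

For example, `select_pages("range", 10, (2, 4))` yields pages 2 through 4, and out-of-range entries in a list are silently dropped rather than sent to the API.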
- Zotero 7.x
- Mistral API key (`mistral-ocr-latest`)
- In Zotero: Tools -> Plugins
- Gear icon -> Install Add-on From File...
- Select `dist/ZotPDF2md-0.1.0.xpi`
- Restart Zotero
- Create a file in your Zotero profile extensions folder named `zotpdf2md@zotero.org`
- File contents should be the absolute path to this repo, e.g.: `/Users/you/workspace/ZotPDF2md`
- Restart Zotero
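
The pointer file for a development install can also be written with a few lines of Python. This is a minimal sketch, assuming the extensions folder is the `extensions` subdirectory of the profile directory you pass in; `write_extension_pointer` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path


def write_extension_pointer(profile_dir, repo_dir, addon_id="zotpdf2md@zotero.org"):
    """Create <profile_dir>/extensions/<addon_id> containing the repo's absolute path."""
    # Assumption: the extensions folder lives directly under the profile directory
    extensions = Path(profile_dir) / "extensions"
    extensions.mkdir(parents=True, exist_ok=True)
    pointer = extensions / addon_id
    # The file holds nothing but the absolute path to the repo
    pointer.write_text(str(Path(repo_dir).resolve()), encoding="utf-8")
    return pointer
```

Call it with your own profile path and the repo checkout, e.g. `write_extension_pointer("/path/to/profile", "/Users/you/workspace/ZotPDF2md")`, then restart Zotero.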
Open Zotero Preferences -> ZotPDF2md:
- Set your Mistral API key
- Choose an export directory
- Adjust page selection and filename template as needed
- Select one or more PDF/image attachments
- Right-click -> OCR to Markdown (Mistral)
- Markdown files are written to the export directory
- `{attachmentBasename}`
- `{attachmentFilename}`
- `{itemTitle}`
- `{year}`
- `{citekey}` (Better BibTeX if available, else item key)
- `{itemKey}`
- `{attachmentKey}`
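
To show how a filename template with these variables might expand, here is a hedged Python sketch. The function `render_filename`, the character sanitizing rule, and the `.md` suffix handling are assumptions; only the variable names and the citekey fallback to the item key come from this README.

```python
import re


def render_filename(template, fields):
    """Expand {name} placeholders and return a Markdown filename.

    Hypothetical helper; not the plugin's actual implementation.
    """
    values = dict(fields)
    # Fallback described above: use the item key when no citekey is available
    if not values.get("citekey"):
        values["citekey"] = values.get("itemKey", "")

    def substitute(match):
        # Unknown placeholders are left untouched
        return str(values.get(match.group(1), match.group(0)))

    name = re.sub(r"\{(\w+)\}", substitute, template)
    # Assumption: replace characters that are unsafe in filenames on common systems
    name = re.sub(r'[\\/:*?"<>|]+', "_", name).strip()
    return name + ".md"
```

With the template `{citekey}_{year}` and an item that has no Better BibTeX citekey, the item key takes its place in the resulting filename.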
- Pages are joined without page headers
- Export paths are validated before processing
- The API key is stored in Zotero preferences (not in this repo)
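
The README does not say which checks export path validation performs. As one plausible reading, the sketch below rejects empty or relative paths, paths that are not existing directories, and directories that are not writable. The function `validate_export_dir` and every check in it are assumptions.

```python
import os
from pathlib import Path


def validate_export_dir(path):
    """Return the export directory as a Path, or raise ValueError.

    Hypothetical checks; the plugin may validate differently.
    """
    if not path or not str(path).strip():
        raise ValueError("export directory is not set")
    directory = Path(path).expanduser()
    if not directory.is_absolute():
        raise ValueError(f"export directory must be an absolute path: {directory}")
    if not directory.is_dir():
        raise ValueError(f"export directory does not exist: {directory}")
    if not os.access(directory, os.W_OK):
        raise ValueError(f"export directory is not writable: {directory}")
    return directory
```

Failing early like this means a bad path is reported before any OCR request is sent.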
There is a standalone script to test OCR directly:
MISTRAL_API_KEY=... python3 scripts/mistral_ocr.py "/path/to/file.pdf"
Optional pages:
MISTRAL_API_KEY=... python3 scripts/mistral_ocr.py "/path/to/file.pdf" --pages "[0,1,2]"
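
For orientation, the following Python sketch shows how a request body for the Mistral OCR endpoint could be assembled from a local PDF and a `--pages` style argument, and how the returned pages could be joined without page headers. It is not the contents of `scripts/mistral_ocr.py`. The payload layout (a `document` object carrying a base64 data URI, an optional `pages` list) and the response layout (a `pages` list whose entries have `index` and `markdown`) reflect the Mistral OCR API as I understand it and should be checked against the official documentation; the function names and the blank-line separator are assumptions.

```python
import base64
import json


def build_ocr_payload(pdf_bytes, pages_arg=None, model="mistral-ocr-latest"):
    """Build a JSON-serializable request body (assumed shape, see note above)."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    payload = {
        "model": model,
        "document": {
            "type": "document_url",
            "document_url": f"data:application/pdf;base64,{encoded}",
        },
    }
    if pages_arg:
        # The --pages value is a JSON list of zero-based indices, e.g. "[0,1,2]"
        pages = json.loads(pages_arg)
        if not isinstance(pages, list) or not all(type(p) is int for p in pages):
            raise ValueError("--pages must be a JSON list of integers")
        payload["pages"] = pages
    return payload


def join_pages(response):
    """Concatenate page Markdown in page order, with no page headers."""
    pages = sorted(response.get("pages", []), key=lambda p: p.get("index", 0))
    # Assumption: pages are separated by a single blank line
    return "\n\n".join(p.get("markdown", "") for p in pages)
```

The payload would be sent as JSON with the API key in an `Authorization: Bearer` header; the network call is left out here so the sketch stays runnable offline.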