ZoteroTransnator - PDF Intelligent Translation Plugin

A powerful Zotero plugin that automatically recognizes PDF content and generates bilingual notes

🚀 Development Status

Current Version: v0.1.0 (Alpha) | Progress: 85% | Core Features Complete

Completed:

  • Complete OCR recognition system (PaddleOCR-VL integration)
  • Bilingual translation services (DeepL + OpenAI)
  • Action strategy pattern architecture
  • Content processing pipeline
  • Bilingual note generation
  • Configuration management system
  • UI components (menus, dialogs, progress bars)
  • Cost estimation feature
  • Batch processing support
  • Complete preferences interface

🚧 In Progress:

  • Unit and integration tests
  • Error handling optimization
  • Performance optimization and caching

📋 Planned:

  • Multi-language translation support (Japanese, Korean, etc.)
  • Custom template system
  • Cloud configuration sync

✨ Features

  • 🔍 Smart OCR Recognition - Uses Baidu PaddleOCR-VL to recognize PDF content, supports text, headings, formulas, tables, images, and more
  • 🌐 Multi-Service Translation - Supports DeepL and OpenAI translation services with flexible switching
  • 📝 Bilingual Notes - Automatically generates bilingual Markdown notes with original text + translation, preserving formatting
  • 🧮 Formula Explanation - Uses LLM to explain mathematical formulas (optional)
  • 💰 Cost Estimation - Displays estimated costs before processing to avoid unexpected expenses
  • 📊 Batch Processing - Supports processing multiple PDF files at once
  • ⚙️ Flexible Configuration - Rich configuration options to meet different needs
  • 🌍 Multi-language Support - UI supports Chinese and English switching

🎥 Demo

Demo videos and screenshots coming soon

📦 Installation

Method 1: Install from Release (Recommended)

  1. Download the latest .xpi file from Releases
  2. Open Zotero → Tools → Add-ons
  3. Click the gear icon in the top right → Install Add-on From File
  4. Select the downloaded .xpi file

Method 2: Build from Source

# Clone the repository
git clone https://github.com/leauny/ZoteroTransnator.git
cd ZoteroTransnator

# Install dependencies
npm install

# Build the plugin
npm run build

# Find the generated .xpi file in the build directory

🔧 Configuration

1. OCR Service Configuration

The plugin uses the Baidu PaddleOCR-VL service, so you need to apply for API access first:

  1. Visit Baidu AI Cloud
  2. Enable the "Document Understanding" service
  3. Obtain API URL and Token
  4. Fill in the plugin settings

Example Configuration:

API URL: https://aip.baidubce.com/rpc/2.0/ai_custom/v1/doc_convert
API Token: 24.xxx...

2. Translation Service Configuration

Option A: Use DeepL (Recommended for document translation)

  1. Visit DeepL API
  2. Register and obtain API Key
  3. Select "DeepL" in plugin settings
  4. Fill in API Key
  5. Choose whether to use free API

Cost: Free version 500,000 characters/month, paid version $20/million characters

Option B: Use OpenAI (Supports formula explanation)

  1. Visit OpenAI or compatible API providers
  2. Obtain API Key
  3. Select "OpenAI" in plugin settings
  4. Fill in API Endpoint and Key
  5. Select model (e.g., gpt-4, gpt-3.5-turbo)

Cost: Billed per token, refer to provider pricing

3. Advanced Options

  • Translate Tables: Whether to translate table content (may incur higher costs)
  • Explain Formulas: Use LLM to explain mathematical formula meanings (requires OpenAI)
  • Skip Formulas: Skip formulas without processing
  • Batch Size: Number of content blocks per batch (affects speed and stability)
  • Timeout: API request timeout
  • Max Concurrency: Number of simultaneous API requests
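To illustrate how Batch Size, Timeout, and Max Concurrency interact, here is a minimal sketch. The `ProcessingOptions` field names and the helper functions are assumptions made for this example and are not taken from the plugin's source; it also assumes `setTimeout` is available in the runtime.

```typescript
// Hypothetical option names; the plugin's real preference keys may differ.
interface ProcessingOptions {
  batchSize: number; // content blocks per batch
  timeoutMs: number; // per-request timeout in milliseconds
  maxConcurrency: number; // simultaneous API requests
}

// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Run `task` over every batch with at most `maxConcurrency` requests in
// flight, keeping the results in the same order as the batches.
async function processBatches<T, R>(
  items: T[],
  options: ProcessingOptions,
  task: (batch: T[]) => Promise<R>,
): Promise<R[]> {
  const batches = chunk(items, options.batchSize);
  const results: R[] = new Array(batches.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < batches.length) {
      const index = next++;
      results[index] = await withTimeout(
        task(batches[index]),
        options.timeoutMs,
      );
    }
  };
  const workerCount = Math.max(
    1,
    Math.min(options.maxConcurrency, batches.length),
  );
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Smaller batches mean more requests but less work lost when one of them fails or times out. A higher concurrency finishes sooner but is more likely to hit provider rate limits.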

4. UI Options

  • Show Cost Estimate: Display estimated cost before processing
  • Confirm Before Process: Show confirmation dialog

📖 Usage

Process Single PDF

  1. Select a PDF attachment in your Zotero library
  2. Right-click and select "Convert and Translate PDF"
  3. Review the cost estimate (if enabled) and confirm
  4. Wait for processing to complete
  5. The generated bilingual note opens automatically in the sidebar

Batch Processing

  1. Hold Ctrl/Cmd and select multiple PDF attachments
  2. Right-click and select "Batch Convert and Translate"
  3. Confirm batch processing
  4. Monitor progress dialog
  5. Review statistics after completion

View Generated Notes

Example of generated note format:

# Introduction

> Introduction

This paper presents a novel approach...

> This paper presents a novel method...

## Background

_Background_

### Mathematical Formula

$$E = mc^2$$

> This is Einstein's mass-energy equation, expressing that energy equals mass times the speed of light squared...
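
The note layout above (original content followed by its translation or explanation, shown as a blockquote or in italics) could be produced roughly as follows. This is a sketch only: `RenderedPair`, `NoteFormat`, and both functions are hypothetical names for illustration, and the plugin's actual note generator may work differently.

```typescript
// Hypothetical shapes; the plugin's real types may differ.
type NoteFormat = "quote" | "italic";

interface RenderedPair {
  original: string; // original Markdown (heading, paragraph, or formula)
  processed?: string; // translation or explanation; absent if kept as-is
  format: NoteFormat;
}

// Wrap the translated text in the Markdown syntax for the chosen format.
function formatProcessed(text: string, format: NoteFormat): string {
  if (format === "italic") {
    return `_${text}_`;
  }
  // Prefix each line so a multi-line translation stays in one blockquote.
  return text
    .split("\n")
    .map((line) => `> ${line}`)
    .join("\n");
}

// Emit each original block followed by its processed counterpart,
// separated by blank lines.
function renderBilingualNote(pairs: RenderedPair[]): string {
  const sections: string[] = [];
  for (const pair of pairs) {
    sections.push(pair.original);
    if (pair.processed !== undefined && pair.processed.trim() !== "") {
      sections.push(formatProcessed(pair.processed, pair.format));
    }
  }
  return sections.join("\n\n") + "\n";
}
```

Blocks without a `processed` value, such as a formula when formula explanation is off, are emitted once in their original form.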

🏗️ Architecture Design

Core Modules

ZoteroTransnator/
├── OCR Service          - PDF recognition service
├── Translation Service  - Translation service (DeepL/OpenAI)
├── Action Pattern       - Content processing strategies
│   ├── TranslateAction
│   ├── ExplainFormulaAction
│   └── KeepOriginalAction
├── Content Processor    - Content processing coordinator
├── Note Generator       - Note generator
└── UI Components        - User interface
    ├── Context Menus
    ├── Progress Dialog
    └── Cost Estimate Dialog

Data Flow

PDF File
   ↓
[OCR Recognition] → ContentBlock[]
   ↓
[Action Processing] → ProcessResult[]
   ↓
[Markdown Generation] → Bilingual Text
   ↓
[HTML Conversion] → Formatted Content
   ↓
[Create Note] → Zotero Note
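
The data flow above can be read as a chain of stages. The sketch below wires such stages together through an interface so each one can be stubbed in tests. Every name here (`PipelineStages`, `runPipeline`, the simplified `ContentBlock` and `ProcessResult` fields) is an assumption for illustration and not the plugin's real API.

```typescript
// Simplified, hypothetical versions of the types named in the diagram.
interface ContentBlock {
  type: string;
  text: string;
}
interface ProcessResult {
  success: boolean;
  processed: string;
  cost: number;
}

// One function per stage of the data flow.
interface PipelineStages {
  recognize(pdfPath: string): Promise<ContentBlock[]>; // OCR Recognition
  process(block: ContentBlock): Promise<ProcessResult>; // Action Processing
  toMarkdown(blocks: ContentBlock[], results: ProcessResult[]): string;
  toHtml(markdown: string): string; // HTML Conversion
  createNote(html: string): Promise<string>; // Create Note, returns a note id
}

interface PipelineOutcome {
  noteId: string;
  totalCost: number;
  failedBlocks: number;
}

// Run the stages in order and summarize cost and failures.
async function runPipeline(
  pdfPath: string,
  stages: PipelineStages,
): Promise<PipelineOutcome> {
  const blocks = await stages.recognize(pdfPath);
  const results: ProcessResult[] = [];
  for (const block of blocks) {
    results.push(await stages.process(block));
  }
  const markdown = stages.toMarkdown(blocks, results);
  const noteId = await stages.createNote(stages.toHtml(markdown));
  return {
    noteId,
    totalCost: results.reduce((sum, r) => sum + r.cost, 0),
    failedBlocks: results.filter((r) => !r.success).length,
  };
}
```

Blocks are processed one at a time here to keep the sketch short. Batching and concurrency would be layered on the Action Processing stage.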

🛠️ Development

Requirements

  • Node.js 16+
  • npm 8+
  • Zotero 7.x (recommended)

Development Commands

# Install dependencies
npm install

# Development mode (auto hot reload)
npm run start

# Build production version
npm run build

# Code check and formatting
npm run lint:check
npm run lint:fix

# Run tests
npm run test

# Release version
npm run release

Project Structure

src/
├── types/              # TypeScript type definitions
│   ├── content.d.ts   # Content blocks and processing context
│   ├── action.d.ts    # Action interfaces
│   ├── translation.d.ts  # Translation service interfaces
│   └── ocr.d.ts       # OCR types
├── utils/              # Utility functions
│   ├── apiClient.ts   # HTTP client
│   ├── configManager.ts  # Configuration manager
│   └── prefs.ts       # Preferences
├── modules/            # Core modules
│   ├── actions/       # Action strategy implementations
│   │   ├── base.ts
│   │   ├── translateAction.ts
│   │   ├── explainFormulaAction.ts
│   │   └── keepOriginalAction.ts
│   ├── translationService/  # Translation services
│   │   ├── deeplService.ts
│   │   ├── openaiService.ts
│   │   └── index.ts (factory)
│   ├── ui/            # UI components
│   │   ├── dialogs.ts
│   │   └── menuItems.ts
│   ├── ocrService.ts      # OCR service
│   ├── contentProcessor.ts  # Content processor
│   ├── noteGenerator.ts     # Note generator
│   └── pdfProcessor.ts      # PDF processing main flow
└── hooks.ts           # Plugin lifecycle hooks

addon/
├── content/           # XUL interface files
│   └── preferences.xhtml  # Preferences interface
└── locale/            # Localization files
    ├── en-US/
    └── zh-CN/

typings/              # Global type definitions
doc/                  # Detailed documentation

Core Architecture

Action Strategy Pattern:

ContentType → ActionFactory → IContentAction → ProcessResult

Service Abstraction Layer:

ITranslationService
├── DeepLTranslationService
└── OpenAITranslationService

Processing Flow:

PDF → OCR Recognition → Content Chunking → Action Processing → Note Generation → Zotero Storage

Adding New Actions

// 1. Create new Action class
export class MyCustomAction extends BaseAction {
  async process(
    block: ContentBlock,
    context: ProcessContext,
  ): Promise<ProcessResult> {
    // Implement processing logic
    return {
      success: true,
      processed: "Processed content",
      format: "quote",
      cost: 0,
    };
  }
}

// 2. Register in src/modules/actions/index.ts
export function registerDefaultActions(): void {
  factory.registerAction("custom-type", MyCustomAction);
}
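
For context, a registry behind a call like `factory.registerAction(...)` might look like the sketch below. The README does not show the real `ActionFactory`, so the class body, the fallback behavior, and the simplified type definitions are all assumptions.

```typescript
// Simplified, hypothetical versions of the types used in the snippet above.
interface ContentBlock {
  type: string;
  text: string;
}
interface ProcessContext {
  targetLanguage: string;
}
interface ProcessResult {
  success: boolean;
  processed: string;
  format: string;
  cost: number;
}

interface IContentAction {
  process(block: ContentBlock, context: ProcessContext): Promise<ProcessResult>;
}
type ActionConstructor = new () => IContentAction;

// Default strategy: pass the block through unchanged at no cost.
class KeepOriginalAction implements IContentAction {
  async process(block: ContentBlock): Promise<ProcessResult> {
    return { success: true, processed: block.text, format: "quote", cost: 0 };
  }
}

class ActionFactory {
  private readonly registry = new Map<string, ActionConstructor>();
  private readonly fallback: ActionConstructor;

  constructor(fallback: ActionConstructor) {
    this.fallback = fallback;
  }

  // Associate a content type (for example "custom-type") with an action class.
  registerAction(contentType: string, action: ActionConstructor): void {
    this.registry.set(contentType, action);
  }

  // Content types without a registered action use the fallback.
  createAction(contentType: string): IContentAction {
    const Action = this.registry.get(contentType) ?? this.fallback;
    return new Action();
  }
}
```

Keying the registry by content type lets the content processor stay unaware of which concrete action handles a block.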

Adding New Translation Services

// 1. Implement the ITranslationService interface
export class MyTranslationService implements ITranslationService {
  async translate(
    texts: string[],
    options: TranslationOptions,
  ): Promise<string[]> {
    // Implement translation logic
    throw new Error("translate() is not implemented yet");
  }

  async estimateCost(textCount: number, charCount: number): Promise<number> {
    // Implement cost estimation
    throw new Error("estimateCost() is not implemented yet");
  }

  async testConnection(): Promise<boolean> {
    // Implement connection test
    throw new Error("testConnection() is not implemented yet");
  }
}

// 2. Register in factory
TranslationServiceFactory.createService("my-service", config);

📊 Performance

Processing Speed

  • OCR Recognition: ~2-5 seconds/page (depends on content complexity)
  • Translation: ~1-3 seconds/batch (10 text blocks)
  • Note Generation: <1 second
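
As a rough worked example, the figures above can be combined into a time range for a whole document. The function below is a back-of-the-envelope sketch: it assumes batches are translated one after another, ignores note generation (under a second), and uses names invented for this example.

```typescript
interface TimeRange {
  minSeconds: number;
  maxSeconds: number;
}

// Figures taken from the list above; real timings depend on network and content.
const OCR_SECONDS_PER_PAGE: [number, number] = [2, 5];
const TRANSLATION_SECONDS_PER_BATCH: [number, number] = [1, 3];
const BLOCKS_PER_BATCH = 10;

// Estimate total processing time, assuming batches run sequentially.
function estimateProcessingTime(pages: number, textBlocks: number): TimeRange {
  const batches = Math.ceil(textBlocks / BLOCKS_PER_BATCH);
  return {
    minSeconds:
      pages * OCR_SECONDS_PER_PAGE[0] +
      batches * TRANSLATION_SECONDS_PER_BATCH[0],
    maxSeconds:
      pages * OCR_SECONDS_PER_PAGE[1] +
      batches * TRANSLATION_SECONDS_PER_BATCH[1],
  };
}
```

For a 10-page paper with 120 text blocks (12 batches), this gives roughly 32 to 86 seconds. Running batches concurrently would shorten the translation part.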

Cost Estimation

Example: A 10-page English paper

| Item        | Quantity         | Cost (DeepL)                |
|-------------|------------------|-----------------------------|
| OCR         | 10 pages         | Free (API provider pricing) |
| Translation | ~5000 characters | $0.10                       |
| Total       | -                | ~$0.10                      |
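
The translation figure follows from the DeepL pricing quoted in the configuration section ($20 per million characters on the paid plan, 500,000 free characters per month). A minimal sketch of that arithmetic is shown below. The function and field names are made up for this example, and the rates are the ones stated in this README, so check the provider's current pricing before relying on them.

```typescript
// Pricing figures as quoted in this README; they may be out of date.
const FREE_CHARS_PER_MONTH = 500_000;
const PAID_USD_PER_MILLION_CHARS = 20;

interface CostEstimate {
  costUsd: number;
  withinFreeQuota?: boolean; // only set when the free API is used
}

// Estimate the DeepL cost of translating `charCount` characters.
function estimateDeepLCost(
  charCount: number,
  useFreeApi: boolean,
  charsUsedThisMonth = 0,
): CostEstimate {
  if (charCount < 0 || charsUsedThisMonth < 0) {
    throw new RangeError("character counts must be non-negative");
  }
  if (useFreeApi) {
    return {
      costUsd: 0,
      withinFreeQuota:
        charsUsedThisMonth + charCount <= FREE_CHARS_PER_MONTH,
    };
  }
  return { costUsd: (charCount * PAID_USD_PER_MILLION_CHARS) / 1_000_000 };
}
```

With the paid plan, `estimateDeepLCost(5000, false).costUsd` evaluates to 0.1, which matches the table.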

🐛 Troubleshooting

Issue: Right-click menu not showing

Solutions:

  1. Check if plugin is correctly installed and enabled
  2. Restart Zotero
  3. Check error console (Tools → Developer → Error Console)

Issue: OCR recognition fails

Possible Causes:

  • API URL or Token incorrect
  • PDF file corrupted or encrypted
  • Network connection issues

Solutions:

  1. Verify API configuration
  2. Test API connection
  3. Check if PDF file opens normally

Issue: Translation fails

Possible Causes:

  • Invalid or expired API Key
  • Quota limit exceeded
  • Text too long

Solutions:

  1. Verify API Key
  2. Check account balance/quota
  3. Reduce batch size
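
Transient failures such as network hiccups or rate limiting can often be handled by retrying with an increasing delay. The sketch below shows one common approach (exponential backoff). It is not the plugin's built-in retry mechanism, and `withRetry`, `backoffDelayMs`, and their default values are assumptions for this example.

```typescript
// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `operation` up to `maxAttempts` times, waiting between failures.
// The last error is rethrown if every attempt fails.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

Retrying does not help with an invalid API key or an exhausted quota. Those need the configuration or account fixes listed above.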

🤝 Contributing

Contributions of code, issue reports, or suggestions are welcome!

How to Contribute

  1. Fork this repository
  2. Create feature branch (git checkout -b feature/AmazingFeature)
  3. Commit changes (git commit -m 'Add some AmazingFeature')
  4. Push to branch (git push origin feature/AmazingFeature)
  5. Open Pull Request

Development Guidelines

  • Follow TypeScript best practices
  • Add appropriate comments and documentation
  • Ensure code passes ESLint checks
  • Maintain consistent code style (use Prettier)
  • Add test cases for new features

📅 Roadmap

v0.1.0 (Current Version) - Core Features MVP ✅

  • OCR recognition system
  • Bilingual translation services (DeepL + OpenAI)
  • Basic Action strategies
  • Configuration management
  • UI components and menus
  • Batch processing
  • Cost estimation

v0.2.0 - Stability and Optimization (Expected 2026 Q1)

  • Complete unit test coverage
  • Integration test framework
  • Enhanced error handling
    • Detailed error messages
    • Optimized auto-retry mechanism
    • Failure recovery strategies
  • Performance optimization
    • Caching mechanism (translation result cache)
    • Optimized concurrency control
    • Improved memory management
  • Logging system
    • Structured logging
    • Log level control
    • Export log files

v0.3.0 - Feature Enhancement (Expected 2026 Q2)

  • Multi-language Support
    • Japanese, Korean, French, German, etc.
    • Automatic language detection
    • Multi-target language translation
  • Advanced Content Processing
    • Code block recognition and syntax highlighting
    • Citation link processing
    • Footnote and reference handling
    • Image OCR (text in images)
  • Note Enhancement
    • Custom note templates
    • Export to multiple formats (PDF, Word, HTML)
    • Note tags and categorization
  • More Translation Services
    • Google Translate API
    • Azure Translator
    • Other LLMs (Claude, Gemini)

v0.4.0 - Collaboration and Sync (Expected 2026 Q3)

  • Cloud Features
    • Cloud configuration sync
    • Translation history
    • Glossary management
  • Team Collaboration
    • Shared glossaries
    • Translation template sharing
    • Annotation and commenting features
  • API and Integration
    • REST API interface
    • Webhook support
    • Integration with other tools (Notion, Obsidian)

v1.0.0 - Official Release (Expected 2026 Q4)

  • Complete documentation and tutorials
  • Video demos and user guides
  • Multi-platform testing (Windows, macOS, Linux)
  • Performance benchmarks
  • User feedback collection and improvements
  • Official release and promotion

Long-term Planning

  • AI Enhancement
    • Intelligent summary generation
    • Keyword extraction
    • Topic classification
    • Related literature recommendations
  • Advanced Analytics
    • Literature analysis tools
    • Statistics and visualization
    • Trend analysis
  • Mobile Support
    • iOS app
    • Android app
    • Mobile note sync

📝 Changelog

v0.1.0 (2026-01-09)

  • 🎉 Initial release
  • ✨ OCR recognition feature
  • ✨ Bilingual translation (DeepL + OpenAI)
  • ✨ Batch processing support
  • ✨ Configuration management system
  • ✨ UI components and dialogs
  • ✨ Cost estimation feature

🐛 Known Issues

High Priority

  • Large PDF files (>100 pages) may time out
  • DeepL free API limitation handling

Medium Priority

  • Complex table recognition accuracy needs improvement
  • Formula explanations occasionally imprecise
  • Batch processing progress update delay

Low Priority

  • Preferences interface UI optimization
  • Error messages need to be more user-friendly

💡 FAQ

Q: What types of PDFs are supported?

A: All standard PDF files are supported, including scanned and text-based versions. Scanned versions require OCR recognition.

Q: How long does OCR recognition take?

A: Typically 2-5 seconds per page, depending on content complexity and network speed.

Q: How are translation costs calculated?

A:

  • DeepL: Charged by character count, free version 500,000 characters/month
  • OpenAI: Charged by token, refer to model pricing

Q: Is offline use supported?

A: Not yet. The plugin currently needs a network connection to call the OCR and translation APIs. Offline mode may be added in the future.

Q: How to handle recognition errors?

A: Generated notes can be edited manually to correct errors. Enabling the "Confirm Before Process" option in settings is also recommended.

Q: Are other language translations supported?

A: The plugin currently focuses on English-to-Chinese translation. Support for other languages is planned for v0.3.0.

🙏 Acknowledgments

📄 License

This project is licensed under AGPL-3.0.


If this project helps you, please give it a ⭐️ Star!

Issues and Pull Requests are welcome!
