AIwork4me/openclaw-paddleocr-skill

PaddleOCR Skills for OpenClaw

Chinese Documentation | English

Supercharge your OpenClaw with industry-leading PDF and image reading capabilities powered by PaddleOCR

Introduction

When working with complex documents containing tables, mathematical formulas, or special layouts, AI assistants often struggle to accurately extract content. PaddleOCR Skills solves this problem by integrating the power of China's leading open-source OCR engine into OpenClaw.

With just one prompt and 3 simple steps, your OpenClaw gains professional-grade document and image parsing capabilities.

What is PaddleOCR Skills?

PaddleOCR Skills is a collection of AI skills available on ClawHub that enables OpenClaw to process documents and images with exceptional accuracy. It offers two core skills:

📄 Document Parsing

Document understanding that goes beyond plain text extraction. It returns the complete document structure, preserving:

  • Text content with formatting
  • Tables with cell structure
  • Mathematical formulas (with LaTeX)
  • Charts and diagrams
  • Complex layouts (multi-column, headers/footers)
  • Reading order and document structure

Best for: Academic papers, financial reports, invoices, legal documents, multi-column layouts

ClawHub: paddleocr-doc-parsing

🔍 Text Recognition

Fast and accurate text extraction from images and PDFs, returning structured JSON data that is easy to process programmatically.

Best for: Screenshots, photos, scans, simple text extraction, quick OCR tasks

ClawHub: paddleocr-text-recognition


Skills Comparison

| Feature | Document Parsing | Text Recognition |
| --- | --- | --- |
| Primary Use | Complex document understanding | Fast text extraction |
| Tables | ✅ Full structure preserved | ⚠️ Text only |
| Formulas | ✅ LaTeX output | |
| Charts/Diagrams | ✅ Analyzed | |
| Layout Analysis | ✅ Complete structure | |
| Speed | Moderate | Fast |
| Output Format | Markdown + JSON | JSON |
| Best For | Academic papers, reports, invoices | Screenshots, simple images |

When to Use Each Skill

Use Document Parsing for:

  • Documents with tables (invoices, financial reports, spreadsheets)
  • Documents with mathematical formulas (academic papers, scientific documents)
  • Documents with charts and diagrams
  • Multi-column layouts (newspapers, magazines, brochures)
  • Any document requiring structured understanding

Use Text Recognition for:

  • Simple text-only extraction
  • Quick OCR tasks where speed is critical
  • Screenshots or simple images with clear text
  • When you need structured JSON output for processing
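
The guidance above can be sketched as a small routing function. This is only an illustration: `DocumentTraits` and `choose_skill` are hypothetical names invented for this example, and only the two skill identifiers come from ClawHub.

```python
from dataclasses import dataclass


@dataclass
class DocumentTraits:
    """Hypothetical description of an input document (not part of the skills)."""
    has_tables: bool = False
    has_formulas: bool = False
    has_charts: bool = False
    multi_column: bool = False
    needs_structure: bool = False


def choose_skill(traits: DocumentTraits) -> str:
    """Return the ClawHub skill name suggested by the guidance above."""
    needs_parsing = any((
        traits.has_tables,
        traits.has_formulas,
        traits.has_charts,
        traits.multi_column,
        traits.needs_structure,
    ))
    # Any structural feature points to Document Parsing;
    # plain text falls back to the faster Text Recognition skill.
    if needs_parsing:
        return "paddleocr-doc-parsing"
    return "paddleocr-text-recognition"


if __name__ == "__main__":
    print(choose_skill(DocumentTraits(has_tables=True)))
    print(choose_skill(DocumentTraits()))
```

In practice OpenClaw picks the skill from your prompt, so a function like this is mainly useful as a mental model of the decision.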

Why It Matters

The Problem

When you send PDFs or images with complex formatting to AI assistants, they often:

  • Lose table structure
  • Misinterpret formulas
  • Scramble multi-column layouts
  • Miss important formatting details

The Solution

PaddleOCR Skills provides:

  • Accuracy: Industry-leading OCR from PaddleOCR (80K+ GitHub stars)
  • Completeness: Preserves all document structure
  • Simplicity: One-prompt installation in OpenClaw
  • Free Tier: Official PaddleOCR API offers tens of thousands of free pages daily

Prerequisites

Before installing PaddleOCR Skills, you need:

  1. ClawHub Account - Register at clawhub.ai
  2. PaddleOCR API Access - Register at paddleocr.com

Installation Guide

Step 1: Get ClawHub Token

  1. Visit www.clawhub.ai and complete registration
  2. Navigate to Settings → Create Tokens
  3. Generate and copy your token (starts with clh_)

Step 2: Get PaddleOCR API Credentials

  1. Visit https://www.paddleocr.com and register
  2. Click the API button
  3. Find these interfaces:
    • PaddleOCR-VL-1.5 (Document Parsing)
    • PP-OCRv5 (Text Recognition)
  4. Copy the API_URL and TOKEN for each

Note: Official free tier supports tens of thousands of pages per day!
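
If you want to check your credentials before handing them to OpenClaw, a request can be prepared roughly as follows. This is a minimal sketch under assumptions: the JSON field names (`file`, `fileType`), the `fileType` values, and the `Authorization: token ...` header are not stated in this README, so compare them with the sample code shown on your API page before relying on them. `build_ocr_request` is a hypothetical helper name.

```python
import base64
import json
import urllib.request


def build_ocr_request(api_url: str, token: str, file_bytes: bytes,
                      is_pdf: bool) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for one document."""
    payload = {
        # Assumed schema: the file is sent as a Base64 string.
        "file": base64.b64encode(file_bytes).decode("ascii"),
        # Assumed convention: 0 = PDF, 1 = image.
        "fileType": 0 if is_pdf else 1,
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed header scheme; verify against the API page.
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and token; substitute the values copied in step 4.
    req = build_ocr_request("https://example.invalid/ocr", "YOUR_TOKEN",
                            b"fake image bytes", is_pdf=False)
    print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. The sketch stops short of that so it can be run without network access or valid credentials.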

Step 3: One-Prompt Installation

Send this prompt to your OpenClaw (replace the bracketed values):

Please install PaddleOCR skills for me:
ClawHub Token: [Your ClawHub Token, starts with clh_]
PaddleOCR API Configuration:
- Document Parsing API: [Your Document Parsing API URL]
- Text Recognition API: [Your Text Recognition API URL]
- Access Token: [Your PaddleOCR Access Token]
Execution Requirements: Please complete the following steps automatically:
Login to ClawHub → Install dependencies → Install skills → Write configuration → Run tests → Report installation status.

That's it! OpenClaw will automatically:

  1. Log in to ClawHub
  2. Install dependencies
  3. Install the skills
  4. Configure API credentials
  5. Run tests
  6. Report installation status
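
If you install the skills on several machines, filling in the prompt by script avoids leaving a bracketed placeholder behind. The sketch below reuses the prompt text from Step 3. `build_install_prompt` is a hypothetical helper, and the only validation it performs is the `clh_` prefix mentioned in Step 1 plus a check for empty or still-bracketed values.

```python
PROMPT_TEMPLATE = """Please install PaddleOCR skills for me:
ClawHub Token: {clawhub_token}
PaddleOCR API Configuration:
- Document Parsing API: {doc_parsing_url}
- Text Recognition API: {text_recognition_url}
- Access Token: {access_token}
Execution Requirements: Please complete the following steps automatically:
Login to ClawHub → Install dependencies → Install skills → Write configuration → Run tests → Report installation status."""


def build_install_prompt(clawhub_token: str, doc_parsing_url: str,
                         text_recognition_url: str, access_token: str) -> str:
    """Fill the Step 3 prompt, rejecting obviously unfilled values."""
    values = {
        "clawhub_token": clawhub_token.strip(),
        "doc_parsing_url": doc_parsing_url.strip(),
        "text_recognition_url": text_recognition_url.strip(),
        "access_token": access_token.strip(),
    }
    for name, value in values.items():
        # Catch empty strings and placeholders such as "[Your ... URL]".
        if not value or value.startswith("["):
            raise ValueError(f"{name} has not been filled in")
    if not values["clawhub_token"].startswith("clh_"):
        raise ValueError("ClawHub tokens are expected to start with 'clh_'")
    return PROMPT_TEMPLATE.format(**values)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_install_prompt("clh_example", "https://example.invalid/doc",
                               "https://example.invalid/ocr", "example-token"))
```

The printed text can be pasted into OpenClaw as-is.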

Connect With Us

Scan the QR code to follow our WeChat official account for more AI tips and tutorials:

[Image: WeChat QR code]


License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ for the OpenClaw community
