Release v0.9.7 · AdemBoukhris457/Doctra

🚀 What's new in v0.9.7

PaddleOCR-VL PDF Parser (with Restoration + Split Tables): New PaddleOCRVL-powered PDF parser that combines layout-aware OCR, visual-language understanding, page restoration, and split table merging in a single high-level pipeline.
Split Table Merging Everywhere: Split table detection & merging is now available across ChartTablePDFParser and EnhancedPDFParser, so multi-page tables are reconstructed consistently whether you’re extracting text, tables, or charts.
Restoration-Friendly Flow: The new parser plays nicely with restoration steps (denoising, deblurring, cleanups), improving OCR and structure extraction on noisy reports and scanned PDFs.
Docs Upgrade: Documentation updated to explain when to use the new PaddleOCR-VL parser, how split table merging works across parsers, and how to configure these features in real-world workflows.
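
To make the split-table idea concrete, here is a minimal conceptual sketch of how fragments of a table that continues across pages could be stitched back together. It is not Doctra's implementation and does not use Doctra's API: `TableFragment`, `can_merge`, and `merge_split_tables` are hypothetical names, and the merge rule (consecutive pages, fragment touching the page edge, matching column count, repeated header dropped) is an assumed heuristic chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableFragment:
    """Hypothetical container for one table region found by a parser."""
    first_page: int
    last_page: int
    rows: List[List[str]]
    touches_page_top: bool = False      # fragment starts at the top of its page
    touches_page_bottom: bool = False   # fragment runs to the bottom of its page


def can_merge(prev: TableFragment, nxt: TableFragment) -> bool:
    """Assumed heuristic: consecutive pages, edge contact, same column count."""
    if nxt.first_page != prev.last_page + 1:
        return False
    if not (prev.touches_page_bottom and nxt.touches_page_top):
        return False
    if not prev.rows or not nxt.rows:
        return False
    return len(prev.rows[0]) == len(nxt.rows[0])


def merge_split_tables(fragments: List[TableFragment]) -> List[TableFragment]:
    """Stitch fragments that look like continuations of the previous table."""
    merged: List[TableFragment] = []
    for frag in sorted(fragments, key=lambda f: f.first_page):
        if merged and can_merge(merged[-1], frag):
            prev = merged[-1]
            rows = frag.rows
            # Drop a header row that is repeated on the continuation page.
            if rows[0] == prev.rows[0]:
                rows = rows[1:]
            merged[-1] = TableFragment(
                first_page=prev.first_page,
                last_page=frag.last_page,
                rows=prev.rows + rows,
                touches_page_top=prev.touches_page_top,
                touches_page_bottom=frag.touches_page_bottom,
            )
        else:
            merged.append(frag)
    return merged


if __name__ == "__main__":
    parts = [
        TableFragment(1, 1, [["Item", "Qty"], ["A", "1"]], touches_page_bottom=True),
        TableFragment(2, 2, [["Item", "Qty"], ["B", "2"]], touches_page_top=True),
    ]
    for table in merge_split_tables(parts):
        print(table.first_page, table.last_page, table.rows)
```

A real detector would likely also compare column positions and widths on the page; the column-count check here is only the simplest stand-in for that.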

✅ Motivation

Doctra is increasingly used on messy, real-world PDFs where tables are split across pages and visual context matters (charts, complex layouts, degraded scans). This release focuses on:

Making split-table merging a first-class feature across multiple parsers.
Introducing a PaddleOCR-VL–based parser that can better understand visual + textual context.
Tightening the integration with restoration so that users get more reliable structured outputs from imperfect documents.

🛠 What’s Changed

feat: Add PaddleOCRVL PDF parser with restoration and split table merging by @AdemBoukhris457 in #82
feat: Add split table merging to ChartTablePDFParser by @AdemBoukhris457 in #81
feat: Add split table merging support to EnhancedPDFParser by @AdemBoukhris457 in #80
docs: Document new PaddleOCR-VL parser & split-table merging behavior (usage, configuration, and examples)

📦 Version

v0.8.0 → v0.9.7
Minor feature-focused release that extends split-table merging across parsers and introduces a PaddleOCR-VL–powered PDF parser with restoration support. No breaking changes to the public API: existing workflows keep working and gain access to the new parsing options.
