Commit ae38ba6

fix: update bindings for annotation extraction, WASM MIME exports, PaddleOCR validation (#403)
- Add assert_annotations helpers to all e2e generators (Go, Java, PHP, C#, Elixir, Python)
- Add PdfAnnotation types to TypeScript core bindings
- Add extract_annotations/margin fields to Ruby PdfConfig binding
- Export detectMimeFromBytes/getExtensionsForMime from WASM TypeScript bindings
- Remove stale kreuzberg-node artifacts/ directory
- Make PaddleOCR backend validation dynamic via plugin registry
- Fix Python tests expecting "json" to be invalid output format (now alias for "structured")
- Regenerate all e2e tests and re-vendor Ruby crates
1 parent 01a06fa commit ae38ba6
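
The "dynamic via plugin registry" change can be pictured with a minimal sketch. It assumes a simple in-memory registry; `OcrBackendRegistry`, `BackendNotRegisteredError`, and `validate_backend` are hypothetical names for illustration and are not taken from the project's code. The point is only that validation asks the registry what is currently registered instead of comparing against a fixed list of names.

```python
class BackendNotRegisteredError(ValueError):
    """Raised when a requested OCR backend is not in the registry (hypothetical)."""


class OcrBackendRegistry:
    """Hypothetical in-memory plugin registry keyed by backend name."""

    def __init__(self):
        self._backends = {}

    def register(self, name, backend):
        # Plugins call this at load time; nothing is hardcoded.
        self._backends[name] = backend

    def names(self):
        return sorted(self._backends)

    def __contains__(self, name):
        return name in self._backends


def validate_backend(name, registry):
    """Accept any backend that is currently registered, rather than a fixed list."""
    if name not in registry:
        raise BackendNotRegisteredError(
            f"OCR backend {name!r} is not registered; available: {registry.names()}"
        )
    return name


registry = OcrBackendRegistry()
registry.register("tesseract", object())
registry.register("paddleocr", object())  # plugin registered at runtime
validated = validate_backend("paddleocr", registry)
```

With a hardcoded list, a backend added by a plugin would be rejected even though it is available; looking it up in the registry avoids that class of false error.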

26 files changed: +1000 -2023 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Fixed

 - PDF table recognition now validates column alignment, preventing body text pages from being misclassified as tables
+- PaddleOCR backend validation now dynamically checks the plugin registry instead of hardcoding, preventing false "backend not registered" errors when the plugin is available (#403)
+- WASM bindings now export `detectMimeFromBytes` and `getExtensionsForMime` MIME utility functions
+- Node.js NAPI-RS binding correctly exposes `annotations` field on `ExtractionResult`
+- Python output format validation tests updated to reflect `json` as a valid format (alias for `structured`)

 ---
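
The last changelog entry treats `json` as an alias for `structured`. A minimal sketch of alias-aware validation follows; only the `json` to `structured` mapping comes from the changelog, while `normalize_output_format` and the other entries in `VALID_FORMATS` are assumptions made for illustration.

```python
# Only the "json" -> "structured" alias is stated in the changelog.
FORMAT_ALIASES = {"json": "structured"}

# Hypothetical set of canonical formats, for illustration only.
VALID_FORMATS = {"plain", "markdown", "structured"}


def normalize_output_format(value):
    """Resolve aliases first, then validate against the canonical names."""
    key = value.strip().lower()
    canonical = FORMAT_ALIASES.get(key, key)
    if canonical not in VALID_FORMATS:
        raise ValueError(f"invalid output format: {value!r}")
    return canonical
```

Under this sketch, a test that expects `"json"` to be rejected would fail, which is the kind of stale expectation the commit describes fixing.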
