Releases: docling-project/docling-eval
v0.10.0
Feature
- Extend the CLI for create-eval to receive the vlm-options and max_new_tokens parameters when the provider is GraniteDocling (#164) (8be2e83)
- Harmonizing pic classes for cvat to docling conversion (#167) (740157d)
- Add more specific validation for reading-order, enhance validation report (5e5f2db)
- Integrate textline_cells based OCR evaluation (#156) (3a9543c)
Fix
- Validation fixes for list item impurity check (#169) (74e7b3e)
- Don't report content-layer group violation multiple times (cb71009)
- Handle merged elements regarding inclusion, don't flag single element pages (c10fdfd)
- Missing transform to storage_scale for some items and table cells (1eb6b4e)
- More CVAT validation and docling conversion fixes (#163) (6f59c7a)
- Better control over scaling in CVAT transform, fixes for OCR (#162) (ef17b5a)
- Fixes for CVAT validation, OCR in CVAT pipeline, logging, and more (#161) (80e449d)
Performance
v0.9.0
v0.8.1
v0.8.0
What's Changed
- feat: Extend the Consolidator to export LaTeX files alongside the Excel report by @nikos-livathinos in #143
- feat: Extend the DoclingEvalCOCOExporter to export a parquet dataset in COCO format by @nikos-livathinos in #145
- feat: Several fixes and campaign tools extensions by @cau-git in #150
- feat: Add Table structure evaluations for TEDS by @praveenmidde in #94
Full Changelog: v0.7.0...v0.8.0
v0.7.0
v0.6.0
Feature
- Layout evaluation fixes, mode control and cleanup (#133) (629a451)
- Introduce utility to export layout predictions from HF parquet files into pycocotools format. (#125) (54f7c81)
- Add specific language support for XFUND dataset builder (#122) (4ca6a0e)
- Tooling for CVAT validation, to DoclingDocument transformation, new Evaluators (#119) (2ee1104)
Fix
- Move ibm-cos to hyperscaler (#135) (9aff6c1)
- Update hyperscalers to support multiple image file types (#118) (a34f264)
- Misc fixes (#131) (518e1ba)
- CVAT to DoclingDoc: Ensure that nested list handling works across page boundaries (#129) (1b58377)
- Important fixes for parquet serialization / deserialization, optimizations (#128) (53c22ef)
- Fixes for the dataset visualizers (#127) (a127ea9)
Performance
v0.5.0
v0.4.0
Feature
- Extend the FileProvider and the CLI to accept parameters that control the source of the prediction images (#111) (42e1615)
- Improvements for the MultiEvaluator (#95) (04fe2d9)
- Add extra args for docling-provider and default annotations for CVAT (#98) (7903b6a)
- Introduce SegmentedPage for OCR (#91) (be0ff6a)
- Update CVAT for multi-page annotation, utility to create sliced PDFs (#90) (28d166d)
- Add area level f1 (#86) (54d013b)
Fix
- Small fixes (#108) (0628fa6)
- Layout text not correctly populated in AWS prediction provider, add tests (#100) (6441688)
- Dataset feature spec fixes, cvat improvements (#97) (b79dd19)
- Update boto3 AWS client to accept service credentials (#88) (4e01d0b)
- Handle unsupported END2END evaluation and fix variable name in OCR (#87) (75311da)
- Propagate cvat parameters (#82) (1e2040a)
Documentation
Release v0.3.0
What's Changed
- feat: Update GoogleDocAIPredictionProvider to use service account creds by @samiuc in #73
- fix: Add CLI option for FileDatasetBuilder by @cau-git in #76
- feat: Consolidate multiple evaluation results and generate a comparison matrix by @nikos-livathinos in #64
- feat: OCR evaluator by @cau-git @samiuc in #63
Full Changelog: v0.2.0...v0.3.0
Release v0.2.0
What's Changed
- dev: Add DocVQA questions, more fixes by @cau-git in #58
- docs: Add README for Docling-DPBench by @cau-git in #60
- feat: Azure prediction provider by @cau-git in #50
- fix: Ensure that evaluators skip data samples without the SUCCESS or PARTIAL_SUCCESS status by @nikos-livathinos in #66
- feat: Support for S3 datasource by @praveenmidde in #65
- feat: PixParse OCR dataset builder by @cau-git in #61
- fix: Address table export_to_html deprecation by @cau-git in #67
- feat: AWS Textract and Google DocAI Prediction providers by @cau-git in #62
- feat: Refactor CVAT builder by @cau-git in #68
- fix: Address missing conversion status (PENDING), add artifacts path, remove unused CLI args by @cau-git in #69
- feat: FileDatasetBuilder by @cau-git in #70
Full Changelog: v0.1.0...v0.2.0