
feat: Extend the evaluators to support external predictions stored in files#185

Merged
nikos-livathinos merged 21 commits into main from nli/external_predictions
Dec 8, 2025

@nikos-livathinos
Member

Support external predictions according to the description given in #112

  • Extend all evaluators to support the optional parameter external_predictions_path.
  • If such a path is provided, it is used to load DoclingDocument objects from files instead of the parquet dataset.
  • The ground truth (GT) is always taken from the parquet dataset.
  • The path can contain files with predicted DoclingDocuments in various formats (json, doctags, yaml).
  • Update unit tests.
  • Extend the CLI of docling-eval evaluate with a new option:
      --external-predictions-path PATH    Path to load existing DoclingDocument predictions. The filename must follow the pattern [doc_id].[json|dt|yaml|yml]
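
The filename convention above can be illustrated with a minimal sketch. It is not the code from this PR: the helper names `index_external_predictions` and `find_prediction` are hypothetical, and the sketch only shows how files named `[doc_id].[json|dt|yaml|yml]` could be matched to document ids, without parsing them into DoclingDocument objects.

```python
from pathlib import Path
from typing import Dict, Optional

# Extensions listed in the PR description. The format labels on the
# right are illustrative and not taken from the docling-eval code base.
SUPPORTED_EXTENSIONS = {
    ".json": "json",
    ".dt": "doctags",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def index_external_predictions(predictions_dir: Path) -> Dict[str, Path]:
    """Map each doc_id to a prediction file (hypothetical helper)."""
    index: Dict[str, Path] = {}
    for path in sorted(predictions_dir.iterdir()):
        if not path.is_file():
            continue
        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            continue  # ignore files that do not match the pattern
        # The doc_id is the filename without its final extension.
        # If several formats exist for one doc_id, keep the first one
        # in sorted order (an assumption made for this sketch).
        index.setdefault(path.stem, path)
    return index


def find_prediction(index: Dict[str, Path], doc_id: str) -> Optional[Path]:
    """Return the prediction file for doc_id, or None if it is missing."""
    return index.get(doc_id)
```

An evaluator following this idea would take the ground truth for each `doc_id` from the parquet dataset and look up the prediction in the index. How a missing prediction is handled is a design choice that this sketch leaves open.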

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…mmy entries in all evaluators.

Extend the CLI to support the --external-predictions-path

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…various formats

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…th. Add unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…d unit test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
@nikos-livathinos nikos-livathinos marked this pull request as draft December 4, 2025 16:20
@github-actions
Contributor

github-actions bot commented Dec 4, 2025

DCO Check Passed

Thanks @nikos-livathinos, all your commits are properly signed off. 🎉

@nikos-livathinos nikos-livathinos self-assigned this Dec 4, 2025
@mergify

mergify bot commented Dec 4, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
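
As a rough local check, the title rule above can be reproduced with Python's `re` module. This is only a sketch: the function name `is_conventional_title` is hypothetical, and it assumes the `~=` operator behaves like a regular-expression search against the pull request title.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title starts with a conventional commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


title = "feat: Extend the evaluators to support external predictions stored in files"
print(is_conventional_title(title))
```

The title of this pull request starts with `feat:`, so it satisfies the pattern.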

🟢 Require two reviewer for test updates

Wonderful, this rule succeeded.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…it test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…dd unit test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…. Add unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…dd unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…ngOrderEvaluator. Fix main

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…oclingDocument from doctags and the GT image.
- Introduce the staticmethod load_doctags() which covers all cases on page image loading.
- Refactor the FilePredictionProvider to use the load_doctags() from ExternalDoclingDocumentLoader.
- Refactor all evaluators to use the new ExternalDoclingDocumentLoader.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
@nikos-livathinos nikos-livathinos marked this pull request as ready for review December 8, 2025 08:30
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…sing the API and the CLI.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
@nikos-livathinos nikos-livathinos merged commit 53dbd95 into main Dec 8, 2025
10 checks passed
@nikos-livathinos nikos-livathinos deleted the nli/external_predictions branch December 8, 2025 15:51
3 participants