
Introduce a DoclingDocument Ground Truth DatasetBuilder #79

@nikos-livathinos

Description

  • Introduce a DoclingDocumentDatasetBuilder that builds Ground Truth datasets from lossless serializations of DoclingDocument objects (e.g. JSON files).
  • This is useful when the DoclingDocument objects were not produced by a Docling conversion pipeline but by a one-to-one translation from another annotation format (e.g. converting WordScape annotations to the DoclingDocument format).
  • It can extend the existing FileDatasetBuilder.
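
To make the proposal concrete, here is a minimal sketch of the core loop such a builder might run: walk a directory of serialized documents and emit one ground-truth record per file. It uses only the standard library, and every name in it (`GroundTruthRecord`, `load_json_dict`, `iter_ground_truth_records`) is hypothetical and not part of the existing codebase. The default loader just parses plain JSON into a dict; an actual implementation would presumably pass in a loader that deserializes each file into a DoclingDocument, and would hook into FileDatasetBuilder rather than stand alone.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, Iterator


@dataclass
class GroundTruthRecord:
    """Hypothetical record type: one ground-truth entry per serialized document."""

    doc_id: str
    source_path: Path
    document: Any


def load_json_dict(path: Path) -> dict:
    """Default loader: parse the file as plain JSON.

    Assumption: this is a stand-in for a loader that would return a
    DoclingDocument instance instead of a dict.
    """
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)


def iter_ground_truth_records(
    root: Path,
    loader: Callable[[Path], Any] = load_json_dict,
) -> Iterator[GroundTruthRecord]:
    """Yield one record per *.json file found under root, in sorted path order."""
    for path in sorted(root.rglob("*.json")):
        yield GroundTruthRecord(
            doc_id=path.stem,  # file name without extension serves as the id
            source_path=path,
            document=loader(path),
        )
```

Keeping the loader injectable means the directory traversal does not depend on the serialization format, so other lossless serializations could be supported by supplying a different loader.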
