Description
We want to speed up the evaluation runtime:
- Parallelize the computation of evaluation metrics. This is straightforward for metrics where each sample can be computed independently of the others.
- Redesign the data loading to pre-fetch and cache the ground truth and prediction documents from disk. The loaded `DoclingDocument` objects can then be shared across all evaluators, instead of having each evaluator reload the data. This is particularly useful on slow storage, where I/O time dominates.
- Refactor the API/CLI to allow the computation of multiple modalities in one go. This is essential to take advantage of the second point, as it maximizes the use of the pre-loaded data.
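To illustrate the caching idea, here is a minimal sketch of a cache that loads each document once and hands the same object to every evaluator. `DocumentCache`, `fake_loader`, and the dict-based documents are hypothetical stand-ins for illustration, not part of the existing code base; a real loader would read `DoclingDocument` objects from storage.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

D = TypeVar("D")


class DocumentCache(Generic[D]):
    """Loads each document at most once and returns the same object to every caller."""

    def __init__(self, loader: Callable[[str], D]) -> None:
        self._loader = loader          # hypothetical: e.g. reads one document from disk
        self._docs: Dict[str, D] = {}
        self._lock = threading.Lock()  # keeps the cache consistent under threads
        self.load_count = 0            # exposed only to make the saving visible

    def get(self, doc_id: str) -> D:
        # The lock is held while loading, so loads are serialized: simple, not optimal.
        with self._lock:
            if doc_id not in self._docs:
                self._docs[doc_id] = self._loader(doc_id)
                self.load_count += 1
            return self._docs[doc_id]


def fake_loader(doc_id: str) -> dict:
    """Stand-in for slow disk I/O; a real loader would parse a stored document."""
    return {"id": doc_id, "text": f"content of {doc_id}"}


cache = DocumentCache(fake_loader)
doc_ids = ["doc-1", "doc-2"]

# Two hypothetical evaluators walk the same documents; each document is loaded once.
text_lengths = [len(cache.get(d)["text"]) for d in doc_ids]
id_lengths = [len(cache.get(d)["id"]) for d in doc_ids]
```

With two evaluators and two documents, the loader runs twice instead of four times. The saving grows with the number of evaluators sharing the cache.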
For the computation of the evaluation metrics, we want to:
- Parallelize `TableEvaluator` to compute the TEDS score of each table independently.
- Parallelize `PixelLayoutEvaluator` to compute confusion matrices and metrics for each page independently.
- Parallelize `MarkDownTextEvaluator` to compute text metrics for each page independently.
...
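As a rough sketch of the per-sample parallelization, the snippet below maps a metric function over independent ground truth/prediction pairs with `concurrent.futures`. The names `score_samples` and `toy_token_overlap` are hypothetical, and the toy metric is only a placeholder for TEDS or the per-page metrics; it is not how the existing evaluators compute their scores.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


def toy_token_overlap(gt_text: str, pred_text: str) -> float:
    """Placeholder metric: Jaccard overlap of whitespace-separated tokens."""
    gt_tokens, pred_tokens = set(gt_text.split()), set(pred_text.split())
    if not gt_tokens and not pred_tokens:
        return 1.0
    return len(gt_tokens & pred_tokens) / len(gt_tokens | pred_tokens)


def score_samples(
    gt_items: Sequence[str],
    pred_items: Sequence[str],
    metric_fn: Callable[[str, str], float] = toy_token_overlap,
    max_workers: Optional[int] = None,
    use_processes: bool = True,
) -> List[float]:
    """Score each (ground truth, prediction) pair independently, in input order."""
    if len(gt_items) != len(pred_items):
        raise ValueError("gt_items and pred_items must have the same length")
    # Processes suit CPU-bound metrics; threads avoid pickling and start-up cost.
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(metric_fn, gt_items, pred_items))
```

When processes are used, `metric_fn` and its arguments must be picklable, and on platforms that start workers with `spawn` the call should sit under an `if __name__ == "__main__":` guard.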
For the optimizations in the data loading, we want to:
- Load and cache ground truth `DoclingDocument` objects from a `parquet` dataset.
- Load and cache prediction `DoclingDocument` objects from a `parquet` dataset.
- Load and cache prediction `DoclingDocument` objects from externally provided files (`dt`, `json`, `yaml`, etc.).
- Refactor the `evaluate()` method and the CLI to receive multiple modalities.
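One possible shape for a multi-modality entry point is sketched below: a single pass over the loaded document pairs, with every requested modality scored on each pair before moving on. `evaluate_modalities`, `SCORERS`, the two toy scoring functions, and the dict-based documents are assumptions made for illustration; the actual `evaluate()` signature and evaluators may differ.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Doc = Dict[str, list]  # hypothetical stand-in for a loaded document


def table_count_score(gt: Doc, pred: Doc) -> float:
    """Toy 'tables' score: 1.0 when both documents have the same number of tables."""
    return 1.0 if len(gt["tables"]) == len(pred["tables"]) else 0.0


def page_count_score(gt: Doc, pred: Doc) -> float:
    """Toy 'layout' score: 1.0 when both documents have the same number of pages."""
    return 1.0 if len(gt["pages"]) == len(pred["pages"]) else 0.0


# Hypothetical registry: modality name -> per-document scoring function.
SCORERS: Dict[str, Callable[[Doc, Doc], float]] = {
    "tables": table_count_score,
    "layout": page_count_score,
}


def evaluate_modalities(
    pairs: Iterable[Tuple[Doc, Doc]], modalities: List[str]
) -> Dict[str, List[float]]:
    """Score all requested modalities while iterating over the pairs only once."""
    unknown = sorted(set(modalities) - set(SCORERS))
    if unknown:
        raise ValueError(f"Unknown modalities: {unknown}")
    results: Dict[str, List[float]] = {m: [] for m in modalities}
    for gt, pred in pairs:      # each pair is consumed a single time
        for m in modalities:    # and reused for every modality
            results[m].append(SCORERS[m](gt, pred))
    return results
```

Because `pairs` can be a generator, each document pair only needs to be in memory while its scores are computed.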