feat: Implementation of table structure conversion from CVAT to DoclingDocument #151

Merged
cau-git merged 15 commits into main from dev/cvat_table_support on Sep 30, 2025

maxmnemonic (Member) commented Sep 4, 2025

New features and bug fixes for the CVAT to DoclingDocument conversion:

  • Convert table structure annotations, including cell merges and headers
  • Support both regular table cells and rich table cells
  • Tolerate missing rows in tables
  • Paragraphs and other elements sandwiched between list items now follow inline as list-item children, which preserves reading order
  • Fix a bug where Picture and Table elements were not included when a to_caption link is present
  • A few other minor fixes to the CVAT utils
  • Enable furniture output in the HTML visualization for GT/prediction datasets
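
To illustrate the cell-merge and header handling listed above, here is a minimal sketch of how a merged-cell annotation might be mapped onto a row/column grid. It is not the code from this PR: `Box`, `GridCell`, `cell_from_merge`, and the `min_cover` threshold are hypothetical names chosen for the example, and it assumes a top-left coordinate origin with row and column annotations given as bounding boxes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class Box:
    """Axis-aligned box, top-left origin (hypothetical helper type)."""
    l: float
    t: float
    r: float
    b: float


@dataclass(frozen=True)
class GridCell:
    """Cell position on the table grid; end indices are exclusive."""
    start_row: int
    end_row: int
    start_col: int
    end_col: int
    is_header: bool = False

    @property
    def row_span(self) -> int:
        return self.end_row - self.start_row

    @property
    def col_span(self) -> int:
        return self.end_col - self.start_col


def _covered_fraction(a0: float, a1: float, b0: float, b1: float) -> float:
    """Fraction of the interval [b0, b1] that is covered by [a0, a1]."""
    length = b1 - b0
    if length <= 0:
        return 0.0
    return max(min(a1, b1) - max(a0, b0), 0.0) / length


def cell_from_merge(
    merge: Box,
    rows: Sequence[Box],
    cols: Sequence[Box],
    header_rows: Set[int] = frozenset(),
    min_cover: float = 0.5,  # assumed threshold, not taken from the PR
) -> Optional[GridCell]:
    """Map a merged-cell box to the rows and columns it covers."""
    hit_rows = [i for i, r in enumerate(rows)
                if _covered_fraction(merge.t, merge.b, r.t, r.b) >= min_cover]
    hit_cols = [j for j, c in enumerate(cols)
                if _covered_fraction(merge.l, merge.r, c.l, c.r) >= min_cover]
    if not hit_rows or not hit_cols:
        return None  # the merge does not line up with the grid
    return GridCell(
        start_row=hit_rows[0],
        end_row=hit_rows[-1] + 1,
        start_col=hit_cols[0],
        end_col=hit_cols[-1] + 1,
        is_header=all(i in header_rows for i in hit_rows),
    )


# A 3x2 grid: three row bands and two column bands.
rows = [Box(0, 0, 100, 10), Box(0, 10, 100, 20), Box(0, 20, 100, 30)]
cols = [Box(0, 0, 50, 30), Box(50, 0, 100, 30)]

header = cell_from_merge(Box(0, 0, 100, 10), rows, cols, header_rows={0})
tall = cell_from_merge(Box(0, 10, 50, 30), rows, cols, header_rows={0})
```

In this sketch the span is derived purely from geometric overlap, so a table with a missing row annotation still yields a valid (if shorter) grid rather than an error.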

maxmnemonic requested a review from cau-git September 4, 2025 11:47
maxmnemonic self-assigned this Sep 4, 2025
maxmnemonic added the enhancement (New feature or request) label Sep 4, 2025
github-actions bot (Contributor) commented Sep 4, 2025

DCO Check Passed

Thanks @maxmnemonic, all your commits are properly signed off. 🎉

mergify bot commented Sep 4, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewer for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
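
The title rule above can be tried locally. The following is a minimal sketch that applies the same pattern with Python's `re` module; the helper name `is_conventional` is made up for this example, and Mergify's own matching may differ in details from `re.search`.

```python
import re

# Pattern copied verbatim from the merge-protection rule above.
TITLE_RE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the PR title starts with a conventional-commit prefix."""
    return TITLE_RE.search(title) is not None


# The two titles this PR has carried.
wip_title = "WIP: Implementation of table structure conversion from CVAT to DoclingDocument"
final_title = "feat: Implementation of table structure conversion from CVAT to DoclingDocument"

print(is_conventional(wip_title))
print(is_conventional(final_title))
```

The pattern accepts an optional scope in parentheses and an optional `!` marker before the colon, so a title such as `fix(cvat)!: ...` would also pass.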

maxmnemonic force-pushed the dev/cvat_table_support branch 3 times, most recently from f543463 to 67bca3e, September 5, 2025 17:58
maxmnemonic changed the title from "WIP: Implementation of table structure conversion from CVAT to DoclingDocument" to "feat: Implementation of table structure conversion from CVAT to DoclingDocument" Sep 5, 2025
maxmnemonic marked this pull request as ready for review September 5, 2025 18:01
maxmnemonic force-pushed the dev/cvat_table_support branch from 67bca3e to 4cb8bb7, September 8, 2025 13:31
cau-git and others added 4 commits September 9, 2025 14:27
…g support. Solve image embedding issues with PDFs

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
…ngDocument

merged with cau/add-granitedocling-support

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
cau-git and others added 11 commits September 12, 2025 14:07
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
…es, adjusted threshold for is_bbox_within

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
…n case of missing structure annotation

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
cau-git merged commit 208cd14 into main Sep 30, 2025
9 of 10 checks passed
cau-git deleted the dev/cvat_table_support branch September 30, 2025 10:32

Labels

enhancement New feature or request

2 participants