feat: Add Split Table Merging to ChartTablePDFParser #81

Merged
AdemBoukhris457 merged 1 commit into main from feature/chart_table_parser_split_table_merging
Nov 15, 2025
@AdemBoukhris457
Owner

Summary

Adds automatic detection and merging of tables split across pages to ChartTablePDFParser, matching functionality in StructuredPDFParser and EnhancedPDFParser.

Changes

  • Added merge_split_tables parameter and configuration options
  • Integrated SplitTableDetector for split table detection
  • Skipped saving individual table segments when they are part of a merged table
  • Saved merged table images and added VLM extraction support for merged tables
  • Updated documentation in user guide and API reference
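
The "skip individual segments" behavior in the list above can be illustrated with a minimal sketch. This is not the code from this PR: `TableSegment` and `plan_outputs` are hypothetical names, and the sketch only assumes that split-table detection yields groups of segments belonging to the same logical table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSegment:
    """Hypothetical handle for one detected table region."""
    page: int
    index: int  # position of the table among the tables on its page


def plan_outputs(segments, merged_groups):
    """Decide what to save: one output per merged group, plus every
    segment that is not part of any merged group."""
    # Segments consumed by a merged table must not be saved individually.
    consumed = {seg for group in merged_groups for seg in group}
    outputs = [("merged", tuple(group)) for group in merged_groups]
    outputs += [("single", (seg,)) for seg in segments if seg not in consumed]
    return outputs


s0, s1, s2 = TableSegment(1, 0), TableSegment(2, 0), TableSegment(2, 1)
plan = plan_outputs([s0, s1, s2], merged_groups=[[s0, s1]])
```

Here `s0` and `s1` are emitted once as a merged table, and only `s2` is emitted on its own.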

Technical Details

Uses the same two-phase detection algorithm (proximity + structural validation) as other parsers for consistency.
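
As a rough illustration of what a two-phase check (proximity, then structural validation) might look like, here is a self-contained sketch. It is not the `SplitTableDetector` implementation: the `TableBox` type, the function names, the thresholds, and the specific criteria (page-edge margins, column count, width similarity) are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TableBox:
    """Hypothetical table region; coordinates are fractions of the page size."""
    page: int
    top: float
    bottom: float
    left: float
    right: float
    n_columns: int


def is_proximity_candidate(a, b, edge_margin=0.15):
    """Phase 1: `a` ends near the bottom of its page and `b` starts near
    the top of the immediately following page."""
    return (
        b.page == a.page + 1
        and a.bottom >= 1.0 - edge_margin
        and b.top <= edge_margin
    )


def is_structurally_compatible(a, b, width_tolerance=0.1):
    """Phase 2: both parts have the same column count and similar width."""
    width_a = a.right - a.left
    width_b = b.right - b.left
    similar_width = abs(width_a - width_b) <= width_tolerance * max(width_a, width_b)
    return a.n_columns == b.n_columns and similar_width


def is_split_table(a, b):
    """Only pairs that pass the cheap proximity test are validated structurally."""
    return is_proximity_candidate(a, b) and is_structurally_compatible(a, b)


first = TableBox(page=1, top=0.60, bottom=0.95, left=0.1, right=0.9, n_columns=4)
second = TableBox(page=2, top=0.05, bottom=0.30, left=0.1, right=0.9, n_columns=4)
other = TableBox(page=2, top=0.05, bottom=0.30, left=0.1, right=0.9, n_columns=3)
```

With these values, `is_split_table(first, second)` holds, while `is_split_table(first, other)` fails the structural phase because the column counts differ.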

Add automatic detection and merging of tables split across pages to ChartTablePDFParser. Includes configuration parameters, merged table processing with VLM support, and comprehensive documentation.
@AdemBoukhris457 self-assigned this Nov 15, 2025
@AdemBoukhris457 added the documentation and enhancement labels Nov 15, 2025
@AdemBoukhris457 merged commit 2f3d8d2 into main Nov 15, 2025
1 check passed
@AdemBoukhris457 deleted the feature/chart_table_parser_split_table_merging branch November 15, 2025 09:52
@AdemBoukhris457 restored the feature/chart_table_parser_split_table_merging branch November 15, 2025 16:04
AdemBoukhris457 added a commit that referenced this pull request Nov 15, 2025
Features:
- Add PaddleOCRVL PDF parser with restoration and split table merging (#82)
- Add split table merging support to EnhancedPDFParser (#80)
- Add split table merging support to ChartTablePDFParser (#81)
@AdemBoukhris457 deleted the feature/chart_table_parser_split_table_merging branch November 15, 2025 16:52