This repository was archived by the owner on Apr 11, 2025. It is now read-only.
forked from camelot-dev/camelot
How can I read a table that starts on page 1 and extends across multiple pages? #192
Open
Description
pypdf_table_extraction/camelot does not recognize the table on pages after page 1 with the lattice flavor.
With the stream flavor, I get garbled output like this:
0 1 2 3 4 5
0 2059001013453712313
1 289 Transakcije po nalogu građana PBO:
2 MARY MILAN
3 5 12.05.2024. 12.05.2024. n 9001013454849 III rata maj PBZ: 1.600,00
4 KNEZ MILET 456 4 11
5 Instant nalog FT241123YJFB4
6 Belgrade
This is the lattice output from page one, which looks great:
0 REDNI\nBROJ DATUM\nPRIJEMA DATUM\nIZVRŠENJA ... REFERENCA KLIJENTA\nREFERENCA PARTNERA\nREFERE... NA TERET U KORIST
1 1 11.05.2024. 12.05.2024. ... PBO:\nPBZ:\nFT201661TXR4 4.200,00
2 2 12.05.2024. 12.05.2024. ... PBO:\nPBZ:\nFT20122CK6Y6 5.600,00
3 3 12.05.2024. 12.05.2024. ... PBO:\nPBZ:\nFT20134Y5NWL 5.600,00
4 4 12.05.2024. 12.05.2024. ... PBO:\nPBZ:\nFT20124QY6JZ 5.600,00
The document is a PDF bank statement.
NOTE: I have randomized the numbers in the output for privacy and security purposes.
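For context, here is a minimal sketch of how a multi-page read is usually requested and how the per-page tables could be joined afterwards. It is not a confirmed fix for the lattice problem described above. It assumes the library is importable as `camelot` and that the continuation pages either repeat the header row exactly or omit it. The helper names `stitch_rows` and `read_all_pages` are hypothetical and not part of the library.

```python
def stitch_rows(tables, header_rows=1):
    """Concatenate per-page tables (lists of rows), keeping the header once.

    A continuation page's leading rows are dropped only when they match
    the first table's header exactly; otherwise all rows are kept.
    """
    stitched = []
    header = None
    for rows in tables:
        if not rows:
            continue  # skip pages where nothing was extracted
        if header is None:
            header = rows[:header_rows]
            stitched.extend(rows)
        elif rows[:header_rows] == header:
            stitched.extend(rows[header_rows:])
        else:
            stitched.extend(rows)
    return stitched


def read_all_pages(pdf_path, flavor="lattice"):
    """Run the extractor over every page and return the stitched rows."""
    # Assumption: the package exposes the `camelot` module name.
    import camelot  # imported lazily so stitch_rows works without it

    # pages="1-end" asks for every page instead of the default first page.
    tables = camelot.read_pdf(pdf_path, pages="1-end", flavor=flavor)
    return stitch_rows([t.df.values.tolist() for t in tables])
```

Calling `read_all_pages("statement.pdf")` (a placeholder file name) would return one list of rows if tables are detected on the later pages. If lattice still finds nothing after page 1, the result simply contains the page 1 table.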