Extracted tables are output as text and as table #3971
Unanswered
jicastillow
asked this question in
Looking for help
Replies: 0 comments
Hello there,
I'm trying PyMuPDF to extract text and tables from a document. However, I've noticed that invoking the to_markdown method produces the following output for this section of the document:
First, the raw extracted text of the table:
DEL 28 SEP AL 16 DEL 21 AL 30
DIC2024 DIC2024
HOTELES Y MOTONAVES
TPL TPL SGL SGL
DBL DBL
LUJO MOTONAVE: KAHILA/ PLUS JAMILA/ NILE MARQUISE/
ZEINA/ BLUE SHADOW /
(L2+) IBEROTEL CROWN EMPRESS
ROYAL RUBY/CONCERTO I O SIMILAR
HOTEL CAIRO: 1044$ 1491$ 1240$ 1927$ SEMIRAMIS INTERCONTINENTAL / CAIRO MARRIOTT HOTEL/ INTERCONTINENTAL CITY STARS/ HOLIDAY INN MAADI / DUSIT THANI RESORT
And then the correctly formatted table contents:
Is there some way to avoid outputting the raw text part? I only need the formatted table in Markdown.
Thanks in advance.
Regards.
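One possible workaround is to post-process the Markdown string and keep only the pipe-table blocks. The following is a minimal sketch, not a built-in PyMuPDF feature. It assumes the tables in the output are pipe tables with a `|---|---|` style separator row. The helper name `keep_markdown_tables` and the sample string are hypothetical, made up for illustration.

```python
import re

# Matches a pipe-table separator row such as |---|:---:|---|
_SEPARATOR = re.compile(r"^\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?$")


def keep_markdown_tables(markdown: str) -> str:
    """Return only the pipe-table blocks found in a Markdown string.

    Hypothetical helper: a block counts as a table when it is a run of
    consecutive lines starting with '|' that contains a separator row.
    """
    tables = []  # completed table blocks, each joined into one string
    block = []   # current run of consecutive lines starting with '|'

    def flush():
        # Keep the run only if it looks like a real table.
        if len(block) >= 2 and any(_SEPARATOR.match(row) for row in block):
            tables.append("\n".join(block))
        block.clear()

    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("|"):
            block.append(stripped)
        else:
            flush()
    flush()  # handle a table that ends the document
    return "\n\n".join(tables)


if __name__ == "__main__":
    # Made-up sample imitating the duplicated output described above.
    sample = (
        "HOTELES Y MOTONAVES\n"
        "TPL TPL SGL SGL\n\n"
        "|HOTELES Y MOTONAVES|DBL|SGL|\n"
        "|---|---|---|\n"
        "|HOTEL CAIRO|1044$|1491$|\n"
    )
    print(keep_markdown_tables(sample))
```

This drops all non-table text, so it only fits the case where the tables alone are needed. Another route that may be worth trying is to call `page.find_tables()` directly and convert each detected table with its `to_markdown()` method, which skips the full-page text conversion entirely.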