Get text between bookmarks #2177
If I understand correctly, you just need to extract the text at a finer level of detail. You seem to have been using |
Indeed, I am using doc[0].get_textpage().extractBLOCKS(). If I understand correctly, doc[0].get_textpage().extractDICT() breaks each block down into its lines (and the spans within them), each with its bbox coordinates. Thank you.
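For anyone reading later, here is a minimal sketch of what that line-level view could look like. It assumes the usual extractDICT() layout (blocks containing lines containing spans); the helper names iter_lines and page_lines are made up for illustration and are not part of PyMuPDF.

```python
def iter_lines(page_dict):
    """Yield (bbox, text) for each text line in an extractDICT()-style dict."""
    for block in page_dict.get("blocks", []):
        if block.get("type", 0) != 0:  # type 1 is an image block, skip it
            continue
        for line in block.get("lines", []):
            # A line is split into spans (runs of uniform font); join them back.
            text = "".join(span["text"] for span in line.get("spans", []))
            yield tuple(line["bbox"]), text


def page_lines(pdf_path, page_number=0):
    """Hypothetical wrapper: open a PDF and list the lines of one page."""
    import fitz  # PyMuPDF, imported lazily so iter_lines works without it

    with fitz.open(pdf_path) as doc:
        page_dict = doc[page_number].get_textpage().extractDICT()
        return list(iter_lines(page_dict))
```

With lines separated like this, a heading such as "Sample" shows up as its own entry even when extractBLOCKS() merges it into the paragraph that follows.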
Hi,
I am looking for a way to extract text between two bookmarks in a document.
I have tried looping through each page and searching for an exact match of a bookmark title within each block of the page, but some bookmark titles sit inside larger blocks of text (e.g. "Sample\nWe collected...", where "Sample" is a bookmark).
Currently, the only solution I can think of is to split all blocks based on "\n" and look for the exact match of a bookmark in each one.
But is there an easier way to extract text between bookmarks using PyMuPDF?
Thanks
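The split-on-"\n" idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: text_between and block_texts are hypothetical helper names, bookmark titles are assumed to appear on the page as a full line of their own, and block tuples are assumed to have the text at index 4 and the block type at index 6.

```python
def text_between(blocks, start_title, end_title=None):
    """Return the lines found after start_title and before end_title.

    blocks is a list of block strings; each is split on newlines so that a
    title embedded in a larger block ("Sample\nWe collected...") still matches.
    If end_title is None, everything after start_title is returned.
    """
    collected = []
    inside = False
    for block in blocks:
        for line in block.split("\n"):
            stripped = line.strip()
            if not inside:
                if stripped == start_title:
                    inside = True
                continue
            if end_title is not None and stripped == end_title:
                return "\n".join(collected)
            if stripped:
                collected.append(stripped)
    return "\n".join(collected)


def block_texts(pdf_path):
    """Hypothetical wrapper: collect the text of every text block in a PDF."""
    import fitz  # PyMuPDF, imported lazily so text_between works without it

    texts = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Each block tuple: (x0, y0, x1, y1, text, block_no, block_type)
            texts.extend(b[4] for b in page.get_text("blocks") if b[6] == 0)
    return texts
```

The titles themselves could come from doc.get_toc(), whose entries start with [level, title, page]. A title that is wrapped over two lines on the page, or worded differently from the bookmark, would not match with this exact comparison.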