Extracting text contained inside a rect using quads #1340
-
Hi, I was wondering whether there is a way to extract the words contained within the area of a rect object, or within given coordinates, in a PDF file. Thank you in advance for your help.
Replies: 3 comments 3 replies
-
I was able to solve it by looking for words whose coordinates fall within a given interval.
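For anyone landing here later, a minimal sketch of that coordinate-interval approach. It assumes word items shaped like the ones returned by PyMuPDF's `page.get_text("words")`, i.e. `(x0, y0, x1, y1, text, block_no, line_no, word_no)`. The helper name `words_in_rect` and the sample data are made up for illustration, and the original poster's actual code may have looked different.

```python
# Sketch only: plain-Python filter, no PyMuPDF import needed.
# Word items follow the (x0, y0, x1, y1, text, ...) layout; extra items are ignored.

def words_in_rect(words, rect):
    """Return the text of every word whose bbox lies fully inside rect.

    rect is (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
    """
    rx0, ry0, rx1, ry1 = rect
    hits = []
    for word in words:
        x0, y0, x1, y1, text = word[:5]
        # Keep the word only if all four bbox edges fall inside the interval.
        if rx0 <= x0 and ry0 <= y0 and x1 <= rx1 and y1 <= ry1:
            hits.append(text)
    return hits


# Hypothetical sample data, not taken from a real PDF.
sample_words = [
    (10, 10, 40, 20, "Hello", 0, 0, 0),
    (45, 10, 80, 20, "world", 0, 0, 1),
    (10, 100, 50, 110, "footer", 1, 0, 0),
]
selected = words_in_rect(sample_words, (0, 0, 100, 50))
print(selected)
```

With a real document, `sample_words` would be replaced by the list from `page.get_text("words")`. This version keeps only words that lie completely inside the rectangle; relaxing the comparison to an overlap test would also catch words that merely touch it.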
-
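Oops, sorry: I forgot that this function is already included 🙄. You can do this:

    import fitz  # PyMuPDF

    # words: the list from page.get_text("words"); quad: the fitz.Quad to test against
    for word in words:
        bbox = fitz.Rect(word[:4])  # make bbox from the first four items of the word tuple
        if bbox in quad:
            print("word '%s' in quad" % word[4])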
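To make the geometry behind a "rect in quad" test more concrete, here is a rough plain-Python sketch. It is not PyMuPDF's implementation, and it assumes a convex quad. The function names are hypothetical. The corners must be passed in perimeter order (for example ul, ur, lr, ll), which differs from the ul, ur, ll, lr order in which `fitz.Quad` stores them.

```python
# Sketch only: a rectangle is inside a convex quad if all four of its corners are.

def _cross(o, a, p):
    """Z-component of the cross product (a - o) x (p - o)."""
    return (a[0] - o[0]) * (p[1] - o[1]) - (a[1] - o[1]) * (p[0] - o[0])


def point_in_convex_quad(p, corners):
    """True if p is inside or on the border of the convex quad.

    corners: four (x, y) points listed in order around the perimeter.
    """
    signs = []
    for i in range(4):
        c = _cross(corners[i], corners[(i + 1) % 4], p)
        if c != 0:  # points exactly on an edge count as inside
            signs.append(c > 0)
    # Inside means p is on the same side of every edge.
    return all(signs) or not any(signs)


def bbox_in_convex_quad(bbox, corners):
    """True if every corner of bbox (x0, y0, x1, y1) is inside the quad."""
    x0, y0, x1, y1 = bbox
    rect_corners = ((x0, y0), (x1, y0), (x1, y1), (x0, y1))
    return all(point_in_convex_quad(pt, corners) for pt in rect_corners)


# Hypothetical diamond-shaped quad, corners in perimeter order.
diamond = [(50, 0), (100, 50), (50, 100), (0, 50)]
inside = bbox_in_convex_quad((40, 40, 60, 60), diamond)
outside = bbox_in_convex_quad((0, 0, 20, 20), diamond)
print(inside, outside)
```

In real PyMuPDF code the built-in `bbox in quad` check shown above is the simpler choice; this sketch only illustrates why a rotated quad needs more than a min/max coordinate comparison.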
-
I have just updated the documentation: you will find my previous comments in the section for class Quad.