Inconsistency in search_for Results: Seeking Single fitz.Rect for Entire Phrases in PyMuPDF #3046
Mazzesy asked this question in Looking for help
I haven't looked at all the details yet, but I saw that your issue happens in paragraphs with justified text alignment.
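If the split really does come from the wider word gaps in justified lines (an assumption, not something confirmed in this thread), one possible workaround is to merge the returned rectangles yourself. The sketch below joins consecutive hits that sit on the same text line. The helper names merge_hits and find_phrase and the tol parameter are hypothetical, not part of PyMuPDF; the only library call used is page.search_for.

```python
def merge_hits(rects, tol=2.0):
    """Join consecutive rectangles that lie on the same text line.

    Each item in rects is anything that unpacks to (x0, y0, x1, y1),
    for example a fitz.Rect or a plain tuple. tol is a hypothetical
    tolerance in points for deciding that two boxes share a line.
    """
    merged = []
    for r in rects:
        x0, y0, x1, y1 = tuple(r)[:4]
        if merged:
            px0, py0, px1, py1 = merged[-1]
            # Same line if top and bottom edges roughly agree.
            if abs(py0 - y0) <= tol and abs(py1 - y1) <= tol:
                merged[-1] = (min(px0, x0), min(py0, y0),
                              max(px1, x1), max(py1, y1))
                continue
        merged.append((x0, y0, x1, y1))
    return merged


def find_phrase(page, phrase, tol=2.0):
    """Search a page and return merged bounding boxes as plain tuples."""
    return merge_hits(page.search_for(phrase), tol)
```

With a real document this would be called as find_phrase(doc[page_number], "Verordnung (EG) Nr. 617/2008"), and each resulting tuple can be passed to fitz.Rect(...) if a Rect object is needed. Note the limitations: two separate occurrences of the phrase on the same line would also be merged, and a phrase that wraps onto a second line still yields one box per line.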
I will move this post to Discussions for further communication.
Description of the bug
I am using the search_for function from the PyMuPDF library to find the positions of phrases within a PDF document. Specifically, I am working with this document (Link).
When I search for the phrase VERORDNUNG (EU) 2022/2379, search_for returns a single fitz.Rect covering the entire phrase. However, when I search for Verordnung (EG) Nr. 617/2008, it returns separate fitz.Rect instances, one for each part of the phrase.
My objective is to consistently obtain a single fitz.Rect for the entire phrase. Could someone please explain the reason behind this discrepancy in the function's output?
I have also posted the same question on Stack Overflow (Link).
How to reproduce the bug
PDF file that produces the inconsistency: CELEX 32022R2379 DE TXT.pdf
PyMuPDF version
1.23.14
Operating system
macOS
Python version
3.9