Some content on a PDF page is not identified as either text or image #1876
Answered by JorjMcKie
UdayaKUnnikrishnan asked this question in Looking for help
The PDF page contains text that cannot be copied. Such text regions are not identified as either image or text by PyMuPDF.
Answered by JorjMcKie, Aug 16, 2022
You must provide an example - otherwise I cannot help.
As is typical for PDF, there are several possible explanations. Among them:
So-called "line art" (vector graphics) can be used to simulate text: for example, a capital "D" can be drawn as a vertical line "|" plus a left-open semicircle.
Line art is neither text nor image ...
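
One way to check whether a page falls into this category is to compare what PyMuPDF reports as text, as images, and as vector drawings. The following is only a minimal sketch: the helper names (`classify_page_content`, `looks_like_line_art_only`, `summarize_pdf`) are hypothetical, and the "line art only" rule is a rough heuristic assumed for illustration, not a PyMuPDF feature. It relies on the `Page` methods `get_text()`, `get_images()` and `get_drawings()`.

```python
def classify_page_content(page):
    """Count text characters, images and vector drawings on one page.

    `page` is expected to behave like a PyMuPDF Page, i.e. to provide
    get_text(), get_images() and get_drawings().
    """
    text = page.get_text("text").strip()
    images = page.get_images(full=True)
    drawings = page.get_drawings()  # vector graphics paths ("line art")
    return {
        "text_chars": len(text),
        "images": len(images),
        "drawings": len(drawings),
    }


def looks_like_line_art_only(summary):
    # Heuristic (an assumption for illustration): nothing extractable as
    # text or image, but at least one vector drawing is present.
    return (
        summary["text_chars"] == 0
        and summary["images"] == 0
        and summary["drawings"] > 0
    )


def summarize_pdf(path):
    # PyMuPDF is imported lazily so the helpers above can be used
    # (and tested) without it being installed.
    import fitz  # PyMuPDF

    doc = fitz.open(path)
    try:
        return [classify_page_content(page) for page in doc]
    finally:
        doc.close()
```

For a real file, something like `summarize_pdf("example.pdf")` (a placeholder path) would return one summary per page. A page with zero text characters and zero images but many drawings is a candidate for text simulated by line art, although ordinary diagrams and table borders are reported as drawings too, so the result needs a visual check.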