The Japanese PDF is garbled. #345
Unanswered
masaoy0730 asked this question in Q&A
Replies: 1 comment 3 replies
-
Please provide the PDF in question, along with a clear comparison of what is wrong in the output and what you would expect instead. That said, the pdfium update in v4.30.1 introduced a text extraction bug (see #336). I don't know whether this is the same issue or unrelated, but could you check whether extraction works correctly with v4.30.0 or v5.0.0b1?
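For the version comparison, a minimal sketch like the following could be run once under each installed pypdfium2 version and the outputs compared. The helper names `installed_version`, `extract_page_text`, and `japanese_ratio` are hypothetical, and the ratio is only a rough heuristic assumed here for illustration (it is not part of pypdfium2); comparing the printed text by eye against the PDF remains the real check.

```python
from importlib import metadata


def installed_version(package="pypdfium2"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def extract_page_text(pdf_path, page_index=0):
    """Extract the text of one page with pypdfium2."""
    # Imported lazily so the other helpers work without pypdfium2 installed.
    import pypdfium2 as pdfium

    pdf = pdfium.PdfDocument(pdf_path)
    page = pdf[page_index]
    textpage = page.get_textpage()
    try:
        return textpage.get_text_range()
    finally:
        # Close child objects before their parents.
        textpage.close()
        page.close()
        pdf.close()


def japanese_ratio(text):
    """Share of non-whitespace characters that are kana or CJK ideographs.

    Hypothetical heuristic: a low value for a page known to be Japanese
    hints that the extracted text is garbled.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0

    def is_japanese(c):
        cp = ord(c)
        # Hiragana and Katakana blocks, then CJK Unified Ideographs.
        return 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF

    return sum(is_japanese(c) for c in chars) / len(chars)
```

Calling `extract_page_text("sample.pdf")` (the file name is a placeholder) after `pip install pypdfium2==4.30.0`, and again after installing 4.30.1, and posting `installed_version()` alongside both results would show whether the garbling is tied to the v4.30.1 update.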
-
I am trying to extract text from Japanese PDFs, but the extracted text is garbled.
I am using pypdfium2 4.30.1 with pdfium 133.0.6899.0.
The attached image shows an example.
When I use the same file with Azure AI Search, the text is not garbled, so I don't think the PDFs themselves are the problem.