As a first step, consider switching to pypdf, as PyPDF2 is EOL. Further recommendations are hard to give in general, but it might help to know the approximate location of the dates on the page and combine that with a text visitor function, so that only the relevant values are retrieved.
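To make the visitor idea a bit more concrete, here is a minimal sketch, not a definitive implementation. It assumes pypdf's `extract_text(visitor_text=...)` hook, which calls the visitor with the text, the current transformation matrix, the text matrix, the font dictionary and the font size. The helper names, the bounding box and the `DD/MM/YYYY` date layout are assumptions made for illustration. Which matrix carries the position can differ between files, so the sketch combines both; it is worth printing the coordinates for one of your PDFs before settling on a box.

```python
import re
from datetime import datetime

# Assumed date layout (DD/MM/YYYY); adjust the pattern and format to your documents.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")
DATE_FORMAT = "%d/%m/%Y"


def make_region_visitor(x_min, y_min, x_max, y_max, parts):
    """Build a visitor that keeps only text whose origin lies inside the box."""

    def visitor(text, cm, tm, font_dict, font_size):
        # Map the text origin (tm[4], tm[5]) through the transformation matrix cm.
        x = cm[0] * tm[4] + cm[2] * tm[5] + cm[4]
        y = cm[1] * tm[4] + cm[3] * tm[5] + cm[5]
        if x_min <= x <= x_max and y_min <= y <= y_max:
            parts.append(text)

    return visitor


def find_valid_dates(text):
    """Return regex matches that also parse as real calendar dates."""
    dates = []
    for candidate in DATE_PATTERN.findall(text):
        try:
            datetime.strptime(candidate, DATE_FORMAT)
        except ValueError:
            continue  # matched the pattern but is not a real date, e.g. 31/02/2023
        dates.append(candidate)
    return dates


def dates_in_region(pdf_path, box, page_index=0):
    """Extract dates from one rectangular region of one page."""
    # Imported here so the helpers above can be tried without pypdf installed.
    from pypdf import PdfReader

    parts = []
    page = PdfReader(pdf_path).pages[page_index]
    page.extract_text(visitor_text=make_region_visitor(*box, parts))
    return find_valid_dates("".join(parts))
```

A call might look like `dates_in_region("invoice.pdf", (300, 700, 600, 800))`, where the file name and the box (in PDF units, usually measured from the bottom-left corner) are placeholders. Parsing each regex match with `datetime.strptime` is one way to drop strings that look like dates but are not, which may help with the false positives.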
Hello, I'm working on extracting dates from PDFs using PyPDF2 and re (for adding exceptions). What are some clever ways to increase the accuracy of the extracted strings? I'm running into false positives, and some of the extracted dates aren't even in the PDF to begin with.