Hi, spacypdfreader is a third-party package that's not maintained by us, so you might want to check out their repo / forums instead: https://github.com/SamEdwardes/spaCyPDFreader

I think this is a general problem with PDF processing: the text isn't necessarily stored in reading order in the underlying PDF, so extraction can return it out of order. A different PDF parser might give better results.
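To illustrate what "not stored in reading order" means, here is a minimal sketch in plain Python. It does not use spacypdfreader or any real PDF parser; the `Fragment` type, the `in_reading_order` helper, and the coordinates are all hypothetical, and a real parser's layout analysis is likely more involved than this. It only shows the general idea of reordering extracted fragments by their position on the page.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """Hypothetical text fragment with the top-left corner of its box."""
    text: str
    x: float  # distance from the left edge of the page
    y: float  # distance from the top edge of the page


def in_reading_order(fragments, line_tolerance=2.0):
    """Order fragments top-to-bottom, then left-to-right within a line.

    Fragments whose y values differ by at most `line_tolerance` from the
    first fragment of the current line are treated as the same line.
    """
    lines = []  # each entry is a list of fragments on roughly the same line
    for frag in sorted(fragments, key=lambda f: f.y):
        if lines and abs(frag.y - lines[-1][0].y) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    return [f for line in lines for f in sorted(line, key=lambda f: f.x)]


# Fragments in the order a PDF might store them (assumed example data).
stored = [
    Fragment("world", 60.0, 10.0),
    Fragment("Second line", 10.0, 30.0),
    Fragment("Hello", 10.0, 10.5),
]

text = " ".join(f.text for f in in_reading_order(stored))
print(text)
```

A simple sort like this breaks down on multi-column pages, tables, and footnotes, which is one reason results can differ a lot between parsers.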

Answer selected by adrianeboyd