As a preliminary point: MuPDF does not support all fonts.
Then there are fonts deliberately designed to resist text extraction.
There are also cases where text exists only as part of an image, or appears only as elementary drawing operations (for example, a capital "D" drawn as a "|" followed by a left-open semicircle).
In any of these cases you have to fall back to OCRing the page.
Please have a look at the example scripts in the Utilities repository.
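
To illustrate the fallback described above, here is a minimal sketch, not taken from the Utilities repository. It assumes PyMuPDF with a working Tesseract installation. `needs_ocr`, its thresholds, and `extract_page_text` are hypothetical names invented for this example. The heuristic (too little text, or too many U+FFFD replacement characters) is only an assumption about what "unextractable" looks like and will not catch every case.

```python
def needs_ocr(text: str, min_chars: int = 20, max_bad_ratio: float = 0.2) -> bool:
    """Hypothetical heuristic: True if extracted text looks empty or garbled."""
    stripped = text.strip()
    # Almost no text: probably an image-only page or text drawn as vector paths.
    if len(stripped) < min_chars:
        return True
    # Many U+FFFD characters suggest glyphs that could not be mapped to Unicode.
    bad = stripped.count("\ufffd")
    return bad / len(stripped) > max_bad_ratio


def extract_page_text(page, language: str = "eng", dpi: int = 300) -> str:
    """Try normal extraction first; fall back to OCR if the result looks unusable.

    `page` is expected to be a PyMuPDF Page object.
    """
    text = page.get_text()
    if not needs_ocr(text):
        return text
    # full=True renders the whole page to an image and OCRs it (needs Tesseract).
    textpage = page.get_textpage_ocr(language=language, dpi=dpi, full=True)
    return page.get_text(textpage=textpage)
```

With PyMuPDF installed, this could be called as `extract_page_text(page)` for each `page` of a document opened with `fitz.open(...)`. OCR is slow compared to normal extraction, which is why the sketch only uses it when the regular result looks unusable.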
@hellohr11
@JorjMcKie
Answer selected by hellohr11