There is indeed a difference between the two approaches:

  • images extracted via page.get_text("dict") are internally restricted to a clip rectangle equal to the page itself: any image not completely contained in page.rect is omitted
  • images reported by page.get_image_info() are not subject to this restriction

You can lift the restriction by passing clip=fitz.INFINITE_RECT() to the get_text() method.
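
As a minimal sketch of the comparison described above, the following counts image blocks per page in three ways: get_text("dict") with its default clip, get_text("dict") with clip=fitz.INFINITE_RECT(), and get_image_info(). The helper names text_dict_images and compare_image_counts are hypothetical and chosen for illustration only, as is any file path you pass in.

```python
def text_dict_images(page, clip=None):
    """Return the image blocks (type 1) found by page.get_text("dict")."""
    blocks = page.get_text("dict", clip=clip)["blocks"]
    return [b for b in blocks if b["type"] == 1]


def compare_image_counts(path):
    """Print, for each page, the image counts of the three variants."""
    import fitz  # PyMuPDF; imported here so the helper above stays dependency-free

    doc = fitz.open(path)
    for page in doc:
        # Default: images must lie completely inside page.rect.
        default = text_dict_images(page)
        # Infinite clip: images extending beyond the page are kept as well.
        unclipped = text_dict_images(page, clip=fitz.INFINITE_RECT())
        # get_image_info() applies no page-rectangle restriction.
        info = page.get_image_info()
        print(page.number, len(default), len(unclipped), len(info))
```

Calling compare_image_counts on a PDF whose images extend past the page boundary should show a smaller first count than the other two, assuming the behavior described in the answer.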

Answer selected by JorjMcKie
Category: Q&A
Labels: not a bug (not a bug / user error / unable to reproduce), question

This discussion was converted from issue #2847 on November 29, 2023 11:53.