The two methods take completely different approaches. get_images() only works on PDFs, whereas get_image_info() works for all document types, just like get_text(), on which it is based.
In general, the sets of images they report are not equal. I discuss the background in detail in the documentation.

To support matching items between the two, get_image_info() accepts the xrefs parameter. If it is True, then image["xref"] can be used to locate the corresponding item in get_images().
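As a rough sketch of that matching (not an official recipe): the helper names match_by_xref and match_page_images are made up for illustration, and the sketch assumes that the first entry of each get_images() item is the xref and that an xref of 0 in get_image_info() means there is nothing to match.

```python
def match_by_xref(image_info, images):
    """Pair each get_image_info(xrefs=True) dict with its get_images() tuple.

    Assumption: item[0] of a get_images() tuple is the image xref.
    Entries whose xref is 0 or unknown are paired with None.
    """
    by_xref = {item[0]: item for item in images}
    pairs = []
    for info in image_info:
        xref = info.get("xref", 0)
        pairs.append((info, by_xref.get(xref) if xref else None))
    return pairs


def match_page_images(path, page_number=0):
    """Hypothetical convenience wrapper: open a PDF and match one page."""
    try:
        import pymupdf  # newer PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # older releases expose the module as fitz
    with pymupdf.open(path) as doc:
        page = doc[page_number]
        return match_by_xref(
            page.get_image_info(xrefs=True),
            page.get_images(full=True),
        )
```

Calling match_page_images("some.pdf") would then give a list of (info, item) pairs, where item is None for anything get_image_info() reports that has no counterpart in get_images().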

You can also use page.get_image_rects(item, transform=True), passing one of the items from get_images(), to get a list of the image's locations on the page (each together with its transformation matrix).
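A minimal sketch of that second route, going from get_images() to positions on the page. The function name locate_images is hypothetical, and the sketch assumes that item[0] is the xref and that each entry returned with transform=True is a (rect, matrix) pair.

```python
def locate_images(page):
    """Map each image xref on the page to its list of placements.

    `page` is expected to be a PyMuPDF Page of a PDF document.
    Assumption: with transform=True, every placement is a
    (rect, matrix) pair.
    """
    placements = {}
    for item in page.get_images(full=True):
        xref = item[0]  # assumed: first tuple entry is the xref
        placements[xref] = list(page.get_image_rects(item, transform=True))
    return placements
```

An image that is referenced by the page but never actually displayed would end up with an empty list here, which is one way the two methods can disagree.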

Answer selected by ayusonkj