Remove image if page.get_image_info() returns multiple images at its location #3631
Unanswered
paulgekeler asked this question in Looking for help
Replies: 1 comment 1 reply
-
Without the PDF itself, there is no way to provide definitive advice, but you have a couple of options here:
-
Hi, thank you for maintaining this project. It really is exceptional.
I am trying to replace images in PDFs with different ones I have stored locally. Thanks to this GitHub discussion https://github.com/pymupdf/PyMuPDF/discussions/924#discussioncomment-7249686 I have no problem doing this as follows:
However, I have encountered a PDF page (see below) where get_image_info() returns multiple images for a single image on the page. I would like to replace only one of them, i.e. the one actually visible on the page. I am not familiar with how PDFs are assembled or whether images can be composed of multiple parts. Maybe that's what I'm missing.
I have tried to synchronise the source by calling page.clean_contents() beforehand, but that doesn't help.
Is there a way to recognise whether images returned by get_image_info() actually belong to the same image? (I could do something tedious like checking whether the bounding boxes are close enough, but that seems error-prone.) I know the returned images have different xrefs, so maybe they are different.
Thank you for some needed insight.
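For reference, the bounding-box check mentioned above could look roughly like the following. This is only a minimal sketch, not a confirmed solution: it assumes each entry is a dict with a "bbox" tuple (x0, y0, x1, y1) and an "xref", as I understand page.get_image_info(xrefs=True) to return. The helper names boxes_overlap and group_images_by_location and the tolerance tol are made up for illustration. It only groups entries that touch or overlap; it does not tell which of them is actually visible.

```python
def boxes_overlap(a, b, tol=1.0):
    """Return True if rectangles a and b, given as (x0, y0, x1, y1),
    intersect or lie within tol points of each other."""
    return (a[0] <= b[2] + tol and b[0] <= a[2] + tol
            and a[1] <= b[3] + tol and b[1] <= a[3] + tol)


def group_images_by_location(infos, tol=1.0):
    """Group image-info dicts whose bounding boxes touch or overlap.

    Hypothetical helper: expects dicts with a "bbox" key, shaped like
    the entries assumed to come from page.get_image_info(xrefs=True).
    """
    groups = []
    for info in infos:
        merged, untouched = [], []
        for group in groups:
            # A new entry may bridge several existing groups, so merge them all.
            if any(boxes_overlap(info["bbox"], m["bbox"], tol) for m in group):
                merged.extend(group)
            else:
                untouched.append(group)
        merged.append(info)
        groups = untouched + [merged]
    return groups


# Made-up example data: two stacked strips and one unrelated image.
infos = [
    {"xref": 10, "bbox": (100, 100, 200, 150)},
    {"xref": 11, "bbox": (100, 150, 200, 200)},
    {"xref": 12, "bbox": (300, 300, 400, 400)},
]
grouped_xrefs = [[m["xref"] for m in g] for g in group_images_by_location(infos)]
```

With the made-up data above, the two strips that share an edge end up in one group and the distant image in its own group. On a real page the input would presumably be infos = page.get_image_info(xrefs=True), and a suitable tol would depend on the document.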
