Your extraction method cannot handle images that have an image mask.
Your PDF, however, contains 2 images, each with its own mask (the second item of each tuple below is the xref of the mask):

>>> from pprint import pprint
>>> 
>>> pprint(page.get_images(True))
[(19, 25, 419, 64, 8, 'DeviceRGB', '', 'Img1', 'FlateDecode', 0),
 (20, 26, 419, 64, 8, 'DeviceRGB', '', 'Img10', 'FlateDecode', 0)]
>>> 

To extract such images, the base image and its mask must be recombined explicitly, e.g. for the first one (xref 19, mask xref 25):

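import fitz  # PyMuPDF

# doc is the already opened document
pix19 = fitz.Pixmap(doc, 19)    # base image, without transparency
mask = fitz.Pixmap(doc, 25)     # its image mask
pix = fitz.Pixmap(pix19, mask)  # base image with the mask applied
pix.save("test.png")  # fully recovered image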
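The same steps can be generalized to every masked image on a page. The following is only a minimal sketch, assuming the tuple layout shown in the `get_images` output above (xref first, mask xref second, with 0 meaning "no mask"). The helper names `masked_images`, `recover_pixmap`, and `save_page_images`, as well as the output file naming, are hypothetical and not part of PyMuPDF.

```python
def masked_images(image_list):
    """Return (xref, mask_xref) pairs for entries that have a mask.

    image_list is the result of page.get_images(True); item 0 is the
    image xref and item 1 the xref of its mask (0 if there is none).
    """
    return [(item[0], item[1]) for item in image_list if item[1] > 0]


def recover_pixmap(doc, xref, mask_xref):
    """Rebuild one image, applying its mask when there is one."""
    import fitz  # PyMuPDF, imported here so the helper above works without it

    pix = fitz.Pixmap(doc, xref)
    if mask_xref > 0:
        mask = fitz.Pixmap(doc, mask_xref)
        pix = fitz.Pixmap(pix, mask)  # same call as in the answer above
    return pix


def save_page_images(doc, page, prefix="img"):
    """Save every masked image of a page as <prefix>-<xref>.png (hypothetical naming)."""
    for xref, mask_xref in masked_images(page.get_images(True)):
        recover_pixmap(doc, xref, mask_xref).save(f"{prefix}-{xref}.png")
```

With the PDF from this discussion, `masked_images(page.get_images(True))` would yield the pairs for xref 19 and xref 20, so `save_page_images(doc, page)` would write one PNG per image. Images without a mask are skipped by this sketch and can be extracted the usual way.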

Answer selected by YashMistry349
This discussion was converted from issue #1406 on November 16, 2021 10:53.