Looking closer at the images on page 2, you will see that a number of them have masks: these items have a second entry > 0. For example, (53, 90, 173, 173, 8, 'DeviceRGB', '', 'Image53', 'DCTDecode') has 90 there. That value is the xref of an image mask, which must be applied to the base image to get the full picture. The following snippet extracts only the images that have a mask and recovers the full picture by applying the mask to the base image:

import fitz  # PyMuPDF

# assumes `doc` is an open fitz.Document and `page` is one of its pages
for item in page.get_images():
    xref = item[0]  # base image xref
    mask = item[1]  # mask xref
    if mask == 0:  # skip images that have no mask
        continue
    pix0 = fitz.Pixmap(doc, xref)  # pixmap of base image
    if pix0.alpha:  # remove alpha channel if present
        pix0 = fitz.Pixmap(pix0, 0)
    p…
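
Since the snippet above is cut off, here is a minimal sketch of how the remaining steps might look. It is not necessarily the original code. The names `masked_entries` and `recover_masked_images` are made up for illustration, and the second entry in `sample` is invented to show an image without a mask. The sketch assumes that the mask xref loads as a grayscale pixmap without alpha and with the same dimensions as the base image, which is what `fitz.Pixmap(source, mask)` expects.

```python
def masked_entries(image_list):
    """Return (xref, mask_xref) pairs for entries whose second field is > 0."""
    return [(item[0], item[1]) for item in image_list if item[1] > 0]


def recover_masked_images(pdf_path, page_number):
    """Yield (xref, pixmap) for every masked image on one page.

    page_number is 0-based. Hypothetical helper, not part of PyMuPDF.
    """
    import fitz  # PyMuPDF; imported here so masked_entries() has no dependency

    doc = fitz.open(pdf_path)
    page = doc[page_number]
    for xref, mask_xref in masked_entries(page.get_images()):
        base = fitz.Pixmap(doc, xref)  # pixmap of the base image
        if base.alpha:  # the base must not have an alpha channel
            base = fitz.Pixmap(base, 0)
        mask = fitz.Pixmap(doc, mask_xref)  # assumed: grayscale, no alpha
        # copy the base and use the mask values as its alpha channel
        yield xref, fitz.Pixmap(base, mask)


# First tuple is the one quoted above; the second is invented (no mask).
sample = [
    (53, 90, 173, 173, 8, "DeviceRGB", "", "Image53", "DCTDecode"),
    (54, 0, 173, 173, 8, "DeviceRGB", "", "Image54", "DCTDecode"),
]
print(masked_entries(sample))  # [(53, 90)]
```

Each yielded pixmap could then be written out with something like `pix.save(f"image-{xref}.png")`; PNG keeps the alpha channel that the mask contributes.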

Answer selected by JorjMcKie