Undoing image transformations like flipping #3274

Irajan · 2024-03-18T07:51:18Z

Description of the bug

I tried to extract the images from a PDF document and run OCR on each extracted image to get its data. I used the Page.get_image_info() method, but the OCR output is not what I expected.

How to reproduce the bug

While debugging, I saved the detected images with the following code and found that they are flipped unexpectedly. I need help finding out whether something is wrong in my implementation or whether there is a bug in the image detection.

import io

import fitz  # PyMuPDF
from PIL import Image

doc = fitz.open("sample.pdf")

for page_index, page in enumerate(doc):
    # One dict per image shown on the page, including its xref.
    image_blocks = page.get_image_info(xrefs=True)
    count = 0

    for block in image_blocks:
        # Skip entries that have no xref.
        if not block["xref"]:
            continue

        # Extract the image as it is stored in the PDF.
        base_image_info = doc.extract_image(block["xref"])
        image_bytes = base_image_info["image"]

        # The "./images" directory must already exist.
        pil_image = Image.open(io.BytesIO(image_bytes))
        pil_image.save(f"./images/image__{page_index + 1}__{count}.png")
        count += 1

Sample pdf file is
sample.pdf

Extracted images
image__1__0

image__1__1

PyMuPDF version

1.23.26

Operating system

Linux

Python version

3.11

JorjMcKie · 2024-03-18T09:46:28Z

Maintainer

The PDF creator inserted the images with a transformation that flips them top-bottom.
This is not a PyMuPDF problem.
You can find out what happened when the image was inserted into the PDF page by looking at the transformation matrix used. For example, for the image at xref 7:

>>> import fitz
>>> from pprint import pprint
>>> doc = fitz.open("sample.pdf")
>>> page = doc[0]
>>> pprint(page.get_images())
[(7, 0, 1318, 1018, 8, 'DeviceGray', '', 'R183', 'FlateDecode'),
 (8, 0, 1318, 1018, 8, 'DeviceGray', '', 'R182', 'FlateDecode'),
 (9, 0, 1318, 1018, 8, 'DeviceGray', '', 'R181', 'FlateDecode'),
 (10, 0, 1318, 1018, 8, 'DeviceGray', '', 'R180', 'FlateDecode'),
 (11, 0, 1318, 1018, 8, 'DeviceGray', '', 'R179', 'FlateDecode'),
 (12, 0, 1318, 1018, 8, 'DeviceGray', '', 'R178', 'FlateDecode'),
 (13, 0, 1318, 1018, 8, 'DeviceGray', '', 'R177', 'FlateDecode'),
 (14, 0, 1318, 1018, 8, 'DeviceGray', '', 'R176', 'FlateDecode'),
 (15, 0, 1318, 1018, 8, 'DeviceGray', '', 'R175', 'FlateDecode'),
 (24, 0, 155, 73, 8, 'DeviceGray', '', 'R8', 'DCTDecode')]
>>> img7 = doc.extract_image(7)
>>> img7.keys()
dict_keys(['ext', 'smask', 'width', 'height', 'colorspace', 'bpc', 'xres', 'yres', 'cs-name', 'image'])
>>> page.get_image_rects(7, transform=True)
[(Rect(1431.9599609375, 334.1400146484375, 2043.8399658203125, 1126.1400146484375), Matrix(0.0, -792.0, -611.8800048828125, -0.0, 2043.8399658203125, 1126.1400146484375))]

As you can see, the transformation matrix has negative values for matrix.b and matrix.c and hence does a top-bottom flip.
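
A quick way to check a matrix like this one, sketched under the assumption that only its first four entries (a, b, c, d) decide orientation: a negative determinant a*d - b*c means a mirror is involved, and dominant b/c entries mean the x and y axes are swapped (a quarter turn). describe_matrix is a hypothetical helper for illustration, not part of PyMuPDF.

```python
def describe_matrix(a, b, c, d):
    """Summarize the orientation encoded in the 2x2 part of a matrix.

    Hypothetical helper: it only looks at signs and magnitudes and does
    not depend on PyMuPDF.
    """
    det = a * d - b * c
    if det == 0:
        raise ValueError("degenerate matrix: the image collapses to a line")
    return {
        # A negative determinant reverses orientation, i.e. a flip is included.
        "mirrored": det < 0,
        # When b and c dominate, x and y trade places (quarter turn).
        "axes_swapped": abs(b) + abs(c) > abs(a) + abs(d),
    }


# First four entries of the matrix reported for xref 7 above.
print(describe_matrix(0.0, -792.0, -611.8800048828125, -0.0))
```

For the matrix of xref 7 this reports both a mirror and an axis swap, that is, a flip combined with a quarter turn.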

JorjMcKie · 2024-03-18T09:47:45Z

Maintainer

The transformation matrix is also part of page.get_image_info().

Irajan · 2024-03-18T09:54:48Z

Author

Can't we prevent the top-bottom flip and extract the image as it originally is? @JorjMcKie

JorjMcKie · 2024-03-18T09:59:11Z

Maintainer

The image's original is flipped! The PDF creator has "un-flipped" it to properly display on the page.

Irajan · 2024-03-18T10:04:29Z

Author

Isn't there any way to undo the "un-flipped" action by the PDF creator? This would help me restore the original content of the document.

JorjMcKie · 2024-03-18T10:11:42Z

Maintainer

Isn't there any way to undo the "un-flipped" action by the PDF creator? This would help me restore the original content of the document.

What you actually mean is to do the same thing as the creator: un-flip the image, right?
Take the image and use e.g. Pillow for unflipping using Image.Transpose.FLIP_TOP_BOTTOM.

Irajan · 2024-03-18T10:38:29Z

Author

I've tried this as well, but the problem I ran into is the combined case of a page flip together with an image flip.
For example:
If the transformation matrix of the page is
(-612.0, -0.0, -0.0, 792.0, 899.3999633789062, 203.4000244140625)
and the transformation matrix of the image is
(0.0, -792.0, -611.8800048828125, -0.0, 2043.8399658203125, 1126.1400146484375)

this is the case where the page is not flipped and the image is flipped.
This works perfectly!

But with the transformation matrix
(-612.0, -0.0, -0.0, -792.0, 899.3999633789062, 203.4000244140625) for the page
and the same matrix as above for the image,

both the page and the image are flipped, and the un-flipping technique above doesn't work.
Any suggestions on this?

JorjMcKie · 2024-03-18T10:40:22Z

Maintainer

Let's convert this to a discussion first ...

JorjMcKie · 2024-03-18T10:43:32Z

Maintainer

Here is a little helper for interpreting an image transformation matrix:
matrix_property.zip

The function returns appropriate Pillow image transformation actions. When multiple actions are required, use multiple Pillow actions.
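
The attached helper is not reproduced here. As a rough stand-in, the sketch below maps the first four matrix entries to the name of a single Pillow Image.Transpose member. It assumes the matrix maps the unit square onto the page with the image's top-left corner at the origin and y growing downward, and it only handles axis-aligned matrices. pillow_op_for_matrix is a hypothetical name, and the rotation directions should be checked against a known image before relying on them.

```python
def pillow_op_for_matrix(a, b, c, d, tol=1e-6):
    """Return the Image.Transpose member name that turns the stored image
    into its appearance on the page, or None if nothing is needed.

    Hypothetical stand-in, NOT the helper attached in the thread.
    Assumed convention: (u, v) -> (u*a + v*c, u*b + v*d), y pointing down.
    """
    scale = max(abs(a), abs(b), abs(c), abs(d))
    if scale == 0:
        raise ValueError("degenerate matrix")

    def is_zero(x):
        return abs(x) <= tol * scale

    # No axis swap: only the signs of a and d matter.
    if is_zero(b) and is_zero(c) and not is_zero(a) and not is_zero(d):
        return {
            (True, True): None,
            (False, True): "FLIP_LEFT_RIGHT",
            (True, False): "FLIP_TOP_BOTTOM",
            (False, False): "ROTATE_180",
        }[(a > 0, d > 0)]

    # Axes swapped: only the signs of b and c matter.
    if is_zero(a) and is_zero(d) and not is_zero(b) and not is_zero(c):
        return {
            (True, True): "TRANSPOSE",
            (False, False): "TRANSVERSE",
            (True, False): "ROTATE_270",  # quarter turn clockwise in Pillow terms
            (False, True): "ROTATE_90",   # quarter turn counter-clockwise
        }[(b > 0, c > 0)]

    raise ValueError("matrix is not axis-aligned; no single Pillow action fits")


# Applying the result with Pillow (requires Pillow 9.1+ for Image.Transpose):
#   name = pillow_op_for_matrix(*info["transform"][:4])
#   if name is not None:
#       pil_image = pil_image.transpose(getattr(Image.Transpose, name))
```

Returning a name instead of the enum member keeps the function free of a Pillow import, so the matrix logic can be checked on its own.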

5 replies

Irajan Mar 18, 2024
Author

The function above tells me how the image is transformed, but I'm having trouble dealing with the combined effect of the page transformation and the image transformation.

JorjMcKie Mar 18, 2024
Maintainer

What does that mean?

Irajan Mar 18, 2024
Author

for page_index, page in enumerate(doc):
    image_blocks = page.get_image_info(xrefs=True)
    for block in image_blocks:
        if not block["xref"]:
            continue

        base_image_info = doc.extract_image(block["xref"])
        image_bytes = base_image_info["image"]
        (a, b, c, d, e, f) = block["transform"]
        (pa, pb, pc, pd, pe, pf) = page.transformation_matrix
        print(page.transformation_matrix)
        print(block["transform"])

This gives me the following output:
(-612.0, -0.0, -0.0, -792.0, 899.3999633789062, 203.4000244140625)
(0.0, -792.0, -611.8800048828125, -0.0, 2043.8399658203125, 1126.1400146484375)

and your suggested approach

for page_index, page in enumerate(doc):
    image_blocks = page.get_image_info(xrefs=True)
    for block in image_blocks:
        if not block["xref"]:
            continue

        base_image_info = doc.extract_image(block["xref"])
        image_bytes = base_image_info["image"]
        (a, b, c, d, e, f) = block["transform"]
        (pa, pb, pc, pd, pe, pf) = page.transformation_matrix
        print(page.transformation_matrix)
        print(block["transform"])

        # Open the extracted bytes with Pillow; the info dict itself cannot be flipped.
        pil_image = Image.open(io.BytesIO(image_bytes))
        if d < 0:
            pil_image = pil_image.transpose(Image.Transpose.FLIP_TOP_BOTTOM)

doesn't work, since the page is also flipped.

JorjMcKie Mar 18, 2024
Maintainer

You obviously have to take page rotation into account, because the transformation matrix (please read the documentation!) takes into account everything required to place the image on the page: the image's native transformation and the page rotation.
You have several options, but the simplest thing is probably to remove the page rotation before you extract the image. Use this function for de-rotating.
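
The link behind "this function" did not survive in this text. A minimal sketch, assuming it refers to PyMuPDF's Page.remove_rotation(): reset each page's rotation first, then read the image matrices, so that only the image's own transformation remains. transforms_after_derotation is a hypothetical helper; it takes an already opened document, so the snippet itself needs no import.

```python
def transforms_after_derotation(doc):
    """Collect (page number, xref, transform) once page rotation is removed.

    Hypothetical helper. `doc` is expected to behave like an opened
    PyMuPDF document (iterable over pages).
    """
    results = []
    for page in doc:
        if page.rotation:
            # Assumed to be the de-rotating function meant in the reply above.
            page.remove_rotation()
        for info in page.get_image_info(xrefs=True):
            if not info["xref"]:
                continue  # nothing to extract without an xref
            results.append((page.number, info["xref"], tuple(info["transform"])))
    return results


# Usage (requires PyMuPDF):
#   import fitz
#   doc = fitz.open("sample.pdf")
#   for pno, xref, matrix in transforms_after_derotation(doc):
#       print(pno, xref, matrix)
```

The change only affects the document held in memory; the file on disk stays as it is unless the document is saved.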

Answer selected by Irajan

Irajan Mar 18, 2024
Author

Thanks for the suggestion. I guess I found the right option.
