Advised way to extract text from Text blocks and Image blocks and combine them #2465

JawaClass · 2023-06-12T09:09:03Z


Hi,

I have problems understanding the transformation matrix. For people who are not good at math it is very hard, and I couldn't find an example in the documentation showing how to apply the transform value to the image to retrieve the rendered image.

Basically, I have an image in my PDF that, when I convert the byte string to a PIL image, is shown flipped horizontally and rotated 180° (I read that this can happen due to the way the PDF creator software converts images, because there is no way the user created a flipped image).

It has the following transform matrix:
(245.6999969482422, 0.0, -0.0, -206.60000610351562, 136.7906036376953, 378.3905944824219)

After applying the following, it is shown correctly, the way it is presented on the PDF page.

import io
import numpy as np
from PIL import Image

pil_image = Image.open(io.BytesIO(block['image']))
numpy_data = np.array(pil_image)  # writable copy of the pixel data
numpy_data = np.rot90(numpy_data, k=2)  # rotate by 180°
numpy_data = np.fliplr(numpy_data)  # mirror left-right; net effect is an up-down flip
img = Image.fromarray(numpy_data, 'RGB')
img.show()

I tried to transform the image with PIL affine transformation but transformed_image.show() ends up showing a black image with a bit of white on the top:

matrix = np.array(block['transform'])
# Apply the transformation matrix
transformed_image = pil_image.transform(pil_image.size, Image.AFFINE, matrix)
# Show the transformed image
transformed_image.show()

Help would be greatly appreciated.

JorjMcKie · 2023-06-12T09:57:36Z

Maintainer

Your matrix indeed performs an up-down flip (plus different scalings in x- / y-direction).
If I understand you correctly, you are looking for a way to transform an image that neutralizes the transformation applied to an image on its way to a document page.

Once you have your PIL image, you therefore need information on which of the many available Pillow transformations to use, right?

This works via the .transpose(trans) method of PIL.Image. The parameter of this method selects the right Pillow transformation.

Here is a little utility function which should help you with this.
How to use:

In [1]: import fitz
In [2]: from matrix_property import matprop, piltrans
In [3]: mat = fitz.Matrix(245.6999969482422, 0.0, -0.0, -206.60000610351562, 136.7906036376953, 378.3905944824219)
In [4]: matprop(mat)  # just to show that it detects the transformation
Out[4]: (2, 'up-down')
In [5]: piltrans(mat)  # determines the Pillow transformation method
Out[5]: <Transpose.FLIP_TOP_BOTTOM: 1>
In [6]: # if pil_image is your Pillow image, do this to make a transformed copy:
In [7]: trans = piltrans(mat)
In [8]: pil_transformed = pil_image.transpose(trans)

matrix_property.zip

Please don't hesitate to come back with more questions.

2 replies

JawaClass Jun 12, 2023
Author

Yes, I want to neutralize any transformations applied to the image, and I got it working with the help of the method in matrix_property.py.
However, the piltrans function seems to be missing from the zip file, so I used this mapping from the int code to the PIL transpose constants.

mat_prop2pil_transpose = {
    0: [],
    1: [Image.Transpose.FLIP_LEFT_RIGHT],
    2: [Image.Transpose.FLIP_TOP_BOTTOM],
    3: [Image.Transpose.ROTATE_90],
    4: [Image.Transpose.ROTATE_180],
    5: [Image.Transpose.ROTATE_270],
    6: [Image.Transpose.FLIP_TOP_BOTTOM, Image.Transpose.ROTATE_90],
    7: [Image.Transpose.ROTATE_90, Image.Transpose.FLIP_TOP_BOTTOM]
}

and then apply these in a loop.
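Because piltrans was not in the archive, here is a minimal sketch of how the axis-aligned cases could be detected from the signs of the matrix entries and then applied in a loop. The names `axis_aligned_ops` and `apply_ops` are hypothetical and are not part of matrix_property.py or PyMuPDF. Rotations by 90° and 270° are deliberately left out, because their direction depends on the coordinate convention in use.

```python
def axis_aligned_ops(matrix, eps=1e-6):
    """Return names of Pillow transpose operations for an axis-aligned matrix.

    matrix is the 6-tuple (a, b, c, d, e, f). Returns None when the matrix
    contains rotation or shear (b or c non-zero) or is degenerate.
    """
    a, b, c, d = matrix[:4]
    if abs(b) > eps or abs(c) > eps:
        return None  # rotation or shear present: not handled by this sketch
    if a > 0 and d > 0:
        return []  # scaling only, nothing to flip
    if a < 0 and d > 0:
        return ["FLIP_LEFT_RIGHT"]
    if a > 0 and d < 0:
        return ["FLIP_TOP_BOTTOM"]
    if a < 0 and d < 0:
        return ["ROTATE_180"]
    return None  # a or d is zero: degenerate matrix


def apply_ops(image, names, transpose_enum):
    """Apply the named operations in order.

    transpose_enum is assumed to be PIL.Image.Transpose; image is any object
    with a Pillow-style .transpose(op) method.
    """
    for name in names:
        image = image.transpose(getattr(transpose_enum, name))
    return image


# The matrix from the original question
ops = axis_aligned_ops(
    (245.6999969482422, 0.0, -0.0, -206.60000610351562,
     136.7906036376953, 378.3905944824219)
)
print(ops)  # ['FLIP_TOP_BOTTOM']
```

With Pillow installed, `apply_ops(pil_image, ops, Image.Transpose)` would then return the flipped copy.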

I still have two questions, actually.

1. While this is working, we only take into account transposition (flipping and rotating) but not transformation (translating, scaling and shearing), right? Scaling is simply math, it seems, but what about the other two? I thought that's what the CTM is for: all information in a single 3x3 matrix that you then simply apply to the image... somehow (I got my info from here: https://forum.patagames.com/posts/t501-What-Is-Transformation-Matrix-and-How-to-Use-It).
2. Also, the images returned from get_text("rawdict", flags=flags) don't contain the information whether they are or have an image mask. Are the image masks already applied to them, or how would I get this information? They also don't return an xref number, afaik.

Thanks for the detailed answers btw!

JorjMcKie Jun 12, 2023
Maintainer

At 2.
Transparency / soft mask information is not (and will never be) available via page.get_text() method variants. This is because they work for all supported document types - not just PDF.
As a fallback, you can use page.get_image_info(xrefs=True). This method automatically determines the xref if an image has one at all (not always the case!) - otherwise the "xref" key in the returned dictionary has value 0.
Use this xref to select the right item from page.get_images() and see if it has a positive smask value (the xref of the mask image).
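The lookup described above could be sketched as follows. The function name `mask_xrefs_for_page` is hypothetical, and `_StubPage` with its values is invented purely so the example runs without a PDF file. The sketch assumes that each item of page.get_images() starts with (xref, smask, ...) and that page.get_image_info(xrefs=True) returns dictionaries with an "xref" key.

```python
def mask_xrefs_for_page(page):
    """Map each image xref shown on the page to its mask xref (0 if none)."""
    # Each get_images() item is assumed to start with (xref, smask, ...)
    smask_by_xref = {item[0]: item[1] for item in page.get_images()}
    result = {}
    for info in page.get_image_info(xrefs=True):
        xref = info["xref"]
        if xref == 0:
            continue  # this image has no xref, so no mask lookup is possible
        result[xref] = smask_by_xref.get(xref, 0)
    return result


class _StubPage:
    """Stand-in for a PyMuPDF page; all values below are made up."""

    def get_images(self):
        # (xref, smask, width, height, bpc, colorspace, alt_colorspace, name, filter)
        return [
            (7, 8, 100, 50, 8, "DeviceRGB", "", "Im0", "DCTDecode"),
            (9, 0, 20, 20, 8, "DeviceGray", "", "Im1", "FlateDecode"),
        ]

    def get_image_info(self, xrefs=False):
        return [{"xref": 7}, {"xref": 9}, {"xref": 0}]


masks = mask_xrefs_for_page(_StubPage())
print(masks)  # {7: 8, 9: 0}
```

With a real document, the page object from fitz.open(...) would be passed in place of the stub. Here, image 7 has a mask stored at xref 8, while image 9 has none.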

At 1. The utility function indeed only detects matrices that are rotations by multiples of 90°, optionally combined with flips. I was just too lazy to do all the arcus (inverse trigonometric) computations for other angles. The same is true for shearing. Scaling is easily detectable by looking at matrix.a / matrix.d. Translation is present in matrix.e and matrix.f.

For a complete picture, you should also look at the image's EXIF info: certain images carry transformations there too, which you might want to take into account in addition to the transformation matrix.
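As a rough illustration of the computations mentioned above, the following sketch reads scaling, rotation angle, reflection, shear and translation from a 6-tuple (a, b, c, d, e, f). It is not part of the utility function. The name `describe_matrix` is hypothetical, and the sketch assumes the convention that a point (x, y) is mapped to (a*x + c*y + e, b*x + d*y + f), so the x-axis is sent to (a, b) and the y-axis to (c, d).

```python
import math


def describe_matrix(m, eps=1e-9):
    """Summarize the geometric content of a matrix (a, b, c, d, e, f)."""
    a, b, c, d, e, f = m
    scale_x = math.hypot(a, b)  # length of the transformed x-axis
    scale_y = math.hypot(c, d)  # length of the transformed y-axis
    det = a * d - b * c         # negative determinant means a flip is involved
    # The axes stay perpendicular unless shearing is present
    if scale_x > eps and scale_y > eps:
        cos_between_axes = (a * c + b * d) / (scale_x * scale_y)
    else:
        cos_between_axes = 0.0
    return {
        "scale_x": scale_x,
        "scale_y": scale_y,
        "rotation_deg": math.degrees(math.atan2(b, a)),  # angle of the x-axis
        "reflected": det < 0,
        "sheared": abs(cos_between_axes) > 1e-6,
        "translation": (e, f),
    }


# The matrix from the original question
info = describe_matrix(
    (245.6999969482422, 0.0, -0.0, -206.60000610351562,
     136.7906036376953, 378.3905944824219)
)
print(info["rotation_deg"], info["reflected"], info["sheared"])  # 0.0 True False
```

For this matrix the result is a pure scaling of about 245.7 by 206.6 with a reflection and no rotation or shear, which agrees with the 'up-down' classification reported earlier.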
