crop pdf page not getting the expected result #979

@zorzigio

I am trying to crop an area of a PDF page, but I cannot get the expected result using the transformation matrix.

The position of the area I am trying to extract is given relative to the bottom-left corner of the page.
The page is also rotated by 90 degrees.

In the code below, the first output page contains the area extracted using the transformation matrix, which does not work properly. The second output page contains the area extracted by manually deriving its position from the known page rotation, which works correctly.

import fitz

filename = './table test.pdf'
pno = 0
# table1: area to crop, given relative to the bottom-left corner of the page
x0 = 480
y0 = 470
w = 741
h = 823

x1 = x0 + w
y1 = y0 + h

src = fitz.open(filename)
spage = src[pno]
oldrot = spage.rotation
m0 = spage.transformation_matrix
spage.set_rotation(0)
doc = fitz.open()  # empty output PDF
r = spage.rect  # input page rectangle
d = fitz.Rect(
    spage.cropbox_position,  # CropBox displacement if not
    spage.cropbox_position  # starting at (0, 0)
)
# attempt 1: clip computed with the transformation matrix (gives the wrong area)
rect1 = fitz.Rect(y0, x0, y0+h, x0+w)
m1 = spage.transformation_matrix
rect2 = rect1*m1
# attempt 2: clip derived manually from the known page rotation (gives the correct area)
x0b = r.width - y1
x1b = x0b + h
y0b = r.height - x1
y1b = y0b + w
rect3 = fitz.Rect(x0b, y0b, x1b, y1b)
rects = [rect2, rect3]
for rect in rects:
    page = doc.new_page(
        -1,
        width=w,
        height=h,
    )
    page.show_pdf_page(
        page.rect,  # fill all new page with the image
        src,  # input document
        spage.number,  # input page number
        clip=rect,  # which part to use of input page
        rotate=-oldrot,
    )
doc.save(
    'test.pdf',
    garbage=3,
    deflate=True,
)

I would much prefer to use the transformation matrix, but I am not sure what I am doing wrong here.
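For reference, this is what I expect the transformation matrix to do on an unrotated page: a plain y-flip from bottom-left based coordinates to top-left based ones. The sketch below is pure Python and only illustrates that arithmetic. The helper name `pdf_to_top_left` and the page height are made up, and it assumes the CropBox starts at (0, 0); it is my understanding of the conversion, not a statement of how the library implements it.

```python
def pdf_to_top_left(rect, page_height):
    """Convert (x0, y0, x1, y1) measured from the bottom-left corner
    (y pointing up) into top-left based coordinates (y pointing down).

    Assumption: unrotated page whose CropBox starts at (0, 0).
    """
    x0, y0, x1, y1 = rect
    # Flipping y swaps the top and bottom edges, so the old y1 becomes
    # the new y0 and vice versa; this keeps y0 <= y1 in the result.
    return (x0, page_height - y1, x1, page_height - y0)


# Hypothetical page height, chosen only to illustrate the arithmetic.
# The rectangle is (x0, y0, x0 + w, y0 + h) from the script above.
print(pdf_to_top_left((480, 470, 1221, 1293), page_height=2000))
```

The x values are untouched and only the y values are mirrored, so a 90 degree page rotation (which swaps the roles of x and y) is not covered by this flip alone.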

Also, is there a method that handles the page rotation automatically, rather than having to rotate the page back and forth?
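To make that second question concrete, the manual workaround amounts to the helper below. It repeats the arithmetic used for `rect3` in the script, with no PyMuPDF dependency. The function name `manual_clip` and the page size in the example call are hypothetical, and the formulas are only the ones that worked for this page rotated by 90 degrees; I have not checked them for other rotations.

```python
def manual_clip(x0, y0, w, h, page_width, page_height):
    """Return (x0b, y0b, x1b, y1b), the clip in unrotated page coordinates.

    Same formulas as rect3 in the script above. page_width and page_height
    are the dimensions of the page after set_rotation(0).
    Assumption: the page was originally rotated by 90 degrees.
    """
    x1 = x0 + w
    y1 = y0 + h
    x0b = page_width - y1   # horizontal position comes from the y extent
    y0b = page_height - x1  # vertical position comes from the x extent
    # Width and height swap roles because of the 90 degree rotation.
    return (x0b, y0b, x0b + h, y0b + w)


# Hypothetical page size, only to show the call.
print(manual_clip(480, 470, 741, 823, page_width=2000, page_height=3000))
```

What I am looking for is a built-in way to get the same rectangle without hard-coding these formulas per rotation value.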

table test.pdf
