Description
I am trying to crop an area of a PDF page, but I cannot get the expected result using the transformation matrix.
The position of the area I am trying to extract is given relative to the bottom-left corner of the page.
The page is also rotated by 90 degrees.
In the code below, the first output page contains the area extracted using the transformation matrix, which does not work properly. The second output page is extracted by manually deriving the position of the area from the known page rotation, which extracts the area correctly.
import fitz
filename = './table test.pdf'
pno = 0
# table1
x0 = 480
y0 = 470
w = 741
h = 823
x1 = x0 + w
y1 = y0 + h
src = fitz.open(filename)
spage = src[pno]
oldrot = spage.rotation
m0 = spage.transformation_matrix
spage.set_rotation(0)
doc = fitz.open() # empty output PDF
r = spage.rect # input page rectangle
d = fitz.Rect(
    spage.cropbox_position,  # CropBox displacement if not
    spage.cropbox_position   # starting at (0, 0)
)
# using transformation matrix
rect1 = fitz.Rect(y0, x0, y0+h, x0+w)
m1 = spage.transformation_matrix
rect2 = rect1*m1
# knowing how the page is rotated
x0b = r.width - y1
x1b = x0b + h
y0b = r.height - x1
y1b = y0b + w
rect3 = fitz.Rect(x0b, y0b, x1b, y1b)
rects = [rect2, rect3]
for rect in rects:
    page = doc.new_page(
        -1,
        width=w,
        height=h,
    )
    page.show_pdf_page(
        page.rect,     # fill all new page with the image
        src,           # input document
        spage.number,  # input page number
        clip=rect,     # which part to use of input page
        rotate=-oldrot,
    )
doc.save(
    'test.pdf',
    garbage=3,
    deflate=True,
)
I would much prefer to use the transformation matrix, but I am not sure what I am doing wrong here.
Also, is there a method that deals with the page rotation automatically, rather than having to rotate the page back and forth?
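For reference, the manual derivation that gives the correct result can be restated as a small helper that does not depend on fitz. This is only a sketch of the arithmetic used for rect3 above: the function name `manual_clip` and the page size in the usage example are made up for illustration, and it assumes the same 90 degree case as my file, with `page_width` and `page_height` taken from the page rectangle after `set_rotation(0)`.

```python
def manual_clip(x0, y0, w, h, page_width, page_height):
    """Restate the rect3 arithmetic from the script above.

    (x0, y0) is the corner of the area and (w, h) its size, as given
    in the script. page_width and page_height are the dimensions of
    the page rectangle after set_rotation(0).
    Returns (x0b, y0b, x1b, y1b), the same values passed to fitz.Rect
    when building rect3.
    """
    x1 = x0 + w
    y1 = y0 + h
    x0b = page_width - y1
    y0b = page_height - x1
    x1b = x0b + h
    y1b = y0b + w
    return (x0b, y0b, x1b, y1b)


# Hypothetical page size, only to show the call; not taken from my file.
clip = manual_clip(480, 470, 741, 823, page_width=2000, page_height=3000)
print(clip)
```

Note that the resulting rectangle has width h and height w, that is, the area's dimensions are swapped relative to the values I started from.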