Commit 9973a77

Update matrix comments
1 parent 3b104fb commit 9973a77

1 file changed: +3 -1 lines changed

app/backend/prepdocslib/pdfparser.py

Lines changed: 3 additions & 1 deletion
@@ -216,7 +216,9 @@ def crop_image_from_pdf_page(doc: pymupdf.Document, page_number, bounding_box) -
     # Cropping the page. The rect requires the coordinates in the format (x0, y0, x1, y1).
     bbx = [x * 72 for x in bounding_box]
     rect = pymupdf.Rect(bbx)
-    # 72 is the DPI ? what? explain this from CU
+    # Bounding box is scaled to 72 dots per inch
+    # We assume the PDF has 300 DPI
+    # The matrix is used to convert between these 2 units
     pix = page.get_pixmap(matrix=pymupdf.Matrix(300 / 72, 300 / 72), clip=rect)

     img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
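
The unit conversion described by the new comments can be sketched without PyMuPDF. This is a minimal illustration, not the project's code: it assumes the incoming bounding box is expressed in inches (the diff does not state the unit explicitly), relies on the PDF convention of 72 points per inch, and uses hypothetical helper names `inches_to_points` and `rendered_size_px`. The pixel size is approximate, since the real renderer may round the clip rectangle differently.

```python
POINTS_PER_INCH = 72  # PDF user-space units (points) per inch
RENDER_DPI = 300      # raster resolution requested in the diff


def inches_to_points(bounding_box):
    """Scale an (x0, y0, x1, y1) box from inches (assumed) to PDF points."""
    return [coord * POINTS_PER_INCH for coord in bounding_box]


def rendered_size_px(rect_points, dpi=RENDER_DPI):
    """Approximate pixel size of a clip rect rendered with zoom = dpi / 72."""
    zoom = dpi / POINTS_PER_INCH  # the factor passed to Matrix(zoom, zoom)
    x0, y0, x1, y1 = rect_points
    return round((x1 - x0) * zoom), round((y1 - y0) * zoom)


# Hypothetical box: 2.5 in wide, 2 in tall.
box_in = [1.0, 2.0, 3.5, 4.0]
rect_pt = inches_to_points(box_in)
size = rendered_size_px(rect_pt)
print(rect_pt, size)
```

Under these assumptions, a 2.5 in by 2 in region becomes a 180 pt by 144 pt rectangle, which renders at roughly 750 by 600 pixels at 300 DPI. The factor `300 / 72` only changes the output resolution; the clip rectangle itself stays in points.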
