Get bounding box of an image using xref #906

@mohammadmjn

Is there a way to get an image's bbox from its xref, or any other method that is faster than page.get_image_bbox(image)? page.get_image_bbox(image) is too slow for my use case: a vector PDF that contains more than 500 small raster images. I need each image's bbox so that I can compare its size with the page size and decide whether the whole page is raster or vector. I also looked at the xref, but it gives the image's intrinsic size, and in most cases that width and height are larger than the size at which the image is displayed in the PDF (the width and height obtained from the bbox). Here is my current code, in which I want to replace page.get_image_bbox(image) with an xref-based solution or another faster alternative:

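import fitz  # PyMuPDF

def find_images_bbox(file_path):
    doc = fitz.open(file_path)
    page = doc[0]
    # full=True is needed so each entry can be passed to get_image_bbox
    image_list = doc.get_page_images(0, full=True)
    for i, image in enumerate(image_list):
        # one bbox lookup per image; this is the slow call
        image_bbox = page.get_image_bbox(image)
        print('image {} Bbox: {}'.format(i, image_bbox))
    doc.close()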
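
One direction that may be worth trying is a minimal sketch built on page.get_image_info(), which in reasonably recent PyMuPDF releases returns one dictionary per displayed image, including a "bbox" entry, so the bboxes are collected in a single pass instead of one lookup per image. Whether this is actually faster for a given file has not been measured here. The helper names coverage_ratio and page_is_raster, and the 0.95 threshold, are hypothetical choices made for illustration and do not come from the library.

```python
def coverage_ratio(bbox, page_rect):
    """Fraction of the page area covered by bbox.

    Both arguments are (x0, y0, x1, y1) sequences in page coordinates.
    """
    x0, y0, x1, y1 = bbox
    px0, py0, px1, py1 = page_rect
    # Clip the image box to the page so off-page parts do not count.
    width = max(0.0, min(x1, px1) - max(x0, px0))
    height = max(0.0, min(y1, py1) - max(y0, py0))
    page_area = (px1 - px0) * (py1 - py0)
    return (width * height) / page_area if page_area > 0 else 0.0


def page_is_raster(file_path, page_number=0, threshold=0.95):
    """Return True if any single image covers at least `threshold` of the page.

    Assumes a PyMuPDF version that provides Page.get_image_info().
    """
    import fitz  # PyMuPDF, imported here so coverage_ratio works without it

    doc = fitz.open(file_path)
    try:
        page = doc[page_number]
        page_rect = tuple(page.rect)
        # Each dictionary describes one displayed image and carries its bbox.
        infos = page.get_image_info()
        return any(
            coverage_ratio(info["bbox"], page_rect) >= threshold
            for info in infos
        )
    finally:
        doc.close()
```

Calling page_is_raster("some.pdf") would then report whether a single image covers nearly the whole first page. Keeping the size comparison in coverage_ratio means the threshold logic can be checked without opening a PDF.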
