Get bounding box of an image FAST #908
Is there any approach to get image bbox using xref or any other methods which are faster than
You don't seem to need the xref at all, do you? Or any detail on how the page appearance references the image? If I get you right, all you need are the bbox coordinates of raster images actually shown on the page.
If this is true, I recommend you use text extraction, although this may not seem obvious. There is a performance-oriented variant that delivers text blocks, in which every image is represented by a line of text with image metadata:
pprint([b for b in page.get_text("blocks") if b[-1] == 1])  # take only image blocks
[(344.25,
  88.93597412109375,
  540.0,
  175.18597412109375,
  '<image: DeviceRGB, width 261, height 115, bpc 8>',
  0,
  1)]
An image block is identified by a 1 as its last item. The first 4 items of each block are the block's bbox, in our case the bbox of the image.
In [8]: %timeit imgs=[b for b in page.get_text("blocks") if b[-1] == 1]
22.4 ms ± 245 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [9]: images = doc.get_page_images(1,full=True)
In [10]: %timeit imgs=[page.get_image_bbox(i) for i in images]
2.46 s ± 10.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So you have 22.4 milliseconds versus 2.46 seconds, which makes text extraction roughly 100 times faster here.
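For reuse, the filtering step above could be wrapped in small helpers. The following is only a sketch: the function names `image_bboxes`, `parse_image_info` and `page_image_bboxes` are hypothetical, and the metadata parser assumes the string looks exactly like the sample shown in this thread (other PyMuPDF versions may format it differently, in which case it returns None).

```python
import re

# Assumed layout of the metadata string, copied from the sample output above.
_IMAGE_INFO = re.compile(
    r"<image: (?P<colorspace>.+?), width (?P<width>\d+), "
    r"height (?P<height>\d+), bpc (?P<bpc>\d+)>"
)


def image_bboxes(blocks):
    """Return (x0, y0, x1, y1) for every image block.

    `blocks` is the list returned by page.get_text("blocks");
    an image block has 1 as its last item.
    """
    return [tuple(b[:4]) for b in blocks if b[-1] == 1]


def parse_image_info(text):
    """Parse the image metadata string of a block, or return None."""
    m = _IMAGE_INFO.fullmatch(text.strip())
    if m is None:
        return None  # format differs from the assumed one
    info = m.groupdict()
    return {
        "colorspace": info["colorspace"],
        "width": int(info["width"]),
        "height": int(info["height"]),
        "bpc": int(info["bpc"]),
    }


def page_image_bboxes(path, pno=0):
    """Open a document and return the image bboxes of page `pno`."""
    import fitz  # PyMuPDF, imported here so the helpers above work without it

    doc = fitz.open(path)
    try:
        return image_bboxes(doc[pno].get_text("blocks"))
    finally:
        doc.close()


# The image block from this thread plus a made-up text block for contrast.
sample_blocks = [
    (72.0, 72.0, 300.0, 90.0, "some text\n", 0, 0),  # hypothetical text block
    (344.25, 88.93597412109375, 540.0, 175.18597412109375,
     "<image: DeviceRGB, width 261, height 115, bpc 8>", 0, 1),
]
print(image_bboxes(sample_blocks))
print(parse_image_info(sample_blocks[1][4]))
```

Note that this gives you only the bbox and the basic image properties. If you also need the xref of each image, you still have to go through the image list of the page.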
The reason we have such an apparent functional overlap here is that text extraction works for all document types, not just PDFs. The …