Obtain the layer of each element in the PDF #2986
-
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
Describe alternatives you've considered
Additional context
-
It seems everything you want is already implemented.
-
This is a typical "Discussions" post. Let me move this accordingly.
-
This cannot be changed. You could delete the image of course and then re-insert it before everything else on the page.
-
That is what I was referring to: there is no way to extract every object type with one single method.
Yes: the method `Page.get_text("dict")` extracts both text and images when using the default flags. The extracted image and text blocks appear in the same order as in the page's `/Contents`.

The full sequence of the bounding boxes of everything on the page is given by the list `page.get_bboxlog()`. The items in this list look like `(obj-type, bbox)`. So you can take the bbox of an image or some text and then determine the index of the bboxlog entry that contains it.
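The lookup described above could be sketched roughly as follows. This is only an illustration, not an official recipe: the helper names (`rect_contains`, `find_in_bboxlog`, `block_draw_order`) and the tolerance value are assumptions made for this example. Text bboxes reported by `get_text` may not match the bboxlog rectangles exactly, so a small tolerance is applied, and for text it may work better to look up individual spans rather than whole blocks.

```python
def rect_contains(outer, inner, tol=1.0):
    """True if rectangle `outer` contains `inner`, allowing `tol` points of slack."""
    return (
        outer[0] - tol <= inner[0]
        and outer[1] - tol <= inner[1]
        and outer[2] + tol >= inner[2]
        and outer[3] + tol >= inner[3]
    )


def find_in_bboxlog(bboxlog, bbox, tol=1.0):
    """Return the indices of all bboxlog entries whose rectangle contains `bbox`.

    Each entry is expected to start with (obj-type, bbox), as described above.
    """
    return [i for i, item in enumerate(bboxlog) if rect_contains(item[1], bbox, tol)]


def block_draw_order(pdf_path, page_number=0):
    """For every text / image block on a page, list the matching bboxlog indices.

    Hypothetical helper for illustration; requires PyMuPDF to be installed.
    """
    import fitz  # PyMuPDF

    doc = fitz.open(pdf_path)
    page = doc[page_number]
    bboxlog = page.get_bboxlog()
    result = []
    for block in page.get_text("dict")["blocks"]:
        kind = "image" if block["type"] == 1 else "text"  # block type 1 is an image
        result.append((kind, block["bbox"], find_in_bboxlog(bboxlog, block["bbox"])))
    doc.close()
    return result
```

A lower index means the object is drawn earlier on the page. Because a large background object can also contain the bbox, the function returns all matching indices and leaves it to the caller to pick the entry with the expected object type.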