Is there a way to remove all images from a PDF file? #2283
-
Hello, I'm looking for a way to remove all images from a PDF file, along with the labels printed on these images.
-
Hm, removing images is possible in more than one way. Probably the easiest is to use redaction annotations: loop over the pages, detect the bounding boxes of all images on each page, add a redaction annotation for each bbox, and then apply the redactions. If "labels" refers to caption text shown underneath an image: this is not encoded as such in PDF. You must find your own way to locate this text and either add another redaction covering it, or enlarge the image bbox accordingly if you find a pattern for how these captions are positioned relative to the image.
-
By "labels" I mean text printed on the image itself. Thanks, @JorjMcKie, I'll use redaction annotations.
Hm, removing images is possible in more than one way.
What does "labels printed on these images" mean?
Probably the easiest is to use redaction annotations: loop over the pages, detect the bounding boxes of all images on each page, add a redaction annotation for each bbox, and then apply the redactions.
... not forgetting to use a suitable garbage collection option on save.
If "labels" refers to caption text shown underneath an image: this is not encoded as such in PDF. You must find your own way to locate this text and either add another redaction covering it, or enlarge the image bbox accordingly if you find a pattern for how these captions are positioned rela…
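As an illustration of enlarging an image bbox to take in a caption, here is a small sketch in plain Python. It assumes, purely for the example, that captions sit directly below the image within a fixed height; `extend_bbox_for_caption` and `caption_height` are hypothetical names, and a real document may need a different rule. Coordinates follow the PyMuPDF convention where y grows downward, so "below" means a larger `y1`.

```python
def extend_bbox_for_caption(bbox, caption_height, page_height):
    """Grow an (x0, y0, x1, y1) box downward to cover a caption strip.

    Hypothetical helper: assumes the caption lies directly under the image
    and is at most caption_height points tall.
    """
    x0, y0, x1, y1 = bbox
    # Clamp so the enlarged box never extends past the bottom of the page.
    return (x0, y0, x1, min(y1 + caption_height, page_height))
```

The enlarged box can then be used for the redaction annotation in place of the original image bbox.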