The objects you are referring to are not images but vector graphics, sometimes with overlaid text fragments.
Vector graphics cannot be extracted as such, at least not in the way you seem to be interested in.
You presumably want to store them as PNG / JPEG images.

PyMuPDF lets you find and group neighboring vector graphics on a page. The Page method for this is cluster_drawings().
It returns a list of rectangles, each covering one such graphic. You can then take a "photo" of the corresponding page area (at any desired resolution) and save it as an image. Here is a script that does this for the first two pages:

import pymupdf

doc = pymupdf.open("2501.10120v2.pdf")

Answer selected by jamesbraza
Converted from issue

This discussion was converted from issue #4583 on July 02, 2025 08:28.