Use a text extraction variant that returns individual lines together with their position information, such as page.get_text("dict").
Then make a pixmap of each line's bounding box and save it as a PNG:

import fitz  # PyMuPDF

# "page" is a fitz.Page of an already opened document, e.g. doc[0]
for block in page.get_text("dict", flags=fitz.TEXTFLAGS_TEXT)["blocks"]:
    for line in block["lines"]:
        bbox = line["bbox"]  # the line bbox
        text = " ".join([span["text"] for span in line["spans"]])  # text in line
        pix = page.get_pixmap(clip=bbox)  # pixmap of line bbox
        pix.save(...)  # supply an output filename ending in ".png"
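
For reference, the same loop can be wrapped in a small helper that names the output files itself. This is only a sketch under a few assumptions: the function name save_line_images, the line-NNNN.png naming scheme and the dpi default are made up for illustration, and the dpi argument of get_pixmap needs a reasonably recent PyMuPDF version. Instead of passing text flags, it skips image blocks by checking for the "lines" key, and it ignores lines that contain only whitespace.

```python
from pathlib import Path


def save_line_images(page, out_dir, dpi=150):
    """Save one PNG per text line of `page` and return a list of (path, text).

    `page` is expected to be a PyMuPDF page object. The file naming scheme
    and the dpi default are illustrative choices, not part of the library.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for block in page.get_text("dict")["blocks"]:
        # Image blocks have no "lines" key, so they contribute nothing here.
        for line in block.get("lines", []):
            text = " ".join(span["text"] for span in line["spans"])
            if not text.strip():
                continue  # skip whitespace-only lines
            path = out_dir / f"line-{len(results):04d}.png"
            # Render only the area inside the line's bounding box.
            pix = page.get_pixmap(clip=line["bbox"], dpi=dpi)
            pix.save(str(path))
            results.append((path, text))
    return results
```

With a document opened via fitz.open on some PDF (the file name is up to you), calling save_line_images(doc[0], "lines") would write the line images of the first page into a "lines" folder and return each file path together with the text of that line.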

Answer selected by JorjMcKie