How to extract the font properties of specific text? #1512
Can you let me know how to extract the font size and font name of only some part of the text in a PDF?
If you do `page.get_text("dict")["blocks"]`, then each text block (one with `block["type"] == 0`) is a dictionary containing a list of line dictionaries, each of which in turn contains a list of span dictionaries. This hierarchy of dictionaries can be looked up in the PyMuPDF documentation.
The span dictionaries contain the font name and size of the respective text portion, along with the rectangle containing that text.
So you can either select spans falling inside your area, or you can let PyMuPDF select only that part of the output intersecting your area: `page.get_text("dict", clip=area)...`.
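Both options can be sketched roughly as follows. This is only an illustration, not an official recipe: the helper names `spans_in_area` and `font_info` are made up for this example, `area` is assumed to be an `(x0, y0, x1, y1)` tuple in page coordinates, and "inside the area" is interpreted here as "span rectangle intersects the area".

```python
def spans_in_area(blocks, area):
    """Yield (text, font, size) for every span whose bbox intersects `area`.

    blocks: the list obtained from page.get_text("dict")["blocks"]
    area:   (x0, y0, x1, y1) tuple in page coordinates (assumed format)
    """
    ax0, ay0, ax1, ay1 = area
    for block in blocks:
        if block["type"] != 0:  # only text blocks carry lines and spans
            continue
        for line in block["lines"]:
            for span in line["spans"]:
                x0, y0, x1, y1 = span["bbox"]
                # plain rectangle intersection test
                if x0 < ax1 and x1 > ax0 and y0 < ay1 and y1 > ay0:
                    yield span["text"], span["font"], span["size"]


def font_info(path, page_number, area):
    """Hypothetical wrapper: let PyMuPDF clip the output to `area` first."""
    import fitz  # PyMuPDF, imported here so spans_in_area stays dependency-free

    doc = fitz.open(path)
    try:
        page = doc[page_number]
        blocks = page.get_text("dict", clip=fitz.Rect(area))["blocks"]
        return list(spans_in_area(blocks, area))
    finally:
        doc.close()
```

Calling `font_info("some.pdf", 0, (50, 50, 300, 120))` (file name and coordinates are placeholders) should return a list of `(text, font, size)` tuples for the text in that rectangle on the first page.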