-
Normally I thought
-
A span dictionary's
The vertical distance between lines lies completely in the hands of the document creator. Although the font properties
Taking the example font Helvetica, you will see the values:

    In [1]: import fitz
    In [2]: font = fitz.Font("helv")
    In [3]: font.ascender, font.descender
    Out[3]: (1.0750000476837158, -0.29899999499320984)
    In [4]: fontsize = 10
    In [6]: (font.ascender - font.descender)*fontsize
    Out[6]: 13.740000426769257

So the standard line, span and character height is 13.74 for fontsize 10 in this font. The MuPDF standard to compute a character's height is also
While PyMuPDF cannot influence the logic used internally by redactions, you can force almost all other text-related logic to use the fontsize as character / span / line height:
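The height arithmetic from the session above can be sketched in plain Python, without needing PyMuPDF installed. This is only an illustration: the function name `line_height` and the constant names are made up here, and the two Helvetica values are simply copied from the output quoted above.

```python
def line_height(ascender: float, descender: float, fontsize: float) -> float:
    """Natural glyph/line height: (ascender - descender) scaled by the font size."""
    # descender is negative, so subtracting it adds the depth below the baseline
    return (ascender - descender) * fontsize


# Helvetica values as quoted in this thread (relative to fontsize 1)
HELV_ASCENDER = 1.0750000476837158
HELV_DESCENDER = -0.29899999499320984

height = line_height(HELV_ASCENDER, HELV_DESCENDER, 10)
print(round(height, 2))  # 13.74
```

For any other font, the same formula applies with that font's own ascender and descender, so the ratio of height to fontsize is font-specific and generally not 1.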
-
Ah, got you. The value reported in
-
BTW, the bbox heights of both spans are the same:

    for b in page.get_text("dict", sort=True)["blocks"]:
        for l in b["lines"]:
            for s in l["spans"]:
                bbox = fitz.Rect(l["bbox"])  # note: this is the line's bbox
                print(bbox.height, s["size"], s["text"])

    13.31008529663086 9.962639808654785 Thus, I came to the conclusion that the designer of a ne
    13.31008529663086 9.892655372619629 Thus, I came to the conclusion that the designer of a new

If you use

    fitz.TOOLS.set_small_glyph_heights(True)
    True

you get this:

    for b in page.get_text("dict", sort=True)["blocks"]:
        for l in b["lines"]:
            for s in l["spans"]:
                bbox = fitz.Rect(l["bbox"])  # note: this is the line's bbox
                print(bbox.height, s["size"], s["text"])

    9.962638854980469 9.962639808654785 Thus, I came to the conclusion that the designer of a ne
    9.892658233642578 9.892655372619629 Thus, I came to the conclusion that the designer of a new
-
Your example is definitely an interesting one!

    In [16]: bbox0.height / ascdsc
    Out[16]: 9.962639173338859

Which is the fontsize of the

    In [20]: bbox0.height / ascdsc - s0["size"]
    Out[20]: -6.353159260896746e-07

But the second span gives us this - a quite notable deviation:

    In [21]: bbox1.height / ascdsc - s1["size"]
    Out[21]: 0.06998380071923016

From this you can calculate all you need, I suppose.
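The check performed in the session above can be written as a small helper. This is a hedged sketch: it assumes that `ascdsc` in the session stands for `font.ascender - font.descender` of the span's font (its definition did not survive in this thread), and the names `implied_fontsize` and `size_deviation` are invented for illustration. Since the font used in the example PDF is not shown, the usage lines fall back to the Helvetica numbers quoted earlier in the thread.

```python
def implied_fontsize(bbox_height: float, ascender: float, descender: float) -> float:
    """Font size implied by a bbox height, assuming height = (asc - desc) * size."""
    return bbox_height / (ascender - descender)


def size_deviation(bbox_height: float, ascender: float, descender: float,
                   reported_size: float) -> float:
    """Difference between the implied font size and the span's reported 'size'."""
    return implied_fontsize(bbox_height, ascender, descender) - reported_size


# Helvetica values quoted earlier in this thread; 13.74... is the height at fontsize 10
asc, desc = 1.0750000476837158, -0.29899999499320984
implied = implied_fontsize(13.740000426769257, asc, desc)
deviation = size_deviation(13.740000426769257, asc, desc, 10)
print(implied, deviation)
```

A deviation near zero means the bbox height matches the reported size for that font; a clearly non-zero value, like the roughly 0.07 seen for the second span, indicates that the two disagree.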