Commit 36e1e1c

ENH: _font: Init FontDescriptor from font resource
This patch adds a class method to initialise FontDescriptor from a font resource dictionary. For now, it returns either a FontDescriptor holding the font name and default values or, for one of the 14 core fonts, the associated font metrics.

For future development, this class can be extended with code that collects character widths, which is currently duplicated in multiple places, such as:
- pypdf/_text_extraction/_layout_mode/_font.py
- pypdf/_cmap.py

... and with code that collects font descriptor information, which is not yet present anywhere but would be useful for generating text streams.
1 parent 82c4f76 commit 36e1e1c

1 file changed

+10
-0
lines changed

pypdf/_font.py

Lines changed: 10 additions & 0 deletions
@@ -1,4 +1,5 @@
 from dataclasses import dataclass, field
+from typing import Any, Optional
 
 
 @dataclass(frozen=True)
@@ -24,3 +25,12 @@ class FontDescriptor:
     bbox: tuple[float, float, float, float] = field(default_factory=lambda: (-100.0, -200.0, 1000.0, 900.0))
 
     character_widths: dict[str, int] = field(default_factory=dict)
+
+    @classmethod
+    def from_font_res(cls, pdf_font_dict: dict[str, Any]) -> "Optional[FontDescriptor]":
+        from pypdf._codecs.core_fontmetrics import CORE_FONT_METRICS  # noqa: PLC0415
+        # Prioritize information from the PDF font dictionary
+        font_name = pdf_font_dict.get("/BaseFont", "Unknown")
+        if font_name[1:] in CORE_FONT_METRICS:
+            return CORE_FONT_METRICS.get(font_name[1:])
+        return cls(name=font_name)
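
To illustrate the behaviour described in the commit message, here is a minimal, self-contained sketch. It does not import pypdf: the FontDescriptor below is a stand-in that carries only the fields visible in the diff plus a `name` field (whose position and default are assumptions), and CORE_FONT_METRICS is a hypothetical one-entry table standing in for pypdf._codecs.core_fontmetrics.CORE_FONT_METRICS, with an illustrative width value.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class FontDescriptor:
    # Stand-in class: only the fields shown in the diff, plus an assumed `name`.
    name: str = "Unknown"
    bbox: tuple[float, float, float, float] = field(
        default_factory=lambda: (-100.0, -200.0, 1000.0, 900.0)
    )
    character_widths: dict[str, int] = field(default_factory=dict)

    @classmethod
    def from_font_res(cls, pdf_font_dict: dict[str, Any]) -> "Optional[FontDescriptor]":
        # Same logic as the patch, minus the deferred pypdf import.
        font_name = pdf_font_dict.get("/BaseFont", "Unknown")
        # PDF names start with "/", so strip it before the core-font lookup.
        if font_name[1:] in CORE_FONT_METRICS:
            return CORE_FONT_METRICS.get(font_name[1:])
        return cls(name=font_name)


# Hypothetical stand-in for the real core font metrics table.
CORE_FONT_METRICS: dict[str, FontDescriptor] = {
    "Helvetica": FontDescriptor(name="Helvetica", character_widths={"A": 667}),
}

core = FontDescriptor.from_font_res({"/BaseFont": "/Helvetica"})
custom = FontDescriptor.from_font_res({"/BaseFont": "/ABCDEF+MyFont"})
missing = FontDescriptor.from_font_res({})
```

Under these assumptions, `core` is the prebuilt entry from the table, `custom` is a descriptor with default values whose name is the raw `/BaseFont` value (leading slash included, as in the patch), and `missing` falls back to the name "Unknown".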
