@PJBrs PJBrs commented Sep 13, 2025

This patch implements text wrapping and alignment in appearance streams.

The scale_text method and the code for right-aligned and centered text were vibe-coded, but both work great.

The result offers a good basis for text wrapping. I did notice, however, that pdftk produces better results. In the future, it would be nice to read the border information from the annotation itself instead of just adding some padding here and there (which is the case now). There is also an annotation option called "comb" that is not taken into account, and annotation text colour is another open point. Finally, pdftk takes the font bounding box / ascent into account when deciding the scaled font size.

For now, however, this PR "finishes" PDF flattening in the sense that it correctly wraps long texts and aligns them as intended.

Related but not fixed here: #2153
I think this does fix the alignment part of #1919

codecov bot commented Sep 13, 2025

Codecov Report

❌ Patch coverage is 97.82609% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.12%. Comparing base (3b5c85f) to head (fcb45b3).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pypdf/generic/_appearance_stream.py | 97.67% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3465      +/-   ##
==========================================
+ Coverage   97.10%   97.12%   +0.01%     
==========================================
  Files          57       57              
  Lines        9711     9778      +67     
  Branches     1759     1773      +14     
==========================================
+ Hits         9430     9497      +67     
  Misses        168      168              
  Partials      113      113              

@PJBrs PJBrs marked this pull request as draft September 15, 2025 12:02

PJBrs commented Sep 26, 2025

This is a reworked version on top of #3466

Not for review right now.

PJBrs added 2 commits November 5, 2025 14:54
mypy complained that the .from_font_resource method's return
type is Optional[FontDescriptor]. Change the code to not
confuse mypy.

This adds a method to calculate the width of a text
string. This method can later be used to wrap text
at a certain length.

Code blatantly copied from the _font.py file in the
text extractor code.
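
As a rough illustration of what such a width calculation involves (a sketch under assumptions, not the code from this PR): glyph widths in font metrics are normally given in thousandths of a text-space unit, so the width of a string is the sum of its glyph widths scaled by the font size. The function name `text_width`, the fallback width of 500, and the small width table below are hypothetical and only serve the example.

```python
# Illustrative glyph widths in 1/1000 text-space units (assumed values,
# not read from any pypdf data structure).
SAMPLE_WIDTHS = {" ": 278, "H": 722, "e": 556, "l": 222, "o": 556}


def text_width(text: str, char_widths: dict[str, int], font_size: float,
               default_width: int = 500) -> float:
    """Return the width of `text` in points at `font_size`.

    Characters missing from `char_widths` fall back to `default_width`,
    which is a hypothetical choice for this sketch.
    """
    total_units = sum(char_widths.get(ch, default_width) for ch in text)
    return total_units * font_size / 1000


width = text_width("Hello", SAMPLE_WIDTHS, 12)
```

With a function of this shape, wrapping reduces to asking whether a candidate line is still narrower than the field.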
@PJBrs PJBrs marked this pull request as ready for review November 5, 2025 14:35

PJBrs commented Nov 5, 2025

@stefan6419846 First, thanks very much for merging the refactoring of the appearance stream code from _writer.py into generic/_appearance_stream.py!

With that in place, it should now be easier to review this PR, which adds text wrapping, scaling and alignment for text appearance streams.

PJBrs added 8 commits November 5, 2025 17:46
This patch adds a method to scale and wrap text,
depending on whether or not text is allowed to be
wrapped.

It takes a couple of arguments, including the text
string itself, field width and height, font size,
a FontDescriptor with character widths, and a bool
specifying whether or not text is allowed to wrap.

Returns the text in the form of a list of tuples,
each tuple containing the length of a line and its
contents, together with the font size for these
lines and lengths.
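
The following is a minimal sketch of how a scale-and-wrap helper with that kind of signature and return value could look. It is not the method added by this PR: the names `scale_and_wrap`, `wrap_lines` and `measure`, the plain dict standing in for the FontDescriptor's character widths, the greedy word wrapping, and the constants `LEADING`, `MIN_SIZE` and `STEP` are all assumptions made for illustration.

```python
LEADING = 1.2   # assumed line height as a multiple of the font size
MIN_SIZE = 4.0  # assumed smallest font size the sketch will try
STEP = 0.5      # assumed shrink step in points


def measure(text: str, char_widths: dict[str, int], size: float) -> float:
    """Width of `text` in points; unknown characters count as 500 units."""
    return sum(char_widths.get(ch, 500) for ch in text) * size / 1000


def wrap_lines(text: str, max_width: float, char_widths: dict[str, int],
               size: float) -> list[tuple[float, str]]:
    """Greedy word wrap; returns (line width, line text) tuples."""
    lines = []
    for paragraph in text.split("\n"):
        current = ""
        for word in paragraph.split(" "):
            candidate = word if not current else f"{current} {word}"
            if current and measure(candidate, char_widths, size) > max_width:
                lines.append((measure(current, char_widths, size), current))
                current = word
            else:
                current = candidate
        lines.append((measure(current, char_widths, size), current))
    return lines


def scale_and_wrap(text: str, field_width: float, field_height: float,
                   font_size: float, char_widths: dict[str, int],
                   allow_wrap: bool) -> tuple[list[tuple[float, str]], float]:
    """Shrink the font until the (optionally wrapped) text fits the field."""
    size = font_size
    while True:
        if allow_wrap:
            lines = wrap_lines(text, field_width, char_widths, size)
        else:
            lines = [(measure(text, char_widths, size), text)]
        fits_width = all(width <= field_width for width, _ in lines)
        fits_height = len(lines) * size * LEADING <= field_height
        if (fits_width and fits_height) or size <= MIN_SIZE:
            return lines, size
        size = max(MIN_SIZE, size - STEP)
```

Returning each line's width alongside its text means a later alignment step does not have to measure the lines again.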

This patch scales and/or wraps text that does not
fit into a text field unaltered, under the condition
that font size was set to 0 in the default
appearance stream.

We only wrap text if the multiline bit was set in
the corresponding annotation's field flags; otherwise
we just scale the font until it fits.
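
The decision described above can be sketched as a small helper. The function name `layout_mode` and its string return values are hypothetical; the multiline flag is bit position 13 of the field flags (/Ff) in the PDF specification, which corresponds to the integer value 4096.

```python
MULTILINE = 1 << 12  # bit position 13 of /Ff, i.e. 4096


def layout_mode(field_flags: int, da_font_size: float) -> str:
    """Decide how text is laid out (illustrative helper, not pypdf API).

    - "fixed": the default appearance names an explicit font size, so the
      text is left unaltered.
    - "wrap":  font size 0 and the multiline bit is set.
    - "scale": font size 0 and single-line, so only the font is shrunk.
    """
    if da_font_size != 0:
        return "fixed"
    if field_flags & MULTILINE:
        return "wrap"
    return "scale"
```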

We move the escaping of parentheses below, so that it
does not interfere with calculating the width of a
text string.
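
A short sketch of why the order matters: escaping adds backslashes that are not drawn, so measuring an already escaped string would overestimate its width. The helper name `escape_pdf_literal` is hypothetical, and escaping the backslash itself is an addition of this sketch that follows the PDF literal string rules rather than the commit message.

```python
def escape_pdf_literal(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


raw = "f(x)"
escaped = escape_pdf_literal(raw)

# Measure `raw` (4 drawn characters), then write `escaped` (6 bytes) to
# the content stream. Measuring `escaped` would count two extra glyphs.
drawn_chars = len(raw)
stream_chars = len(escaped)
```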

Make sure that Helvetica is always available as a font
resource, because for Helvetica we have all the font
metrics that text wrapping needs.

This patch changes the TextStreamAppearance code
so that it can deal with right-aligned and centered
text.

Note that both require correct font metrics in order
to work.
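
To show why the metrics are needed, here is a minimal sketch of the horizontal offset calculation, assuming the measured line width is already known. The function name `line_x_offset` and the `padding` default are hypothetical; the quadding values (0 = left, 1 = centered, 2 = right) come from the /Q entry in the PDF specification.

```python
def line_x_offset(line_width: float, field_width: float, quadding: int,
                  padding: float = 2.0) -> float:
    """Return the x position at which a line should start.

    `padding` is an assumed inset from the field border, not a value
    taken from the annotation.
    """
    if quadding == 1:  # centered
        return (field_width - line_width) / 2
    if quadding == 2:  # right-aligned
        return field_width - padding - line_width
    return padding     # left-aligned (default)
```

If the line width is wrong because the font metrics are missing or inaccurate, the centered and right-aligned offsets are wrong by the same amount, whereas left alignment is unaffected.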

We need the info that is in CORE_FONT_METRICS, and that is the same
information as in _default_fonts_space_width anyway. So this patch
removes a bit of redundancy.
Add tests for the TextStreamAppearance.