[perf] Make ``PrettyPrinter`` format lazily so output can be budget-capped by Pierre-Sassoulas · Pull Request #14588 · pytest-dev/pytest

Pierre-Sassoulas · 2026-06-13T16:24:01Z

Refactor required prior to #14523.

_format and the per-type helpers now yield their output as a stream of string chunks instead of writing to a file-like object, and pformat joins them. On top of that, pformat_lines pulls from the formatter only until a budget is reached:

pformat_lines(obj, max_lines=None, max_chars=None)

It stops on the first chunk that reaches either budget, so a huge collection costs O(budget) rather than O(N). Either dimension may be None (unbounded); with both None the whole object is formatted.
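As a rough illustration of the idea (not the code in this PR), here is a minimal sketch of a chunk-yielding formatter with a budgeted consumer. It handles only lists, and `format_chunks` and the layout it produces are assumptions made for the example; only the `pformat` / `pformat_lines` names and the two budget parameters come from the description above.

```python
from __future__ import annotations

from collections.abc import Iterator
from typing import Any


def format_chunks(obj: Any, indent: int = 0) -> Iterator[str]:
    """Yield the formatted form of ``obj`` as a stream of string chunks.

    Hypothetical helper: only lists are laid out, everything else is repr().
    """
    if isinstance(obj, list) and obj:
        yield "["
        for item in obj:
            yield "\n" + " " * (indent + 4)
            yield from format_chunks(item, indent + 4)
            yield ","
        yield "\n" + " " * indent + "]"
    else:
        yield repr(obj)


def pformat(obj: Any) -> str:
    """Unbounded formatting: pull every chunk and join them."""
    return "".join(format_chunks(obj))


def pformat_lines(
    obj: Any, *, max_lines: int | None = None, max_chars: int | None = None
) -> list[str]:
    """Pull chunks until one of them reaches a budget, then stop."""
    chunks: list[str] = []
    lines = 1
    chars = 0
    for chunk in format_chunks(obj):
        chunks.append(chunk)
        chars += len(chunk)
        lines += chunk.count("\n")
        if max_lines is not None and lines >= max_lines:
            break
        if max_chars is not None and chars >= max_chars:
            break
    return "".join(chunks).splitlines()
```

Because the generator is only advanced inside the loop, breaking out of it leaves the rest of the collection unformatted, which is where the O(budget) cost comes from.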

Benchmark (PrettyPrinter alone, width 80)::

list(range(500_000)):
    pformat().splitlines()        ~805 ms
    pformat_lines(max_lines=11)   ~0.027 ms      (~30000x)

[8 small ints] (common small diff):
    pformat().splitlines()        ~0.0133 ms
    pformat_lines(max_lines=11)   ~0.0163 ms (+3µs)

["x"*100_000] * 3 (flat, few huge elements):
    pformat_lines(max_chars=640)  stops after ~100_000 chars
                                  (one element) instead of 300_000

bluetech

Thanks, seems like a nice optimization to me, and I'd say makes the code nicer as well.

bluetech · 2026-06-14T08:07:18Z

+    ) -> list[str]:
+        """Pretty-print ``object`` and return its lines.
+
+        ``_format`` yields the output as a stream of chunks, so this can


Probably a "public" method shouldn't reference a private method in its docstring. I would describe the behavior rather than the implementation here.

bluetech · 2026-06-14T08:08:58Z

+        unbounded. With both ``None`` the whole object is formatted. The
+        budget is a stopping condition, not a precise cut: formatting
+        stops on the first chunk that reaches it, so the result may
+        slightly overshoot (the caller truncates to the exact limit).


Would remove the detail of what the caller does, or replace "truncates" with "can truncate" or "should truncate".
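For context, a caller that wants an exact cut could trim the slightly over-budget result itself. The sketch below is one possible way to do that; `truncate_to_budget` is a hypothetical helper, not something in this PR, and it counts characters without line terminators.

```python
from __future__ import annotations


def truncate_to_budget(
    lines: list[str], *, max_lines: int, max_chars: int
) -> list[str]:
    """Trim ``lines`` so that neither the line nor the char limit is exceeded."""
    kept: list[str] = []
    remaining = max_chars
    for line in lines[:max_lines]:
        if remaining <= 0:
            break
        # Cut the line that crosses the char budget, keep earlier ones whole.
        kept.append(line[:remaining])
        remaining -= len(kept[-1])
    return kept
```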

bluetech · 2026-06-14T08:09:50Z

+
+    def pformat_lines(
+        self,
+        object: Any,


I suggest making max_lines and max_chars keyword-only; otherwise they can easily be confused.

Yes, definitely.
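To make the risk concrete, here is a small sketch with two hypothetical stand-in functions (they only echo their budgets back, they are not the method under review). With positional parameters a lone `640` is silently taken as a line budget; with a bare `*` in the signature the same call fails loudly.

```python
from __future__ import annotations

from typing import Any


def budget_positional(
    obj: Any, max_lines: int | None = None, max_chars: int | None = None
) -> tuple[int | None, int | None]:
    """Budgets may be passed positionally, so their order matters."""
    return (max_lines, max_chars)


def budget_kwonly(
    obj: Any, *, max_lines: int | None = None, max_chars: int | None = None
) -> tuple[int | None, int | None]:
    """The bare ``*`` forces both budgets to be named at the call site."""
    return (max_lines, max_chars)
```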

bluetech · 2026-06-14T08:12:51Z

-        self._format_items(object, stream, indent, allowance, context, level)
-        stream.write(endchar)
+        try:
+            object = sorted(object)


Is this an unrelated optimization, or is it somehow related to the iterator change?

If it's an optimization, a comment would be helpful, something like: "Try direct sort first, faster than the fallback."

Yes, sorry, I didn't realize this was in this commit. It is indeed an optimization (a rather big one, IMHO). Because it uses the C sort directly it is a lot faster, and I suppose we rarely have heterogeneous data structures (for those it takes a little longer, of course). Here's a micro-benchmark:

    """Interleaved benchmark of the set-sort fast path in ``_pprint_set``.

    Compares ``sorted(s)`` against ``sorted(s, key=_safe_key)`` (the prior
    behaviour), and the heterogeneous fallback ``try sorted / except retry``
    against always using ``_safe_key``.

    Interleaving: A and B are timed back-to-back every iteration, so slow
    drift (CPU frequency scaling, thermal, GC) hits both equally and cancels
    in the per-iteration difference. We report the median per-call time of
    each and the median of the *paired* differences (B - A), which is the
    robust estimate of the real gap.
    """

    import statistics
    import time

    from _pytest._io.pprint import _safe_key


    def bench_pair(a, b, inner, outer):
        """Interleave-time ``a`` and ``b``; return per-call medians + paired diff."""
        ta_samples, tb_samples, diffs = [], [], []
        for _ in range(outer):
            t = time.perf_counter()
            for _ in range(inner):
                a()
            ta = (time.perf_counter() - t) / inner
            t = time.perf_counter()
            for _ in range(inner):
                b()
            tb = (time.perf_counter() - t) / inner
            ta_samples.append(ta)
            tb_samples.append(tb)
            diffs.append(tb - ta)
        return (
            statistics.median(ta_samples) * 1000,  # ms
            statistics.median(tb_samples) * 1000,  # ms
            statistics.median(diffs) * 1e6,  # us, paired
        )


    print("plain sorted (A) vs sorted(key=_safe_key) (B), homogeneous:")
    for label, data, inner, outer in [
        ("int set 1k", set(range(1000)), 100, 300),
        ("int set 100k", set(range(100_000)), 3, 80),
        ("str set 100k", {f"item-{i}" for i in range(100_000)}, 1, 80),
    ]:
        a = lambda d=data: sorted(d)
        b = lambda d=data: sorted(d, key=_safe_key)
        ma, mb, diff = bench_pair(a, b, inner, outer)
        print(
            f" {label:13} A={ma:9.4f} ms B={mb:9.4f} ms B/A={mb / ma:5.1f}x"
            f" paired B-A={diff:+10.1f} us"
        )

    print("\nheterogeneous (unorderable mix), fallback path:")
    het = {1, "a", 2, "b", 3.5, None} | {f"x{i}" for i in range(500)} | set(range(500))
    try:
        sorted(het)
        raise SystemExit("expected TypeError - set is not heterogeneous")
    except TypeError:
        pass


    def safe_sort(d=het):
        return sorted(d, key=_safe_key)


    def failed_sort(d=het):
        # the *only* extra work the try/except adds: a plain sort that raises
        try:
            sorted(d)
        except TypeError:
            pass


    # Measure the overhead directly: the failed sort is the whole cost of the
    # fallback's try/except. Comparing new(=failed+safe) vs old(=safe) is
    # useless here: the ~us signal is far below the jitter of the ~ms safe
    # sort, so the paired diff is noise (can even come out negative).
    overhead_ms, safe_ms, _ = bench_pair(failed_sort, safe_sort, 20, 300)
    print(
        f" {'hetero 1k':13} _safe_key sort={safe_ms:8.4f} ms"
        f" try/except overhead (failed sort)={overhead_ms * 1000:6.1f} us"
        f" = {overhead_ms / safe_ms * 100:.2f}% of the sort"
    )

Result on my machine:

    plain sorted (A) vs sorted(key=_safe_key) (B), homogeneous:
     int set 1k    A=   0.0153 ms B=   0.4661 ms B/A= 30.4x paired B-A=    +451.4 us
     int set 100k  A=   1.8300 ms B=  63.6966 ms B/A= 34.8x paired B-A=  +61787.4 us
     str set 100k  A=  54.3415 ms B= 356.0647 ms B/A=  6.6x paired B-A= +302171.5 us

    heterogeneous (unorderable mix), fallback path:
     hetero 1k     _safe_key sort=  4.1314 ms try/except overhead (failed sort)=  12.2 us = 0.30% of the sort
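The fast path measured above can be sketched in isolation as follows. This is an illustration rather than the PR's code: `SafeKey` is a hypothetical stand-in modelled on the `_safe_key` wrapper in CPython's `pprint`, and `sorted_for_display` is an invented name.

```python
from __future__ import annotations

from typing import Any, Iterable


class SafeKey:
    """Hypothetical sort key that never raises on unorderable pairs.

    When the wrapped objects cannot be compared directly, it orders them
    by (type name, id) instead.
    """

    __slots__ = ("obj",)

    def __init__(self, obj: Any) -> None:
        self.obj = obj

    def __lt__(self, other: SafeKey) -> bool:
        try:
            return self.obj < other.obj
        except TypeError:
            return (str(type(self.obj)), id(self.obj)) < (
                str(type(other.obj)),
                id(other.obj),
            )


def sorted_for_display(items: Iterable[Any]) -> list[Any]:
    # Try a direct sort first, faster than the fallback.
    try:
        return sorted(items)
    except TypeError:
        # Unorderable mix (e.g. ints and strs): retry with the safe key.
        return sorted(items, key=SafeKey)
```

For a homogeneous set the wrapper is never constructed; for a mixed set the only extra cost is the plain sort that fails before the retry, which is the overhead the last benchmark row isolates.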

bluetech · 2026-06-14T08:16:11Z

+    pp = PrettyPrinter()
+    assert pp.pformat({3, 1, 2}) == "{\n    1,\n    2,\n    3,\n}"
+    # Mixed unorderable types must not raise.
+    pp.pformat({1, "a", 2, "b"})


Might as well assert it as well.

…apped

``_format`` and the per-type helpers now ``yield`` their output as a
stream of string chunks instead of writing to a file-like object, and
``pformat`` joins them. On top of that, ``pformat_lines`` pulls from the
formatter only until a budget is reached:

    pformat_lines(obj, max_lines=None, max_chars=None)

It stops on the first chunk that reaches *either* budget, so a huge
collection costs O(budget) rather than O(N). Either dimension may be
``None`` (unbounded); with both ``None`` the whole object is formatted.

Motivation
----------

Assertion diffs are truncated to a handful of lines/chars before being
shown. Formatting the whole of a large ``==`` comparison and then
throwing almost all of it away is pure waste. With a lazy formatter the
truncating caller simply stops pulling once it has enough.

Benchmark (``PrettyPrinter`` alone, width 80)::

    list(range(500_000)):
        pformat().splitlines()        ~805 ms
        pformat_lines(max_lines=11)   ~0.027 ms   (~30000x)

    [8 small ints] (common small diff):
        pformat().splitlines()        ~0.0133 ms
        pformat_lines(max_lines=11)   ~0.0185 ms  (+~5 us)

    ["x"*100_000] * 3 (flat, few huge elements):
        pformat_lines(max_chars=640)  stops after ~100_000 chars
                                      (one element) instead of 300_000

Why a lazy generator rather than a fast path + budget stream
------------------------------------------------------------

An earlier approach kept a cheap ``pformat().splitlines()`` fast path
guarded by ``len(obj) <= max_lines`` plus a flatness check, falling back
to a write-intercepting budget-stream class for the rest. Two problems:

* ``len(obj)`` is only a *lower* bound on the line count (one nested
  element, ``[{...50 keys...}]``, expands to many lines), so the guard
  needed the flatness scan to stay correct, and even then it bounded
  only *lines*, never *chars*: a flat container of a few enormous
  strings has almost no lines but blows the char budget.
* it was two code paths plus a stream class plus an exception used for
  control flow.

Because the formatter is lazy, "stop pulling at the budget" is the whole
optimisation: correct regardless of how lines/chars are distributed
across elements, bounding both dimensions, with no ``len()`` proxy to
get wrong and no fast/slow branch. The common small-diff case costs only
~5 us more than the unbounded path (it is never the bottleneck: a
failing assertion isn't hot), while large comparisons drop by orders of
magnitude.

``_pprint_set``/``_pprint_dict`` also try a plain ``sorted`` first and
fall back to the ``_safe_key`` wrapper only for unorderable mixes.

This diverges structurally from the upstream cpython ``pprint`` it was
vendored from; the module header notes it is no longer kept in sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
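The two ways a ``len(obj)`` guard goes wrong can be reproduced with the standard-library ``pprint`` as a stand-in (an assumption for illustration: pytest's vendored printer lays output out differently, but the mismatch between element count and output size is the same). The variable names are invented for the example.

```python
import pprint

# One element, many lines: len() underestimates the line count.
nested = [{i: i for i in range(50)}]
nested_lines = pprint.pformat(nested, width=80).splitlines()

# Three elements, a handful of lines, but far more than 640 characters:
# a line budget alone would never stop this.
flat = ["x" * 100_000] * 3
flat_text = pprint.pformat(flat, width=80)
flat_lines = flat_text.splitlines()

print(len(nested), "element ->", len(nested_lines), "lines")
print(len(flat), "elements ->", len(flat_lines), "lines,", len(flat_text), "chars")
```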

In ``pformat_lines``'s budget loop, ``chunk.count("\n")`` ran on every
chunk, but most chunks (brackets, indentation, item reprs) contain no
newline. Guarding the call with ``"\n" in chunk`` skips it on those and
recovers part of the per-chunk budget-tracking overhead: formatting an
8-element list under a budget drops from ~0.0185 ms to ~0.0163 ms
(versus ~0.0132 ms for an uncapped ``pformat().splitlines()``, so the
budget overhead roughly halves, from ~+5 us to ~+3 us).

The win is small and only matters on the ``-v`` truncating path of a
failing assertion (the default path doesn't format the diff at all), so
this is kept as a separate commit, easy to drop if the extra branch
isn't judged worth it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
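The guard described in this commit message amounts to the pattern below. This is a sketch of the counting step only, with `count_newlines` as a hypothetical helper; the real loop also tracks characters and stops at the budget.

```python
from collections.abc import Iterable


def count_newlines(chunks: Iterable[str]) -> int:
    """Count newlines across chunks, skipping ``count`` when none can match."""
    total = 0
    for chunk in chunks:
        # Membership test first; only chunks that contain a newline are counted.
        if "\n" in chunk:
            total += chunk.count("\n")
    return total
```

The result is identical with or without the guard; whether the extra branch pays off depends on how many chunks contain no newline, which is what the timings above measure.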

Addresses review on pytest-dev#14588:

* make ``max_lines`` / ``max_chars`` keyword-only so they can't be
  confused at the call site.
* drop the implementation detail (``_format``) and the "what the caller
  does" note from the docstring; describe the behaviour instead.
* comment the set-sort fast path ("try a direct sort first, faster than
  the fallback").
* assert the heterogeneous-set output in the test rather than only
  checking it does not raise.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Pierre-Sassoulas added the "skip news" label (used on PRs to opt out of the changelog requirement) Jun 13, 2026

Pierre-Sassoulas force-pushed the pprint-lazy-budget branch from 35ed5b2 to d4b901c June 13, 2026 16:30

Pierre-Sassoulas marked this pull request as draft June 13, 2026 16:30

Pierre-Sassoulas force-pushed the pprint-lazy-budget branch 2 times, most recently from 133da41 to f4bd109 June 13, 2026 17:12

Pierre-Sassoulas marked this pull request as ready for review June 14, 2026 05:30

Pierre-Sassoulas requested a review from bluetech June 14, 2026 07:53

bluetech approved these changes Jun 14, 2026


Pierre-Sassoulas force-pushed the pprint-lazy-budget branch from dbd78b3 to 1786e34 June 14, 2026 08:44

Pierre-Sassoulas and others added 3 commits June 14, 2026 11:00

Pierre-Sassoulas force-pushed the pprint-lazy-budget branch from 1786e34 to abf4962 June 14, 2026 09:00

Pierre-Sassoulas wants to merge 3 commits into pytest-dev:main from Pierre-Sassoulas:pprint-lazy-budget
