
Conversation

@vstinner (Member) commented Sep 13, 2025

Replace the private _PyBytesWriter API with the new public PyBytesWriter API in utf8_encoder() and unicode_encode_ucs1().

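For orientation, both functions sit behind ordinary str.encode() calls. The mapping below is only my reading of the function names (it is not stated in the PR), and CPython fast paths may bypass these functions for trivial inputs:

# Illustrative Python-level calls that exercise the two C functions (my reading
# of the function names; CPython fast paths may bypass them for trivial inputs):
"abc".encode()                                    # UTF-8 encoder: utf8_encoder()
"a\u0100".encode("latin1", "backslashreplace")    # Latin-1/ASCII: unicode_encode_ucs1()
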
@vstinner (Member, Author) commented Sep 13, 2025

Microbenchmark on the UTF-8 encoder:

import pyperf
runner = pyperf.Runner()
runner.timeit('abc',
    setup='s="abc"',
    stmt='s.encode()')
runner.timeit('a x 1000',
    setup='s="a" * 1000',
    stmt='s.encode()')
runner.timeit('ab<surrogate> [namereplace]',
    setup=r's="ab\udc80"',
    stmt='s.encode(errors="namereplace")')
runner.timeit('ab<surrogate> [ignore]',
    setup=r's="ab\udc80"',
    stmt='s.encode(errors="ignore")')
runner.timeit('(a<surrogate>) x 1000 [namereplace]',
    setup=r's="a\udc80" * 1000',
    stmt='s.encode(errors="namereplace")')
runner.timeit('(a<surrogate>) x 1000 [ignore]',
    setup=r's="a\udc80" * 1000',
    stmt='s.encode(errors="ignore")')

Results:

Benchmark bench1_ref bench1_pep782
abc 34.5 ns 35.8 ns: 1.04x slower
a x 1000 98.5 ns 103 ns: 1.05x slower
ab&lt;surrogate&gt; [namereplace] 643 ns 694 ns: 1.08x slower
ab&lt;surrogate&gt; [ignore] 108 ns 113 ns: 1.05x slower
(a&lt;surrogate&gt;) x 1000 [namereplace] 225 us 223 us: 1.01x faster
(a&lt;surrogate&gt;) x 1000 [ignore] 3.72 us 3.57 us: 1.04x faster
Geometric mean (ref) 1.03x slower
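
For context, a quick sketch of the error-handler behaviour being measured (the outputs are what the codecs documentation leads me to expect, not taken from the PR): namereplace must build an escape sequence for every unencodable character, while ignore simply drops it, which is consistent with namereplace being the slowest case above.

s = "ab\udc80"                          # a lone surrogate cannot be encoded to UTF-8
print(s.encode(errors="ignore"))        # b'ab' -- the surrogate is dropped
print(s.encode(errors="namereplace"))   # b'ab\\udc80' -- lone surrogates have no
                                        # Unicode name, so a \uXXXX escape is used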

@vstinner (Member, Author) commented:

cc @serhiy-storchaka

@serhiy-storchaka (Member) commented:

Please make benchmarks for non-ASCII strings. Consider different ranges (which represent different internal representations and different lengths of the UTF-8 representation):

  • 0-0x7F
  • 0x80-0xFF
  • 0x100-0x3FF
  • 0x400-0xFFFF
  • 0x10000-0x10FFFF

Also consider strings that contain one character from a higher range (for example, a single U+10000 character with all other characters ASCII, etc.).
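
For reference, a small illustration (mine, not part of the review) of why those ranges matter: each boundary maps to a different UTF-8 sequence length, and the string's internal storage width also changes at U+00FF and U+FFFF (PEP 393).

# UTF-8 length at each range boundary; internally (PEP 393) strings store
# 1 byte per character up to U+00FF, 2 up to U+FFFF, and 4 above that.
for cp in (0x7F, 0xFF, 0x3FF, 0xFFFF, 0x10FFFF):
    print(f"U+{cp:06X} -> {len(chr(cp).encode('utf-8'))} UTF-8 byte(s)")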

@vstinner marked this pull request as ready for review on September 15, 2025 at 10:50
@vstinner (Member, Author) commented Sep 15, 2025

More benchmark:

import pyperf
runner = pyperf.Runner()
ranges = (
    (r'\0',
     r'\x7f'),
    (r'\x80',
     r'\xff'),
    (r'\u0400',
     r'\u0fff'),
    (r'\U00010000',
     r'\u0010ffff'),
)
for first, last in ranges:
    runner.timeit(f'"{first}{last}"',
        setup=f"first='{first}'; last='{last}'; s=first+last",
        stmt='s.encode()')
for length in (5, 50, 500):
    for first, last in ranges:
        runner.timeit(f'"{first}{last}" * {length}',
            setup=f"first='{first}'; last='{last}'; s=(first+last) * {length}",
            stmt='s.encode()')

Results:

Benchmark bench2_ref bench2_pep782
"\0\x7f" 34.9 ns 33.6 ns: 1.04x faster
"\x80\xff" 46.5 ns 43.2 ns: 1.08x faster
"\u0400\u0fff" 51.1 ns 53.9 ns: 1.05x slower
"\0\x7f" * 5 38.0 ns 39.0 ns: 1.03x slower
"\x80\xff" * 5 51.2 ns 53.8 ns: 1.05x slower
"\U00010000\u0010ffff" * 5 74.6 ns 75.7 ns: 1.02x slower
"\0\x7f" * 50 39.6 ns 39.9 ns: 1.01x slower
"\u0400\u0fff" * 50 191 ns 193 ns: 1.01x slower
"\U00010000\u0010ffff" * 50 386 ns 392 ns: 1.02x slower
"\x80\xff" * 500 959 ns 982 ns: 1.02x slower
"\u0400\u0fff" * 500 1.43 us 1.48 us: 1.03x slower
Geometric mean (ref) 1.01x slower

Benchmark hidden because not significant (5): "\U00010000\u0010ffff", "\u0400\u0fff" * 5, "\x80\xff" * 50, "\0\x7f" * 500, "\U00010000\u0010ffff" * 500

@vstinner (Member, Author) commented Sep 15, 2025

Benchmark:

import pyperf
runner = pyperf.Runner()
for length in (5, 50, 500):
    runner.timeit(f'"x" * {length} + chr(0x10000)',
        setup=f's="x" * {length} + chr(0x10000)',
        stmt='s.encode()')

Results:

Benchmark bench3_ref bench3_pep782
"x" * 5 + chr(0x10000) 51.6 ns 49.5 ns: 1.04x faster
"x" * 50 + chr(0x10000) 78.9 ns 77.4 ns: 1.02x faster
Geometric mean (ref) 1.02x faster

Benchmark hidden because not significant (1): "x" * 500 + chr(0x10000)

@vstinner (Member, Author) commented:

@serhiy-storchaka: The difference is a few nanoseconds: around +10 ns in the worst case and 1.08x faster in the best case. Do you think that it's acceptable?

@serhiy-storchaka (Member) left a comment

Some overhead is caused by the dynamic memory allocation for PyBytesWriter -- it is unavoidable. But there may be a loss due to losing fine control over overallocation -- I need to look at it more closely. Maybe it can be avoided by adding a new C API.

@vstinner (Member, Author) replied:

> Some overhead is caused by the dynamic memory allocation for PyBytesWriter -- it is unavoidable.

PyBytesWriter_Create() uses a free list to avoid PyMem_Malloc() cost in the common case.

> But there may be a loss due to losing fine control over overallocation -- I need to look at it more closely. Maybe it can be avoided by adding a new C API.

If this loss can be measured, I would suggest adding a private API to disable overallocation.
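
As a conceptual sketch of the free-list idea (in Python, purely illustrative; the real mechanism lives in C inside PyBytesWriter_Create(), and all names and the capacity below are made up):

_free_list = []
_FREE_LIST_CAPACITY = 2        # hypothetical capacity

class _Writer:
    def __init__(self):
        self.buffer = bytearray()

def create_writer():
    # Reuse a cached writer when possible instead of allocating a new one.
    writer = _free_list.pop() if _free_list else _Writer()
    writer.buffer.clear()
    return writer

def discard_writer(writer):
    # Keep the writer around for the next call instead of freeing it.
    if len(_free_list) < _FREE_LIST_CAPACITY:
        _free_list.append(writer)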

@vstinner (Member, Author) commented Sep 18, 2025

Updated benchmark on the worst case:

import pyperf
runner = pyperf.Runner()
runner.timeit('utf8',
    setup=r's="\uFFFF"*(256//3)+"\uDC80"',
    stmt='s.encode(errors="backslashreplace")')
runner.timeit('latin1',
    setup=r"s=('a'*255+'\u0100')",
    stmt="s.encode('latin1', 'backslashreplace')")

Result:

Benchmark bench4_ref bench4_pep782
utf8 256 ns 263 ns: 1.03x slower
latin1 278 ns 291 ns: 1.05x slower
Geometric mean (ref) 1.04x slower
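
My reading of why these inputs are worst cases (this interpretation is not stated in the thread): a long, cheaply encodable prefix is followed by a single unencodable character at the very end, so the error handler's longer replacement text lands right at the buffer boundary and forces extra writer work. A quick size check:

s_utf8 = "\uFFFF" * (256 // 3) + "\uDC80"
s_latin1 = "a" * 255 + "\u0100"
print(len(s_utf8), len(s_utf8.encode(errors="backslashreplace")))         # 86 261
print(len(s_latin1), len(s_latin1.encode("latin1", "backslashreplace")))  # 256 261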

@vstinner (Member, Author) commented:

@serhiy-storchaka: utf8_encoder() and unicode_encode_ucs1() are the last two functions using the private API. I would like to merge this change so that the private API can be removed, even if there is some performance overhead. In the common cases, there is no significant performance impact.

@vstinner (Member, Author) commented:

I updated the PR to keep the overallocate=0 optimization.
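
For readers unfamiliar with the term, a conceptual sketch of overallocation (Python, illustrative only; the actual policy is implemented in C): the writer grows its buffer geometrically so repeated appends stay amortized, but when the caller already knows the exact output size, allocating exactly that much avoids wasted memory and a final resize, which is my understanding of the overallocate=0 case.

# Made-up growth policy for illustration; not the numbers CPython uses.
def next_capacity(current, needed, overallocate=True):
    if not overallocate:          # exact-size mode: allocate only what is needed
        return needed
    return max(needed, current + current // 2)    # geometric growth (~1.5x)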

@vstinner (Member, Author) commented Sep 18, 2025

More benchmarks on the corner cases.

Benchmark: python -m pyperf timeit -s "s=('a'*100+'\u0100'*100)" "s.encode('latin1', 'backslashreplace')"

Result: Mean +- std dev: [timeit1_ref] 503 ns +- 37 ns -> [timeit1_pep782] 483 ns +- 25 ns: 1.04x faster


Benchmark: python -m pyperf timeit -s "s=(('a'*10+'\u0100')*10)" "s.encode('latin1', 'backslashreplace')"

Result: Mean +- std dev: [timeit2_ref] 243 ns +- 13 ns -> [timeit2_pep782] 248 ns +- 4 ns: 1.02x slower

@vstinner (Member, Author) commented:

I updated the PR to reimplement the min_size micro-optimization. There is no longer a 1.3x slowdown.

I also recomputed all benchmark results on the latest PR version. Results are now between 1.08x slower and 1.08x faster. Most benchmarks are in the [-5%, +5%] range, which can be attributed to benchmark noise (and can be ignored).

@serhiy-storchaka: I plan to merge this change next week.

@serhiy-storchaka (Member) left a comment

LGTM. 👍

Remove useless PyBytesWriter_Discard() call
@vstinner merged commit 8cfd7b4 into python:main on Sep 23, 2025
43 checks passed
@vstinner deleted the pybyteswriter_encode_utf8 branch on September 23, 2025 at 09:47
@vstinner (Member, Author) commented:

Merged, thanks for the review @serhiy-storchaka.

@bedevere-bot commented:

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 Ubuntu Shared 3.x (tier-1) has failed when building commit 8cfd7b4.

What you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/506/builds/11478) and take a look at the build logs.
  4. Check if the failure is related to this commit (8cfd7b4) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/506/builds/11478

Failed tests:

  • test_interpreters

Failed subtests:

  • test_keyboard_interrupt_in_thread_running_interp - test.test_interpreters.test_api.InterpreterObjectTests.test_keyboard_interrupt_in_thread_running_interp

Summary of the results of the build (if available):


Traceback logs:
Traceback (most recent call last):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_interpreters/test_api.py", line 462, in test_keyboard_interrupt_in_thread_running_interp
    self.assertEqual(retcode, 0)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^
AssertionError: -2 != 0
