gh-139156: Use PyBytesWriter in PyUnicode_AsRawUnicodeEscapeString() #139250

vstinner · 2025-09-22T21:15:50Z

Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the PyBytesWriter API.

Issue: Use PyBytesWriter in Unicode codecs #139156
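
To make the allocation pattern concrete, here is a minimal pure-Python model of an encoder that over-allocates a worst-case buffer, writes into it, and then shrinks it to the bytes actually written. This is only a sketch: the name `raw_unicode_escape_model` and the buffer sizing are assumptions made for illustration, and it is neither the C implementation nor a use of the PyBytesWriter API.

```python
def raw_unicode_escape_model(s: str) -> bytes:
    """Hypothetical pure-Python model of the codec, for illustration only."""
    if not s:
        return b""
    max_cp = max(map(ord, s))
    if max_cp < 0x100:
        # U+0000-U+00FF: one byte per character, nothing to escape.
        return s.encode("latin-1")

    # Assumed worst case per character: 6 bytes (\uXXXX) when every code
    # point fits in 16 bits, otherwise 10 bytes (\UXXXXXXXX).
    per_char = 6 if max_cp < 0x10000 else 10
    buf = bytearray(per_char * len(s))  # over-allocate once
    pos = 0
    for cp in map(ord, s):
        if cp < 0x100:
            buf[pos] = cp
            pos += 1
        elif cp < 0x10000:
            buf[pos:pos + 6] = b"\\u%04x" % cp
            pos += 6
        else:
            buf[pos:pos + 10] = b"\\U%08x" % cp
            pos += 10
    del buf[pos:]  # shrink to the number of bytes actually written
    return bytes(buf)


# Compare the model with the built-in codec on a mixed string.
sample = "x" + chr(0xE9) + chr(0x20AC) + chr(0x10FFFF)
print(raw_unicode_escape_model(sample) == sample.encode("raw_unicode_escape"))
```

The model only mirrors the "allocate, write, shrink" shape; any performance difference measured in this PR comes from the C code, not from anything visible here.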


vstinner · 2025-09-22T21:17:14Z

Benchmark:

import pyperf
runner = pyperf.Runner()
sizes = (3, 100, 1000)
for size in sizes:
    runner.timeit(f'{size:,} ASCII chars',
        setup=f's="x"*{size}',
        stmt='s.encode("raw_unicode_escape")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-2 chars',
        setup=f's=chr(0x20ac) * {size}',
        stmt='s.encode("raw_unicode_escape")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-4 chars',
        setup=f's=chr(0x10ffff) * {size}',
        stmt='s.encode("raw_unicode_escape")')

Results:

Benchmark	ref	pep782
3 UCS-2 chars	513 ns	521 ns: 1.02x slower
100 UCS-2 chars	842 ns	854 ns: 1.01x slower
1,000 UCS-2 chars	3.66 us	3.67 us: 1.00x slower
3 UCS-4 chars	513 ns	526 ns: 1.03x slower
100 UCS-4 chars	1.02 us	1.04 us: 1.01x slower
1,000 UCS-4 chars	5.10 us	5.14 us: 1.01x slower
Geometric mean	(ref)	1.01x slower

Benchmark hidden because not significant (3): 3 ASCII chars, 100 ASCII chars, 1,000 ASCII chars

The code path for U+0000-U+00FF characters is unchanged, so it is expected that the ASCII benchmarks show no significant difference.
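
For reference, a short sketch of what each benchmark input encodes to, one character per class. The dictionary names `samples` and `encoded` are illustrative only; the code points are the ones used in the benchmark above.

```python
# One representative character per benchmark category.
samples = {
    "ASCII": "x",
    "UCS-2": chr(0x20AC),
    "UCS-4": chr(0x10FFFF),
}

# Encode each character with the codec exercised by the benchmark.
encoded = {name: ch.encode("raw_unicode_escape") for name, ch in samples.items()}

for name, data in encoded.items():
    print(f"{name}: {data!r} -> {len(data)} byte(s) per character")
```

The ASCII character passes through as a single byte, while the UCS-2 and UCS-4 characters expand to 6-byte `\uXXXX` and 10-byte `\UXXXXXXXX` escapes, which is why only those two groups go through the escaping path.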

pythongh-139156: Use PyBytesWriter in PyUnicode_AsRawUnicodeEscapeStr…

b731f07

…ing() Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the PyBytesWriter API.

vstinner added the skip news label Sep 22, 2025

bedevere-app bot added the awaiting core review label Sep 22, 2025

bedevere-app bot mentioned this pull request Sep 22, 2025

Use PyBytesWriter in Unicode codecs #139156

Closed

vstinner merged commit 49e83e3 into python:main Sep 22, 2025
47 checks passed

vstinner deleted the raw_unicode_escape branch September 22, 2025 21:46

bedevere-app bot removed the awaiting core review label Sep 22, 2025
