@vstinner vstinner commented Sep 22, 2025

Replace PyBytes_FromStringAndSize() and _PyBytes_Resize() with the PyBytesWriter API.

@vstinner
Member Author

Benchmark:

import pyperf
runner = pyperf.Runner()
sizes = (3, 100, 1000)
# ASCII: characters in U+0000-U+00FF are written as single bytes
for size in sizes:
    runner.timeit(f'{size:,} ASCII chars',
        setup=f's="x"*{size}',
        stmt='s.encode("raw_unicode_escape")')
# UCS-2: U+20AC is escaped as \uXXXX
for size in sizes:
    runner.timeit(f'{size:,} UCS-2 chars',
        setup=f's=chr(0x20ac) * {size}',
        stmt='s.encode("raw_unicode_escape")')
# UCS-4: U+10FFFF is escaped as \UXXXXXXXX
for size in sizes:
    runner.timeit(f'{size:,} UCS-4 chars',
        setup=f's=chr(0x10ffff) * {size}',
        stmt='s.encode("raw_unicode_escape")')

Results:

| Benchmark         | ref     | pep782                |
|-------------------|---------|-----------------------|
| 3 UCS-2 chars     | 513 ns  | 521 ns: 1.02x slower  |
| 100 UCS-2 chars   | 842 ns  | 854 ns: 1.01x slower  |
| 1,000 UCS-2 chars | 3.66 us | 3.67 us: 1.00x slower |
| 3 UCS-4 chars     | 513 ns  | 526 ns: 1.03x slower  |
| 100 UCS-4 chars   | 1.02 us | 1.04 us: 1.01x slower |
| 1,000 UCS-4 chars | 5.10 us | 5.14 us: 1.01x slower |
| Geometric mean    | (ref)   | 1.01x slower          |

Benchmark hidden because not significant (3): 3 ASCII chars, 100 ASCII chars, 1,000 ASCII chars

The code path for U+0000-U+00FF characters is unchanged, so it is expected that the ASCII benchmarks show no significant difference.
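
For context, here is a minimal sketch of what the `raw_unicode_escape` codec produces for each character range used in the benchmark. It only illustrates the encoder's observable output from Python and is not the C code touched by this PR; the helper name `raw_escape` and the `samples` dict are hypothetical, introduced for the example.

```python
def raw_escape(text: str) -> bytes:
    """Encode text with the raw_unicode_escape codec (hypothetical helper)."""
    return text.encode("raw_unicode_escape")


# One sample per range exercised by the benchmark, plus a Latin-1 character.
samples = {
    "ASCII": "x",              # U+0078: written as one byte
    "Latin-1": "\xe9",         # U+00E9: still one byte (U+0000-U+00FF path)
    "UCS-2": chr(0x20AC),      # U+20AC: escaped as \uXXXX (6 bytes)
    "UCS-4": chr(0x10FFFF),    # U+10FFFF: escaped as \UXXXXXXXX (10 bytes)
}

for label, text in samples.items():
    encoded = raw_escape(text)
    print(f"{label}: {encoded!r} ({len(encoded)} bytes)")
```

Only the last two cases need an output buffer larger than one byte per character, which is consistent with the ASCII benchmarks being unaffected by the change.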

@vstinner vstinner merged commit 49e83e3 into python:main Sep 22, 2025
47 checks passed
@vstinner vstinner deleted the raw_unicode_escape branch September 22, 2025 21:46