@kangtastic kangtastic commented Mar 16, 2023

Synopsis

Add Ascii85 and base85 encoder and decoder functions implemented in C to binascii and use them to greatly improve the performance and reduce the memory usage of the existing Ascii85, base85, and Z85 codec functions in base64.

No API or documentation changes are necessary with respect to any functions in base64, and all existing unit tests for those functions continue to pass without modification.

Resolves: gh-101178

Discussion

The base85-related functions in base64 are now wrappers for the new functions in binascii, as envisioned in the docs:

The binascii module contains a number of methods to convert between binary and various ASCII-encoded binary representations. Normally, you will not use these functions directly but use wrapper modules like uu or base64 instead. The binascii module contains low-level functions written in C for greater speed that are used by the higher-level modules.

Separating Ascii85 from base85 and Z85 was warranted in my testing, despite the code duplication, because of the various performance-murdering special cases in Ascii85.
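For readers unfamiliar with the special cases mentioned above, the following is a small illustration using only the public `base64` API, which this PR leaves unchanged. It is a sketch for context and not code from the PR itself.

```python
import base64

# Plain base85: a full group of 4 input bytes always becomes 5 characters.
plain = base64.b85encode(b"\0\0\0\0")                       # b"00000"

# Ascii85 special case 1: a group of four zero bytes collapses to "z".
folded_nuls = base64.a85encode(b"\0\0\0\0")                 # b"z"

# Ascii85 special case 2: with foldspaces=True, four spaces collapse to "y".
folded_spaces = base64.a85encode(b"    ", foldspaces=True)  # b"y"

# Ascii85 special case 3: Adobe framing wraps the output in <~ and ~>.
framed = base64.a85encode(b"\0\0\0\0", adobe=True)          # b"<~z~>"

# The decoder has to undo all of the above.
restored = base64.a85decode(framed, adobe=True)             # b"\0\0\0\0"
```

Each of these branches has to be checked per group in the Ascii85 code path, whereas base85 and Z85 only differ in their alphabets.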

Comments and questions are welcome.

Benchmarks

Updated April 20, 2025.

# bench_b85.py

# Note: EXTREMELY SLOW on unmodified mainline CPython
#       when tracing malloc on the base-85 functions.

import base64
import sys
import timeit
import tracemalloc

funcs = [(base64.b64encode, base64.b64decode),  # sanity check/comparison
         (base64.a85encode, base64.a85decode),
         (base64.b85encode, base64.b85decode),
         (base64.z85encode, base64.z85decode)]

def mb(n):
    return f"{n / 1024 / 1024:.3f} MB"

def stats(func, data, t, m):
    name, n, bps = func.__qualname__, len(data), len(data) / t
    print(f"{name} : {n} b in {t:.3f} s ({mb(bps)}/s) using {mb(m)}")

if __name__ == "__main__":
    # First command-line argument: input size in MiB.
    data = b"a" * int(sys.argv[1]) * 1024 * 1024
    for fenc, fdec in funcs:
        # Peak traced memory while encoding, excluding the result itself.
        tracemalloc.start()
        enc = fenc(data)
        menc = tracemalloc.get_traced_memory()[1] - len(enc)
        tracemalloc.stop()
        # Time a separate run with tracing switched off.
        tenc = timeit.timeit("fenc(data)", number=1, globals=globals())
        stats(fenc, data, tenc, menc)

        # Same two measurements for decoding.
        tracemalloc.start()
        dec = fdec(enc)
        mdec = tracemalloc.get_traced_memory()[1] - len(dec)
        tracemalloc.stop()
        tdec = timeit.timeit("fdec(enc)", number=1, globals=globals())
        stats(fdec, enc, tdec, mdec)

# Python 3.14.0a7+ commit 78cfee6f09
# ./configure --enable-optimizations --with-lto

# With this PR
$ time ./python bench_b85.py 64
b64encode : 67108864 b in 0.084 s (763.340 MB/s) using 42.667 MB
b64decode : 89478488 b in 0.230 s (371.074 MB/s) using 56.889 MB
a85encode : 67108864 b in 0.190 s (336.115 MB/s) using 0.000 MB
a85decode : 83886080 b in 0.216 s (370.605 MB/s) using 0.000 MB
b85encode : 67108864 b in 0.072 s (887.955 MB/s) using 0.000 MB
b85decode : 83886080 b in 0.175 s (457.224 MB/s) using 0.000 MB
z85encode : 67108864 b in 0.072 s (891.721 MB/s) using 0.000 MB
z85decode : 83886080 b in 0.174 s (460.582 MB/s) using 0.000 MB

real    0m2.231s
user    0m2.064s
sys     0m0.156s

# Unmodified
$ time ./python bench_b85.py 64
b64encode : 67108864 b in 0.082 s (781.718 MB/s) using 42.667 MB
b64decode : 89478488 b in 0.237 s (360.686 MB/s) using 56.889 MB
a85encode : 67108864 b in 7.492 s (8.543 MB/s) using 2664.406 MB
a85decode : 83886080 b in 14.264 s (5.609 MB/s) using 3332.254 MB
b85encode : 67108864 b in 7.181 s (8.912 MB/s) using 2664.404 MB
b85decode : 83886080 b in 8.486 s (9.427 MB/s) using 3332.254 MB
z85encode : 67108864 b in 7.343 s (8.715 MB/s) using 2664.102 MB
z85decode : 83886080 b in 8.778 s (9.113 MB/s) using 3332.254 MB

real    9m2.346s
user    8m47.248s
sys     0m12.460s

The old pure-Python implementation is roughly 40 to 100 times slower and uses more than 40 times the input size in temporary memory.
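The byte counts in the benchmark output follow directly from the encoding ratios (4 bytes to 5 characters for base85, 3 bytes to 4 characters for base64). A minimal sketch of that arithmetic, with helper names that are purely illustrative:

```python
import base64

def b85_encoded_len(n: int) -> int:
    """Length of unpadded base85 output for n input bytes."""
    full, rest = divmod(n, 4)
    # A trailing partial group of 1 to 3 bytes becomes 2 to 4 characters.
    return 5 * full + (rest + 1 if rest else 0)

def b64_encoded_len(n: int) -> int:
    """Length of padded base64 output for n input bytes."""
    return 4 * ((n + 2) // 3)

n = 64 * 1024 * 1024            # the 64 MiB input used in the benchmark
print(b85_encoded_len(n))       # 83886080
print(b64_encoded_len(n))       # 89478488

# Peak temporary memory of the unmodified b85encode relative to the input,
# using the figures reported above (2664.404 MB for a 64 MB input).
print(round(2664.404 / 64, 1))  # 41.6
```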

ghost commented Mar 16, 2023

All commit authors signed the Contributor License Agreement.

@kangtastic kangtastic changed the title Add Ascii85 and base85 support to binascii gh-101178: Add Ascii85 and base85 support to binascii Mar 16, 2023
@arhadthedev arhadthedev added the stdlib Standard Library Python modules in the Lib/ directory label Mar 23, 2023

kangtastic commented Mar 19, 2024

It's a year later, and Z85 support has been added to base64 in the meantime. So while bringing this PR up to date with main, I added Z85 support to it as well.

For reference, this is the benchmark run that led me to do so.

# After merging main but before adding Z85 support to this PR
(cpython-b85) $ python bench_b85.py 64
b64encode : 67108864 b in 0.121 s (527.435 MB/s) using 42.667 MB
b64decode : 89478488 b in 0.309 s (276.188 MB/s) using 56.889 MB
a85encode : 67108864 b in 0.297 s (215.150 MB/s) using 0.000 MB
a85decode : 83886080 b in 0.205 s (390.751 MB/s) using 0.000 MB
b85encode : 67108864 b in 0.106 s (604.359 MB/s) using 0.000 MB
b85decode : 83886080 b in 0.204 s (393.040 MB/s) using 0.000 MB
z85encode : 67108864 b in 0.204 s (313.610 MB/s) using 80.000 MB
z85decode : 83886080 b in 0.300 s (266.670 MB/s) using 100.000 MB

The existing Z85 implementation translates from the standard base85 alphabet to Z85 after the fact and within Python, so it was already benefiting from this PR but with substantial performance and memory usage overhead. That overhead is now gone.
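To illustrate what "translating after the fact" means, here is a minimal sketch that produces Z85 by encoding with the standard base85 alphabet and then remapping characters. The two alphabets are the ones `base64` uses for base85 and Z85; the helper names are hypothetical and this is not the code from `base64` or from this PR.

```python
import base64

B85_ALPHABET = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                b"abcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~")
Z85_ALPHABET = (b"0123456789abcdefghijklmnopqrstuvwxyz"
                b"ABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#")

B85_TO_Z85 = bytes.maketrans(B85_ALPHABET, Z85_ALPHABET)
Z85_TO_B85 = bytes.maketrans(Z85_ALPHABET, B85_ALPHABET)

def z85encode_via_translate(data: bytes) -> bytes:
    # translate() builds a second full-size bytes object on top of the
    # encoder's output, which is where the extra time and memory go.
    return base64.b85encode(data).translate(B85_TO_Z85)

def z85decode_via_translate(text: bytes) -> bytes:
    return base64.b85decode(text.translate(Z85_TO_B85))

# Test vector from the Z85 specification.
print(z85encode_via_translate(bytes.fromhex("864fd26fb559f75b")))  # b'HelloWorld'
```

Passing the Z85 alphabet to the encoder directly avoids the second pass, which is consistent with the 80 MB and 100 MB figures above matching the size of the encoded data.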

@kangtastic kangtastic force-pushed the gh-101178-rework-base85 branch from 71f1955 to 7b4aba1 Compare March 19, 2024 09:27

python-cla-bot bot commented Apr 18, 2025

All commit authors signed the Contributor License Agreement.


Add Ascii85, base85, and Z85 encoders and decoders to `binascii`,
replacing the existing pure Python implementations in `base64`.

No API or documentation changes are necessary with respect to
`base64.a85encode()`, `b85encode()`, etc., and all existing unit
tests for those functions continue to pass without modification.

Note that attempting to decode Ascii85 or base85 data of length 1 mod 5
(after accounting for Ascii85 quirks) now produces an error, as no
encoder would emit such data. This should be the only significant
externally visible difference compared to the old implementation.

Resolves: pythongh-101178
@kangtastic kangtastic force-pushed the gh-101178-rework-base85 branch from 7b4aba1 to 05ae5ad Compare April 21, 2025 05:16
@kangtastic
Author

PR has been rebased onto main at 78cfee6 with squashing.

@kangtastic kangtastic changed the title gh-101178: Add Ascii85 and base85 support to binascii gh-101178: Add Ascii85. base85, and Z85 support to binascii Apr 21, 2025
@kangtastic kangtastic changed the title gh-101178: Add Ascii85. base85, and Z85 support to binascii gh-101178: Add Ascii85, base85, and Z85 support to binascii Apr 21, 2025
@sergey-miryanov
Contributor

Note that attempting to decode Ascii85, base85, or Z85 data of length 1 mod 5 now produces an error, as no encoder would emit such data. This should be the only significant externally visible difference compared to the old implementations.

I believe you have to document this change.

@kangtastic
Author

Note that attempting to decode Ascii85, base85, or Z85 data of length 1 mod 5 now produces an error, as no encoder would emit such data. This should be the only significant externally visible difference compared to the old implementations.

I believe you have to document this change.

Fair point, I could do that.

In case anyone argues for keeping the old behavior (silently ignoring length 1 mod 5), I won't do it just yet.
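As background on why no encoder emits data of length 1 mod 5: a full group of 4 bytes becomes 5 characters, and a trailing partial group of 1, 2, or 3 bytes becomes 2, 3, or 4 characters. A quick check with the existing encoders, offered as a sketch and using inputs that avoid the Ascii85 folding shortcuts:

```python
import base64

# Output length mod 5 for every input length from 0 to 63 bytes.
b85_residues = {len(base64.b85encode(bytes(n))) % 5 for n in range(64)}
a85_residues = {len(base64.a85encode(b"a" * n)) % 5 for n in range(64)}

print(sorted(b85_residues))  # [0, 2, 3, 4]
print(sorted(a85_residues))  # [0, 2, 3, 4]
```

A single leftover character cannot carry even one full byte, which is why such input can only be rejected or silently dropped.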

_A85START = b"<~"
_A85END = b"~>"

def _85encode(b, chars, chars2, pad=False, foldnuls=False, foldspaces=False):
Member

Per PEP-0399, the Python implementation must be kept, with the C accelerator and Python implementation tested to ensure they produce identical output.

Author

From PEP-0399:

If an acceleration module is provided it is to be named the same as the module it is accelerating with an underscore attached as a prefix, e.g., _warnings for warnings. The common pattern to access the accelerated code from the pure Python implementation is to import it with an import *, e.g., from _warnings import *. This is typically done at the end of the module to allow it to overwrite specific Python objects with their accelerated equivalents.

Although the effect is the same, there is a subtle difference in that strictly speaking, this PR isn't providing alternative C implementations for the base 85-related pure-Python functions in base64. It's adding new functions into the existing binascii C module and turning said Python functions into wrappers for them, which is in keeping with how binascii and base64 have historically been interrelated.

That difference means the guidelines in PEP-0399 don't apply cleanly. So e.g. creating a new _base64 C module doesn't make sense. Neither does trying to use the accelerated routines only if available, as binascii will always be available.

Do you have any thoughts on how to keep the Python implementation in a way that works with Python's import system? I'm not familiar with an analogous situation in the rest of the codebase.

Contributor

In #91610 a C version is added for deepcopy, and unit tests are created for both the C and Python implementations. If you search the codebase a bit, you can find more examples.

Author

Never mind, all. I assumed binascii, being part of the stdlib, would/should always be available.

I've since found that quopri doesn't make that assumption. I'll do what it does.
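For context, the pattern `quopri` uses is to try importing the C functions from `binascii` and fall back to Python code when the import fails. A rough sketch of the same idea applied here, assuming the `b2a_base85` name from this PR and assuming it accepts a single bytes-like argument; `_b85encode_py` is a hypothetical stand-in for the retained pure-Python code path:

```python
import base64

try:
    # Name taken from this PR; the import fails on interpreters without it.
    from binascii import b2a_base85
except ImportError:
    b2a_base85 = None

def _b85encode_py(data: bytes) -> bytes:
    # Hypothetical stand-in for the pure-Python implementation kept in base64.
    return base64.b85encode(data)

def b85encode(data: bytes) -> bytes:
    # Prefer the C function when it was importable, otherwise fall back.
    if b2a_base85 is not None:
        return b2a_base85(data)
    return _b85encode_py(data)
```

On an interpreter without the new `binascii` functions the import fails and only the fallback branch runs.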

If we were strictly following PEP-0399, _base64 would be a C
module for accelerated functions in base64. Due to historical
reasons, those should actually go in binascii instead.

We still want to preserve the existing Python code in base64.
Parting out facilities for accessing the C functions into a
module named _base64 shouldn't risk a naming conflict and
will simplify testing.

This is done differently to PEP-0399 to minimize the number of
changed lines.

As we're now keeping the existing Python base 85 functions, the C
implementations should behave exactly the same, down to exception
type and wording. It is also no longer an error to try to decode
data of length 1 mod 5.
@kangtastic
Author

The PR has been updated to preserve the existing base 85 Python functions in base64 and modify the new base 85 C functions in binascii to closely match their behavior. Notably, trying to decode data of length 1 mod 5 is no longer an error.

Lib/_base64.py Outdated
"""C accelerator wrappers for originally pure-Python parts of base64."""

from binascii import Error, a2b_ascii85, a2b_base85, b2a_ascii85, b2a_base85
from base64 import _bytes_from_decode_data, bytes_types
Member

We should avoid import cycles like this; they can make future refactoring harder.

Lib/base64.py Outdated
try:
    from _base64 import (_a85encode, _a85decode, _b85encode,
                         _b85decode, _z85encode, _z85decode)
    from functools import update_wrapper
Member

functools is an expensive import; I would copy the relevant parts of update_wrapper() locally.

c_base64 = import_fresh_module("base64", fresh=["_base64"])


def with_c_implementation(test_func):
Member

Instead of a decorator, perhaps use the mixin approach from other modules.
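A minimal sketch of the mixin approach, for readers who have not seen it: the test methods live in a plain mixin class, and one concrete `TestCase` subclass per implementation sets a `module` attribute. All class names are hypothetical, and both subclasses point at the regular `base64` module here only so that the sketch runs anywhere; the real tests would import a pure-Python and a C-accelerated copy separately.

```python
import base64
import unittest

class Base85TestsMixin:
    module = None  # set by each concrete subclass

    def test_b85_round_trip(self):
        for n in range(16):
            data = bytes(range(n))
            encoded = self.module.b85encode(data)
            self.assertEqual(self.module.b85decode(encoded), data)

class PyBase85Tests(Base85TestsMixin, unittest.TestCase):
    module = base64  # stand-in for the pure-Python import

class CBase85Tests(Base85TestsMixin, unittest.TestCase):
    module = base64  # stand-in for the C-accelerated import

def run_sketch_tests():
    # Collect both concrete cases and run them without calling unittest.main().
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for case in (PyBase85Tests, CBase85Tests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the mixin itself does not inherit from `TestCase`, the test runner never collects it on its own, and every test method keeps its plain signature.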

Importing update_wrapper() from functools to copy attributes
is expensive. Do it ourselves for only the most relevant ones.
This requires some code duplication, but oh well.

Using a decorator complicates function signature introspection.

Do we really need to test the legacy API twice?
@kangtastic kangtastic closed this Apr 29, 2025
@kangtastic kangtastic reopened this Apr 29, 2025
Lib/base64.py Outdated
Comment on lines 581 to 582
    from _base64 import (_a85encode, _a85decode, _b85encode,
                         _b85decode, _z85encode, _z85decode)
Member

Given these are already in a private module, you can remove the prefix. That means the _copy_attributes function only needs to copy __doc__, and __module__ can be set to the static 'base64'.
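A rough sketch of what such a reduced helper could look like. Only the `_copy_attributes` name comes from this thread; its signature, the function bodies, and the other names below are assumptions made for illustration.

```python
import base64

def _copy_attributes(wrapper, wrapped, module="base64"):
    """Copy the docstring and set a static module name on wrapper."""
    wrapper.__doc__ = wrapped.__doc__
    wrapper.__module__ = module
    return wrapper

def b85encode(b, pad=False):
    """Encode bytes-like object b in base85 format and return a bytes object."""
    return base64.b85encode(b, pad)

def _accelerated_b85encode(b, pad=False):
    # Hypothetical stand-in for the C-backed wrapper; no docstring of its own.
    return base64.b85encode(b, pad)

b85encode_fast = _copy_attributes(_accelerated_b85encode, b85encode)
```

With the underscore prefix gone from the imported names, `__name__` and `__qualname__` would already match, which is why only `__doc__` and `__module__` remain to be handled.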

Author

Done.

@kangtastic
Author

PR was accidentally closed due to misclicking on mobile. There should be a confirmation dialog or something 😅

Labels

awaiting review stdlib Standard Library Python modules in the Lib/ directory

Development

Successfully merging this pull request may close these issues.

base64.b85encode uses significant amount of RAM

6 participants