Skip to content

Conversation

@eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Feb 4, 2025

Benchmark:

bench_gcd: Mean +- std dev: [main] 194 us +- 5 us -> [pr2] 184 us +- 6 us: 1.05x faster
bench_uuid8: Mean +- std dev: [main] 1.69 us +- 0.04 us -> [pr2] 1.64 us +- 0.02 us: 1.03x faster
bench_id: Mean +- std dev: [main] 48.2 us +- 1.4 us -> [pr2] 41.6 us +- 1.0 us: 1.16x faster

Benchmark hidden because not significant (1): bench_int

Geometric mean: 1.06x faster

The bench_int is not affected since the compact ints already have a freelist in current main. The bench_id is a microbenchmark that does minimal amount of work. The bench_gcd and bench_uuid8 are benchmarks that are representative for real code.

Statistics for the number of allocations saved can be found in the corresponding issue.

Benchmark script:
# Quick benchmark for cpython freelists for int

from uuid import uuid8
import pyperf


def collatz(a):
    while a > 1:
        if a % 2 == 0:
            a = a // 2
        else:
            a = 3 * a + 1

def bench_int(loops):
    range_it = range(loops)
    tpl = tuple(range(200, 300))

    t0 = pyperf.perf_counter()
    for ii in range_it:
        for jj in tpl:
            collatz(jj)
    return pyperf.perf_counter() - t0

def gcd(x, y):
    while y != 0:
        (x, y) = (y, x % y)
    return x

def bench_gcd(loops):
    range_it = range(loops)
    tpl = tuple(range(100))

    t0 = pyperf.perf_counter()
    for ii in range_it:
        for a in tpl:
            gcd( (2<<120) + 1231231232131, 2<<30+ a)
    return pyperf.perf_counter() - t0

def bench_uuid8(loops):
    range_it = iter(range(loops))

    t0 = pyperf.perf_counter()
    for ii in range_it:
        _ = uuid8()
    return pyperf.perf_counter() - t0

def bench_id(loops):
    range_it = iter(range(loops))

    t0 = pyperf.perf_counter()
    for ii in range_it:
        for jj in range(1000):
            _ = id(jj)
    return pyperf.perf_counter() - t0


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.bench_time_func("bench_int", bench_int)
    runner.bench_time_func("bench_gcd", bench_gcd)
    runner.bench_time_func("bench_uuid8", bench_uuid8)
    runner.bench_time_func("bench_id", bench_id)

assert(is_medium_int(x));

PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP(PyLongObject, ints);
PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP(PyLongObject, ints[1]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ints[1] is used here?
Maybe some comment on it will be useful?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we need only ndigits==1 integers.

New implementation uses several free lists (though, ints[0] isn't used).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ints[0] is indeed not used. It takes a tiny bit of extra memory, but makes the code cleaner.

_Py_FREELIST_FREE(ints, self, PyObject_Free);
if (PyLong_CheckExact(self)) {
if (!maybe_freelist_push(self)) {
PyObject_Free(self);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it equivalent to run PyObject_Free(self) and to run Py_TYPE(self)->tp_free(self) for integers with PyLong_MAXSAVESIZE or more digits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is for exact ints (but might not be for subclasses).

Copy link
Member

@efimov-mikhail efimov-mikhail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the answers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants