gh-140232: Do not track frozenset objects with immutables #140234

eendebakpt · 2025-10-16T21:01:34Z

In this PR we untrack frozensets created by the normal constructors. A few methods are shared between set and frozenset (for example set_intersection in setobject.c), and we have not added the untracking there. This is possible, but I am not sure it is worthwhile.

Here is a small script to test the idea:

import gc
import time
from statistics import mean

number_of_iterations = 20
number_of_gc_iterations = 50

deltas = []

gc.disable()
gc.collect()
for kk in range(number_of_iterations):
    t0 = time.perf_counter()
    for jj in range(number_of_gc_iterations):
        gc.collect()
    dt = time.perf_counter() - t0
    deltas.append(dt)
print(f"time per collection: mean {1e3 * mean(deltas) / number_of_iterations:.3f} [ms], min {1e3 * min(deltas) / number_of_iterations:.3f} [ms]")

sets = [frozenset([ii]) for ii in range(10_000)]
deltas = []
print("---")
gc.disable()
gc.collect()
for kk in range(number_of_iterations):
    t0 = time.perf_counter()
    for jj in range(number_of_gc_iterations):
        gc.collect()
    dt = time.perf_counter() - t0
    deltas.append(dt)
print(f"time per collection: mean {1e3 * mean(deltas) / number_of_iterations:.3f} [ms], min {1e3 * min(deltas) / number_of_iterations:.3f} [ms]")

#%% Show statistics of frozen containers

gc.collect()

def candidate(obj):
    return all(not gc.is_tracked(x) for x in obj)

for immutable_type in (tuple, frozenset):
    number_of_objects_tracked = 0
    number_of_candidates = 0
    number_of_immutable_candidates = 0

    for obj in gc.get_objects():
        number_of_objects_tracked += 1
        if type(obj) is immutable_type:
            number_of_candidates += 1
            # print(f"{type(obj)} = {obj}")
            if candidate(obj):
                number_of_immutable_candidates += 1

    print(f"type {immutable_type}")
    print(f"  {number_of_objects_tracked=}")
    print(f"  {number_of_candidates=}")
    print(f"  {number_of_immutable_candidates=}")

The script times full garbage collections before and after creating 10,000 single-element frozensets, and then prints, for tuple and frozenset, how many tracked objects of that type contain only untracked items.

Main:

time per collection: mean 1.311 [ms], min 1.301 [ms]
---
time per collection: mean 2.467 [ms], min 2.272 [ms]
type <class 'tuple'>
  number_of_objects_tracked=18330
  number_of_candidates=546
  number_of_immutable_candidates=1
type <class 'frozenset'>
  number_of_objects_tracked=18330
  number_of_candidates=10059
  number_of_immutable_candidates=10057

PR

time per collection: mean 1.285 [ms], min 1.251 [ms]
---
time per collection: mean 1.424 [ms], min 1.396 [ms]
type <class 'tuple'>
  number_of_objects_tracked=8273
  number_of_candidates=546
  number_of_immutable_candidates=6
type <class 'frozenset'>
  number_of_objects_tracked=8273
  number_of_candidates=2
  number_of_immutable_candidates=0
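As a rough reading of the two runs above, the cost attributable to the 10,000 extra frozensets is the difference between the second and the first timing of each run. The sketch below only redoes that arithmetic on the reported means; the variable names are illustrative and not part of the PR.

```python
# Reported mean times per collection in ms: (baseline, with 10_000 frozensets).
reported = {
    "main": (1.311, 2.467),
    "pr": (1.285, 1.424),
}

# Extra collection time attributable to the frozensets in each build.
overhead = {
    name: round(after - before, 3)
    for name, (before, after) in reported.items()
}

# Drop in the number of GC-tracked objects between the two runs.
tracked_drop = 18330 - 8273

print(overhead)
print(tracked_drop)
```

The drop in tracked objects equals the number of frozenset candidates reported on main (10057), which is what one would expect if exactly those objects are no longer tracked.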

Issue: Disable tracking of frozenset objects with immutables in the GC #140232

sergey-miryanov · 2025-10-17T06:03:52Z

Maybe it is worth changing tp_alloc to something like:

PyObject *
PyFrozenSet_Alloc(PyTypeObject *type, Py_ssize_t nitems)
{
    PyObject *obj = PyType_GenericAlloc(type, nitems);
    if (obj == NULL) {
        return NULL;
    }

    _PyFrozenSet_MaybeUntrack(obj);
    return obj;
}

eendebakpt · 2025-10-17T06:52:35Z

> Maybe it is worth changing tp_alloc to something like:

The tp_alloc is used in make_new_set, which in turn is called by make_new_set_basetype. The latter is used in set_intersection, which fills the newly allocated frozenset after it has been created. So adding _PyFrozenSet_MaybeUntrack to tp_alloc would mean we have to add a _PyFrozenSet_MaybeTrack to the end of set_intersection. This is a complication I do not want to tackle (certainly not in this PR).
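The concern can be illustrated with a toy Python model. This is only a sketch under stated assumptions: ToyFrozenSet, maybe_untrack, alloc_then_untrack and toy_intersection are hypothetical names, and the explicit tracked flag stands in for the real GC state, which cannot be toggled from Python.

```python
import gc


class ToyFrozenSet:
    """Toy stand-in for a frozenset with an explicit 'tracked' flag."""

    def __init__(self):
        self.items = set()
        self.tracked = True  # freshly allocated containers start tracked


def maybe_untrack(fs):
    """Untrack if no item is GC-tracked (mirrors the idea, not the C code)."""
    if not any(gc.is_tracked(item) for item in fs.items):
        fs.tracked = False


def alloc_then_untrack():
    """Hypothetical tp_alloc variant: untrack right after allocation."""
    fs = ToyFrozenSet()
    maybe_untrack(fs)  # the set is still empty here, so it is always untracked
    return fs


def toy_intersection(a, b):
    """Fill the result after allocation, like a shared set/frozenset method."""
    result = alloc_then_untrack()
    for item in a & b:
        result.items.add(item)
    return result


class Node:
    """Instances of a plain class are hashable and GC-tracked."""


node = Node()
result = toy_intersection({node, 1}, {node, 2})
needs_tracking = any(gc.is_tracked(item) for item in result.items)
print(result.tracked, needs_tracking)
```

In this model the result ends up untracked although it holds a tracked item, which is why a re-tracking step would be needed at the end of such a method.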

sergey-miryanov

Code looks good to me.

sergey-miryanov · 2025-10-24T11:41:19Z

Modules/_testcapimodule.c

+    PyObject *set = NULL, *empty_tuple=NULL, *tracked_object;
+
+
+    tracked_object = PyImport_ImportModule("sys");


Maybe just create an empty list or dict here? Importing a module seems too heavy for testing purposes.

I agree a module seems heavy, but we cannot add a list or dict to a set. A custom class or namedtuple would also do, but they require more code to set up. Any suggestion for a simple-to-create hashable object tracked by the GC is welcome.

Ouch, I forgot about the requirement to be hashable. Thanks!
WDYT about exposing a function for PySet_Add in _testcapi and writing the tests in Python, like I did here: https://github.com/python/cpython/pull/140132/files#diff-70eaebed435342e02ba8f7f5a84e4eebd552438ce6ac2765e80abb5514bdea03R134?

Then you can write a test like:

class Test: pass
fs = pyset_add(Test())

See the reply to Victor below. Just adding pyset_add will not work. We could add a _testcapi.test_pyset_add(tracked_item), but then the test functionality is spread over both the Python and the C side.
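The constraint discussed in this thread (the test item must be both hashable and tracked by the GC) can be checked from Python. A minimal sketch, assuming a throwaway class named Tracked and a helper is_hashable, neither of which is part of the PR:

```python
import gc


def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Tracked:
    """Hypothetical helper class; instances are hashable and GC-tracked."""


item = Tracked()
fs = frozenset([item])

# Lists and dicts are GC-tracked containers but cannot be set members.
print(is_hashable([]), is_hashable({}))
# An instance of a plain class satisfies both requirements.
print(is_hashable(item), gc.is_tracked(item))
# A frozenset holding a tracked item has to stay tracked itself.
print(gc.is_tracked(fs))
```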

vstinner · 2025-10-24T12:51:02Z

Would it be possible to write tests in Python rather than in C?

eendebakpt · 2025-10-24T22:15:30Z

> Would it be possible to write tests in Python rather than in C?

I tried, but it is not easy. We have to expose PySet_Add (frozenset().add does not exist on the Python side). I added pyset_add to the _testcapi module (with pyset_add just calling PySet_Add). But running this on a frozenset from the Python side does not work: when calling _testcapi.pyset_add(frozen_set, item) there are too many references to frozen_set, and PySet_Add fails with an internal error here:

cpython/Objects/setobject.c

Line 2778 in d78d7a5

PyErr_BadInternalCall();

And when calling _testcapi.pyset_add(frozenset(), item) we do not have the frozenset available to test whether tracking has been enabled.

sergey-miryanov · 2025-10-26T08:07:22Z

> And when calling _testcapi.pyset_add(frozenset(), item) we do not have the frozenset available to test whether tracking has been enabled.

IIUC, if you return the first argument from pyset_add then you can test it on the python side.

eendebakpt · 2025-10-26T19:20:57Z

> > And when calling _testcapi.pyset_add(frozenset(), item) we do not have the frozenset available to test whether tracking has been enabled.
>
> IIUC, if you return the first argument from pyset_add then you can test it on the python side.

Ok, I gave it another try. The first attempt failed, but by using the vectorcall convention I can keep the reference count at 1 even when calling from the Python side.

efimov-mikhail · 2025-10-26T20:26:54Z

Objects/setobject.c

+    Py_ssize_t pos = 0;
+    setentry *entry;
+    while (set_next((PySetObject *)op, &pos, &entry)) {
+        if (_PyObject_GC_MAY_BE_TRACKED(entry->key)) {


Maybe we should use the faster _PyType_IS_GC(Py_TYPE(entry->key)), as in maybe_tracked from Objects/tupleobject.c?

Not sure performance matters a lot, but I would prefer to have it consistent with what is used in tupleobject.c. Unless there are objections, I will change the implementation to use maybe_tracked.
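A type-level check of this kind is conservative: an object whose type lacks the GC flag can never be tracked, while an object whose type has the flag may or may not be tracked at a given moment. The sketch below is a Python analogue, not the C implementation. It assumes that Py_TPFLAGS_HAVE_GC has the value 1 << 14, as defined in CPython's object.h, and type_is_gc is a hypothetical helper name.

```python
import gc

# Assumed value of Py_TPFLAGS_HAVE_GC from CPython's object.h.
PY_TPFLAGS_HAVE_GC = 1 << 14


def type_is_gc(obj):
    """Type-level check: can instances of type(obj) take part in GC at all?"""
    return bool(type(obj).__flags__ & PY_TPFLAGS_HAVE_GC)


class Node:
    """Plain class; its instances are GC-tracked."""


# Compare the type-level answer with the per-object tracking state.
for value in (1, "text", 3.5, [], Node()):
    print(type(value).__name__, type_is_gc(value), gc.is_tracked(value))
```

The type-level answer never reports False for an object that is actually tracked, so using it to decide whether a container must stay tracked is safe, if slightly pessimistic.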

vstinner · 2025-10-28T16:42:38Z

Objects/setobject.c

    return make_new_set(type, iterable);
 }

+void


Please add a comment to explain the purpose of this function with a link to the GitHub issue.

vstinner · 2025-10-28T16:46:33Z

Objects/setobject.c

+void
+_PyFrozenSet_MaybeUntrack(PyObject *op)
+{
+    if (op == NULL || !PyFrozenSet_CheckExact(op)) {


Why not untrack frozenset subtypes?

We can make a cycle using a frozenset subtype:

>>> class F(frozenset):
...     pass
...
>>> f = F([1,2,3])
>>> f.cycle = f

So, we need to track them.

Oh ok. In this case, please add a comment explaining that :-)
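The cycle from this thread can be observed from Python. A minimal sketch, assuming a helper named subtype_cycle_demo (not part of the PR): it builds the self-referencing subtype instance, drops the only external reference, and checks that the object is released only once the cyclic collector runs.

```python
import gc
import weakref


class F(frozenset):
    pass


def subtype_cycle_demo():
    """Return (tracked, alive_after_del, freed_by_gc) for a self-cycle."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections out of the measurement
    try:
        f = F([1, 2, 3])
        f.cycle = f  # the instance __dict__ now refers back to f
        tracked = gc.is_tracked(f)
        ref = weakref.ref(f)
        del f
        # Reference counting alone cannot free the object: the cycle holds it.
        alive_after_del = ref() is not None
        gc.collect()
        freed_by_gc = ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return tracked, alive_after_del, freed_by_gc


print(subtype_cycle_demo())
```

If instances of such subtypes were untracked, the collector would never visit them and the cycle would leak.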

vstinner · 2025-10-28T16:50:19Z

Lib/test/test_set.py

+        # Test the PySet_Add c-api for frozenset objects
+        assert _testcapi.pyset_add(frozenset(), 1) == frozenset([1])
+        frozen_set = frozenset()
+        self.assertRaises(SystemError, _testcapi.pyset_add, frozen_set, 1)


I don't understand why the second test fails, whereas the first succeeds.

The second test fails because the argument frozen_set is not uniquely referenced. The error is raised here:

cpython/Objects/setobject.c

Line 2777 in ce4b0ed

(!PyFrozenSet_Check(anyset) || !_PyObject_IsUniquelyReferenced(anyset))) {

I will add a comment to the test.

eendebakpt added 3 commits October 16, 2025 21:36

a3292c2 Do not track frozenset objects with immutables
cd294a6 cleanup
7e28cf2 cleanup

eendebakpt requested a review from rhettinger as a code owner October 16, 2025 21:01

bedevere-app bot mentioned this pull request Oct 16, 2025: Disable tracking of frozenset objects with immutables in the GC #140232 (open)

bedevere-app bot added the awaiting review label Oct 16, 2025

eendebakpt and others added 3 commits October 16, 2025 23:08

30057a5 Merge branch 'main' into frozenset_immutable_tracking
c4deb03 fix test
607237a 📜🤖 Added by blurb_it.

efimov-mikhail reviewed Oct 17, 2025 (Objects/setobject.c, outdated)

2735a71 Update Objects/setobject.c (Co-authored-by: Mikhail Efimov)

sergey-miryanov reviewed Oct 17, 2025 (Lib/test/test_sys.py)

sergey-miryanov approved these changes Oct 17, 2025

bedevere-app bot added awaiting core review and removed awaiting review labels Oct 17, 2025

eendebakpt mentioned this pull request Oct 22, 2025: gh-140476: Optimize PySet_Add() for frozenset in free-threading #140440 (open)

eendebakpt added 2 commits October 24, 2025 12:41

c05db54 make sure PySet_Add tracks frozensets if needed
7f6bc4b Merge branch 'frozenset_immutable_tracking' of github.com:eendebakpt/… …cpython into frozenset_immutable_tracking

sergey-miryanov reviewed Oct 24, 2025 (Modules/_testcapimodule.c)

sergey-miryanov reviewed Oct 24, 2025

eendebakpt added 2 commits October 24, 2025 14:30

0b97604 review comment
948daed Merge branch 'main' into frozenset_immutable_tracking

08e22c3 use _testcapi for testing
62afc76 whitespace
eab653e Merge branch 'main' into frozenset_immutable_tracking

efimov-mikhail reviewed Oct 26, 2025

eendebakpt and others added 3 commits October 26, 2025 21:35

37fc61d Apply suggestions from code review (Co-authored-by: Mikhail Efimov)
4f8bda7 review comment
e9d42b4 Merge branch 'frozenset_immutable_tracking' of github.com:eendebakpt/… …cpython into frozenset_immutable_tracking

vstinner reviewed Oct 28, 2025

eendebakpt and others added 3 commits October 28, 2025 21:57

4b39149 Apply suggestions from code review (Co-authored-by: Victor Stinner)
2859802 review comments
08f43c5 review comments
