Skip to content

Conversation

efimov-mikhail
Copy link
Member

@efimov-mikhail efimov-mikhail commented Oct 16, 2025

@efimov-mikhail efimov-mikhail marked this pull request as draft October 16, 2025 11:09
@efimov-mikhail
Copy link
Member Author

This is just an expirement to improve Sergey's PR to solve this issue #139951

@efimov-mikhail efimov-mikhail changed the title Change in young object counting [gh-139951] Change in young GC object counting w/ no tracking immutable tuples Oct 16, 2025
@efimov-mikhail efimov-mikhail changed the title [gh-139951] Change in young GC object counting w/ no tracking immutable tuples gh-139951: Change in young GC object counting w/ no tracking immutable tuples Oct 16, 2025
@efimov-mikhail efimov-mikhail marked this pull request as ready for review October 16, 2025 12:21
@efimov-mikhail
Copy link
Member Author

cc @sergey-miryanov @markshannon

@efimov-mikhail efimov-mikhail changed the title gh-139951: Change in young GC object counting w/ no tracking immutable tuples gh-139951: Count only tracked objects in GC, don't track immutable tuples Oct 16, 2025
@efimov-mikhail efimov-mikhail changed the title gh-139951: Count only tracked objects in GC, don't track immutable tuples gh-139951: Do not track immutable tuples in GC Oct 16, 2025
@efimov-mikhail
Copy link
Member Author

Is it correct that pack(1, None) crashes for now in test_tuple_pack?

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the PR looks complete now. It's maybe time to run benchmark to measure the performance overhead of this change!

@vstinner
Copy link
Member

Is it correct that pack(1, None) crashes for now in test_tuple_pack?

That's because test C _testlimitedcapi function replaces Python None with C NULL, and PyTuple_Pack() arguments must not be NULL.

@eendebakpt
Copy link
Contributor

The approach here is to check for tuples of immutables at creation time. An alternative could be to do this inside the garbage collector (and untrack the tuple if needed). Was that considered?

The issue solved here is about a use case where many tuples are created in a tight loop. But there could potentially also be tuples of immutables in a normal workflow that visited many times by the GC.

One datapoint: after starting up my "normal" work and running this script

import gc
gc.collect()

def candidate(obj):
   return all( not gc.is_tracked(x) for x in obj )

for immutable_type in (tuple, frozenset):

    number_of_objects_tracked  =0 
    number_of_candidates=0
    number_of_immutable_candidates = 0
    
    for obj in gc.get_objects():
        number_of_objects_tracked += 1
        if type(obj) is immutable_type:
            number_of_candidates +=1        
            if  candidate(obj):
                number_of_immutable_candidates+=1
    
    print(f'type {immutable_type}')
    print(f'  {number_of_objects_tracked=}')
    print(f'  {number_of_candidates=}')
    print(f'  {number_of_immutable_candidates=}')

I get as output:

type <class 'tuple'>
  number_of_objects_tracked=703468
  number_of_candidates=82177
  number_of_immutable_candidates=582
type <class 'frozenset'>
  number_of_objects_tracked=703478
  number_of_candidates=4799
  number_of_immutable_candidates=4787

So for me the amount of tuples would not be reduced a lot, but it would be interesting to apply the same approach to the frozenset. Could someone run the above script on their "typical" workflow or application?

@sergey-miryanov
Copy link
Contributor

An alternative could be to do this inside the garbage collector (and untrack the tuple if needed). Was that considered?

This is already doing in the GC (see untrack_tuples in gc.c). Also, you may be interested in #139389

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants