Do not track immutable tuples in PyTuple_Pack #139389

@sergey-miryanov

Feature or enhancement

Proposal:

When we call PyTuple_Pack, all of the items are already fully constructed. If we know that they are immutable, we can skip tracking the new tuple in the GC, because the GC would untrack it eventually anyway.
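The "untrack eventually" behaviour can be observed from Python with `gc.is_tracked`. The snippet below is only an illustrative sketch, not part of the PR: the helper name `tracking_before_and_after_collect` is made up for this example, and it builds tuples with `tuple(...)` rather than through `PyTuple_Pack`, so it shows the existing GC behaviour and not the proposed change. The "before" value may differ between CPython versions and builds.

```python
import gc


def tracking_before_and_after_collect(items):
    """Build a tuple at runtime and report whether the GC tracks it
    before and after a full collection (hypothetical helper)."""
    t = tuple(items)  # built at runtime, so it is not a compile-time constant
    before = gc.is_tracked(t)
    gc.collect()  # a full collection gives the GC a chance to untrack the tuple
    after = gc.is_tracked(t)
    return before, after


if __name__ == "__main__":
    # Items that can never take part in a reference cycle.
    print("atomic items:   ", tracking_before_and_after_collect([1, 2.0, "three"]))
    # A list is itself GC-tracked, so the tuple has to stay tracked.
    print("contains a list:", tracking_before_and_after_collect([1, []]))
```

On CPython, the tuple of atomic items should be reported as untracked after the collection, while the tuple that holds a list stays tracked. The proposal is about reaching that untracked state at creation time instead of waiting for a collection.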

I have a PR ready and benchmark results:

Geometric mean: 1.01x faster (Win11 x64, 11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz, 48d0d0d)

All benchmarks:

+--------------------------+----------+------------------------+
| Benchmark                | main     | tuples                 |
+==========================+==========+========================+
| async_generators         | 435 ms   | 430 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| asyncio_tcp              | 750 ms   | 756 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| asyncio_tcp_ssl          | 1.91 sec | 1.92 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| comprehensions           | 22.1 us  | 21.8 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| bench_mp_pool            | 104 ms   | 103 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| bench_thread_pool        | 1.29 ms  | 1.27 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| coroutines               | 28.2 ms  | 27.7 ms: 1.02x faster  |
+--------------------------+----------+------------------------+
| coverage                 | 88.5 ms  | 86.3 ms: 1.02x faster  |
+--------------------------+----------+------------------------+
| crypto_pyaes             | 90.1 ms  | 86.7 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| deepcopy                 | 310 us   | 307 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| deepcopy_memo            | 36.4 us  | 36.1 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| deltablue                | 5.19 ms  | 4.85 ms: 1.07x faster  |
+--------------------------+----------+------------------------+
| django_template          | 45.5 ms  | 45.8 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| docutils                 | 2.47 sec | 2.45 sec: 1.01x faster |
+--------------------------+----------+------------------------+
| dulwich_log              | 86.2 ms  | 86.9 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| fannkuch                 | 449 ms   | 441 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| float                    | 85.3 ms  | 82.5 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| create_gc_cycles         | 1.17 ms  | 1.17 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| gc_traversal             | 2.97 ms  | 2.88 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| generators               | 43.0 ms  | 41.6 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| genshi_text              | 28.9 ms  | 28.7 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| go                       | 160 ms   | 153 ms: 1.04x faster   |
+--------------------------+----------+------------------------+
| hexiom                   | 8.39 ms  | 8.13 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| json_dumps               | 8.62 ms  | 8.69 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| logging_format           | 12.5 us  | 12.2 us: 1.02x faster  |
+--------------------------+----------+------------------------+
| logging_silent           | 139 ns   | 140 ns: 1.01x slower   |
+--------------------------+----------+------------------------+
| logging_simple           | 11.3 us  | 11.1 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| mako                     | 14.2 ms  | 14.4 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| mdp                      | 1.47 sec | 1.50 sec: 1.02x slower |
+--------------------------+----------+------------------------+
| meteor_contest           | 104 ms   | 102 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| nbody                    | 114 ms   | 113 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| pickle_pure_python       | 439 us   | 436 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| pprint_safe_repr         | 953 ms   | 916 ms: 1.04x faster   |
+--------------------------+----------+------------------------+
| pprint_pformat           | 1.95 sec | 1.88 sec: 1.04x faster |
+--------------------------+----------+------------------------+
| pyflate                  | 506 ms   | 492 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| python_startup           | 28.5 ms  | 27.4 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| python_startup_no_site   | 23.2 ms  | 22.2 ms: 1.05x faster  |
+--------------------------+----------+------------------------+
| raytrace                 | 361 ms   | 345 ms: 1.05x faster   |
+--------------------------+----------+------------------------+
| regex_compile            | 146 ms   | 146 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| regex_effbot             | 2.03 ms  | 2.02 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| regex_v8                 | 23.9 ms  | 22.7 ms: 1.06x faster  |
+--------------------------+----------+------------------------+
| richards                 | 66.1 ms  | 59.9 ms: 1.10x faster  |
+--------------------------+----------+------------------------+
| richards_super           | 71.6 ms  | 68.7 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| scimark_fft              | 300 ms   | 294 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| scimark_lu               | 135 ms   | 131 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| scimark_monte_carlo      | 83.3 ms  | 82.4 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| scimark_sor              | 157 ms   | 150 ms: 1.05x faster   |
+--------------------------+----------+------------------------+
| scimark_sparse_mat_mult  | 4.27 ms  | 4.35 ms: 1.02x slower  |
+--------------------------+----------+------------------------+
| spectral_norm            | 122 ms   | 118 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| sqlglot_optimize         | 60.7 ms  | 60.9 ms: 1.00x slower  |
+--------------------------+----------+------------------------+
| sympy_expand             | 501 ms   | 503 ms: 1.00x slower   |
+--------------------------+----------+------------------------+
| sympy_sum                | 143 ms   | 144 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| sympy_str                | 287 ms   | 292 ms: 1.02x slower   |
+--------------------------+----------+------------------------+
| telco                    | 7.26 ms  | 7.33 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| tomli_loads              | 2.23 sec | 2.25 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| typing_runtime_protocols | 189 us   | 185 us: 1.02x faster   |
+--------------------------+----------+------------------------+
| unpack_sequence          | 65.4 ns  | 68.7 ns: 1.05x slower  |
+--------------------------+----------+------------------------+
| unpickle                 | 13.9 us  | 14.1 us: 1.01x slower  |
+--------------------------+----------+------------------------+
| unpickle_pure_python     | 303 us   | 300 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| xml_etree_parse          | 130 ms   | 130 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| xml_etree_iterparse      | 107 ms   | 108 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| xml_etree_process        | 79.2 ms  | 78.6 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| Geometric mean           | (ref)    | 1.01x faster           |
+--------------------------+----------+------------------------+

Benchmark hidden because not significant (20): 2to3, chaos, deepcopy_reduce, genshi_xml, html5lib, json_loads, nqueens, pathlib, pickle, pickle_dict, pickle_list, pidigits, regex_dna, sqlglot_normalize, sqlglot_parse, sqlglot_transpile, sqlite_synth, sympy_integrate, unpickle_list, xml_etree_generate

It doesn't hurt performance, and it can decrease the number of objects that the GC has to check and untrack.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)