Open
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
Description
Feature or enhancement
Proposal:
When we call PyTuple_Pack, all of the objects passed in are already fully constructed. If we know that they are all immutable, we can skip tracking the resulting tuple in the GC, because the GC would untrack it eventually anyway.
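The "untrack eventually" behaviour can be observed from Python with `gc.is_tracked`. The following is only an illustrative sketch, not part of the PR: the helper name `tracking_after_collect` and the sample tuples are made up for this example, and whether a tuple is tracked *before* a collection depends on the CPython version, so that line is printed without a stated result.

```python
import gc

def tracking_after_collect(obj):
    """Run a full collection, then report whether the GC still tracks obj."""
    gc.collect()
    return gc.is_tracked(obj)

# Built at run time so the tuples are not compile-time constants.
atoms_only = tuple([1, 2.0, "three"])  # items are never GC-tracked themselves
holds_list = tuple([1, []])            # the list is a GC-tracked container

# State right after creation: depends on the CPython version in use.
print("atoms only, before collect:", gc.is_tracked(atoms_only))

# After a collection the tuple of atomic items is no longer tracked,
# while the tuple that references a list has to stay tracked.
print("atoms only, after collect:", tracking_after_collect(atoms_only))
print("holds a list, after collect:", tracking_after_collect(holds_list))
```

The proposal amounts to reaching the "after collect" state for the first kind of tuple at creation time, so the GC never has to visit it.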
I have a PR ready and benchmark results:
Geometric mean: 1.01x faster (Win11 x64, 11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz, 48d0d0d)
All benchmarks:
+--------------------------+----------+------------------------+
| Benchmark | main | tuples |
+==========================+==========+========================+
| async_generators | 435 ms | 430 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| asyncio_tcp | 750 ms | 756 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| asyncio_tcp_ssl | 1.91 sec | 1.92 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| comprehensions | 22.1 us | 21.8 us: 1.01x faster |
+--------------------------+----------+------------------------+
| bench_mp_pool | 104 ms | 103 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| bench_thread_pool | 1.29 ms | 1.27 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| coroutines | 28.2 ms | 27.7 ms: 1.02x faster |
+--------------------------+----------+------------------------+
| coverage | 88.5 ms | 86.3 ms: 1.02x faster |
+--------------------------+----------+------------------------+
| crypto_pyaes | 90.1 ms | 86.7 ms: 1.04x faster |
+--------------------------+----------+------------------------+
| deepcopy | 310 us | 307 us: 1.01x faster |
+--------------------------+----------+------------------------+
| deepcopy_memo | 36.4 us | 36.1 us: 1.01x faster |
+--------------------------+----------+------------------------+
| deltablue | 5.19 ms | 4.85 ms: 1.07x faster |
+--------------------------+----------+------------------------+
| django_template | 45.5 ms | 45.8 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| docutils | 2.47 sec | 2.45 sec: 1.01x faster |
+--------------------------+----------+------------------------+
| dulwich_log | 86.2 ms | 86.9 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| fannkuch | 449 ms | 441 ms: 1.02x faster |
+--------------------------+----------+------------------------+
| float | 85.3 ms | 82.5 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| create_gc_cycles | 1.17 ms | 1.17 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| gc_traversal | 2.97 ms | 2.88 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| generators | 43.0 ms | 41.6 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| genshi_text | 28.9 ms | 28.7 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| go | 160 ms | 153 ms: 1.04x faster |
+--------------------------+----------+------------------------+
| hexiom | 8.39 ms | 8.13 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| json_dumps | 8.62 ms | 8.69 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| logging_format | 12.5 us | 12.2 us: 1.02x faster |
+--------------------------+----------+------------------------+
| logging_silent | 139 ns | 140 ns: 1.01x slower |
+--------------------------+----------+------------------------+
| logging_simple | 11.3 us | 11.1 us: 1.01x faster |
+--------------------------+----------+------------------------+
| mako | 14.2 ms | 14.4 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| mdp | 1.47 sec | 1.50 sec: 1.02x slower |
+--------------------------+----------+------------------------+
| meteor_contest | 104 ms | 102 ms: 1.02x faster |
+--------------------------+----------+------------------------+
| nbody | 114 ms | 113 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| pickle_pure_python | 439 us | 436 us: 1.01x faster |
+--------------------------+----------+------------------------+
| pprint_safe_repr | 953 ms | 916 ms: 1.04x faster |
+--------------------------+----------+------------------------+
| pprint_pformat | 1.95 sec | 1.88 sec: 1.04x faster |
+--------------------------+----------+------------------------+
| pyflate | 506 ms | 492 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| python_startup | 28.5 ms | 27.4 ms: 1.04x faster |
+--------------------------+----------+------------------------+
| python_startup_no_site | 23.2 ms | 22.2 ms: 1.05x faster |
+--------------------------+----------+------------------------+
| raytrace | 361 ms | 345 ms: 1.05x faster |
+--------------------------+----------+------------------------+
| regex_compile | 146 ms | 146 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| regex_effbot | 2.03 ms | 2.02 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| regex_v8 | 23.9 ms | 22.7 ms: 1.06x faster |
+--------------------------+----------+------------------------+
| richards | 66.1 ms | 59.9 ms: 1.10x faster |
+--------------------------+----------+------------------------+
| richards_super | 71.6 ms | 68.7 ms: 1.04x faster |
+--------------------------+----------+------------------------+
| scimark_fft | 300 ms | 294 ms: 1.02x faster |
+--------------------------+----------+------------------------+
| scimark_lu | 135 ms | 131 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| scimark_monte_carlo | 83.3 ms | 82.4 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| scimark_sor | 157 ms | 150 ms: 1.05x faster |
+--------------------------+----------+------------------------+
| scimark_sparse_mat_mult | 4.27 ms | 4.35 ms: 1.02x slower |
+--------------------------+----------+------------------------+
| spectral_norm | 122 ms | 118 ms: 1.03x faster |
+--------------------------+----------+------------------------+
| sqlglot_optimize | 60.7 ms | 60.9 ms: 1.00x slower |
+--------------------------+----------+------------------------+
| sympy_expand | 501 ms | 503 ms: 1.00x slower |
+--------------------------+----------+------------------------+
| sympy_sum | 143 ms | 144 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| sympy_str | 287 ms | 292 ms: 1.02x slower |
+--------------------------+----------+------------------------+
| telco | 7.26 ms | 7.33 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| tomli_loads | 2.23 sec | 2.25 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| typing_runtime_protocols | 189 us | 185 us: 1.02x faster |
+--------------------------+----------+------------------------+
| unpack_sequence | 65.4 ns | 68.7 ns: 1.05x slower |
+--------------------------+----------+------------------------+
| unpickle | 13.9 us | 14.1 us: 1.01x slower |
+--------------------------+----------+------------------------+
| unpickle_pure_python | 303 us | 300 us: 1.01x faster |
+--------------------------+----------+------------------------+
| xml_etree_parse | 130 ms | 130 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| xml_etree_iterparse | 107 ms | 108 ms: 1.01x slower |
+--------------------------+----------+------------------------+
| xml_etree_process | 79.2 ms | 78.6 ms: 1.01x faster |
+--------------------------+----------+------------------------+
| Geometric mean | (ref) | 1.01x faster |
+--------------------------+----------+------------------------+
Benchmark hidden because not significant (20): 2to3, chaos, deepcopy_reduce, genshi_xml, html5lib, json_loads, nqueens, pathlib, pickle, pickle_dict, pickle_list, pidigits, regex_dna, sqlglot_normalize, sqlglot_parse, sqlglot_transpile, sqlite_synth, sympy_integrate, unpickle_list, xml_etree_generate
It doesn't hurt performance, and it can reduce the number of objects the GC has to check and untrack.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs