Intern dataclass field names to improve performance #112653

@XuehaiPan

Feature or enhancement

Proposal:

Interning strings can speed up dictionary lookups. From the documentation for sys.intern:

sys.intern(string)

Enter string in the table of “interned” strings and return the interned string – which is string itself or a copy. Interning strings is useful to gain a little performance on dictionary lookup – if the keys in a dictionary are interned, and the lookup key is interned, the key comparisons (after hashing) can be done by a pointer compare instead of a string compare. Normally, the names used in Python programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys.
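As a small illustration of the behaviour quoted above, the sketch below builds two equal strings at runtime and shows that `sys.intern` maps both to one shared object, which is what lets a dictionary lookup settle key equality with a pointer comparison. The variable names are illustrative only.

```python
import sys

# Build two equal strings at runtime so they are not merged as compile-time constants.
parts = ["field", "name"]
a = "_".join(parts)
b = "_".join(parts)

# The strings compare equal by value regardless of interning.
assert a == b

# sys.intern returns the single canonical object for that string value,
# so both calls yield the very same object.
ia = sys.intern(a)
ib = sys.intern(b)
assert ia is ib
```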

We already intern the typename and field names for namedtuple.

typename = _sys.intern(str(typename))

field_names = tuple(map(_sys.intern, field_names))

We can make a similar improvement for dataclasses by interning their field names.
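A minimal sketch of the idea, written as a user-level wrapper rather than a change to the `dataclasses` module itself. `make_dataclass_interned` is a hypothetical helper name, not an existing API. It only mirrors the namedtuple approach by interning the names before passing them to `dataclasses.make_dataclass`.

```python
import sys
from dataclasses import fields, make_dataclass


def make_dataclass_interned(cls_name, field_names):
    """Hypothetical helper: intern the class and field names, then build the dataclass."""
    cls_name = sys.intern(str(cls_name))
    field_names = [sys.intern(str(name)) for name in field_names]
    return make_dataclass(cls_name, field_names)


# Field names assembled at runtime, as they might be when read from a config or schema.
names = ["".join(["wid", "th"]), "".join(["hei", "ght"])]
Box = make_dataclass_interned("Box", names)

box = Box(3, 4)
print(box)
print([f.name for f in fields(Box)])
```

Names written directly in a class body are normally interned by the compiler already, so a change like this would mostly matter for names supplied as runtime strings, for example through `make_dataclass`.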

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Labels: performance, stdlib, topic-dataclasses, type-feature