
Unboxed Internal list/tuple/dict/containers representation for use within CPython #729

@Fidget-Spinner

So it seems it's not possible to make ob_item in lists and tuples use _PyHeapRef, or to store an untagged int directly, without breaking a whole lot of extensions (see capi-workgroup/decisions#64).

A possible alternative that maintains compatibility is to use an internal representation of a list, perhaps called _PyInternalListObject. This would be exactly the same as PyListObject, except it uses a _PyHeapRef * for ob_item.
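To make the "exactly the same except for ob_item" idea concrete, here is a minimal sketch using toy structs. Everything prefixed `Toy` is a hypothetical stand-in invented for illustration (including `ToyHeapRef` for `_PyHeapRef`, assumed here to be one pointer-sized word); it is not CPython's actual definition. The compile-time checks show the kind of layout guarantees such a pairing would rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the variable-size object header (not CPython's). */
typedef struct {
    intptr_t refcnt;
    void *type;
    intptr_t size;
} ToyVarHead;

/* Hypothetical stand-in for _PyHeapRef: assumed to be one tagged word. */
typedef struct {
    uintptr_t bits;
} ToyHeapRef;

/* Public-style list: slots are plain object pointers. */
typedef struct {
    ToyVarHead head;
    void **ob_item;
    intptr_t allocated;
} ToyListObject;

/* Internal-style list: same fields, but slots are tagged words. */
typedef struct {
    ToyVarHead head;
    ToyHeapRef *ob_item;
    intptr_t allocated;
} ToyInternalListObject;

/* Layout checks: fail at compile time if the two ever diverge. */
_Static_assert(sizeof(ToyHeapRef) == sizeof(void *),
               "a tagged slot must be exactly one pointer wide");
_Static_assert(sizeof(ToyListObject) == sizeof(ToyInternalListObject),
               "both list structs must have the same size");
_Static_assert(offsetof(ToyListObject, ob_item) ==
               offsetof(ToyInternalListObject, ob_item),
               "ob_item must sit at the same offset");
_Static_assert(offsetof(ToyListObject, allocated) ==
               offsetof(ToyInternalListObject, allocated),
               "allocated must sit at the same offset");
```

With checks like these in place, the same block of memory could be viewed through either struct, which is what an in-place conversion needs.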

When passing the object to external C code, we first convert it to a normal PyListObject by swapping the unboxed ints in its ob_item for boxed ones. This works because _PyInternalListObject has a memory layout compatible with PyListObject. The conversion is done in place and only once per object, so it should be cheap.
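The in-place, at-most-once conversion described above can be sketched with a toy model. All names here are hypothetical, and the encoding (low bit set means "unboxed int") is an assumption chosen only to keep the example small; it is not the real _PyHeapRef encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy boxed int; NOT CPython's PyLongObject. */
typedef struct {
    intptr_t value;
} ToyBoxedInt;

/* One slot is one machine word: a pointer to a boxed object, or an
   unboxed int marked by the low bit (hypothetical encoding). */
typedef uintptr_t ToySlot;
#define TOY_INT_TAG ((uintptr_t)1)

int toy_slot_is_unboxed(ToySlot s) { return (s & TOY_INT_TAG) != 0; }

ToySlot toy_slot_from_int(intptr_t v) {
    return ((uintptr_t)v << 1) | TOY_INT_TAG;
}

/* Assumes arithmetic right shift for negative values, as on
   mainstream compilers. */
intptr_t toy_slot_to_int(ToySlot s) { return (intptr_t)s >> 1; }

typedef struct {
    size_t size;
    ToySlot *items;
    int is_internal; /* 1 while items may still hold unboxed ints */
} ToyList;

/* Box every unboxed int in place. Returns 0 on success, -1 if an
   allocation fails (the list then stays flagged as internal, and each
   slot is still either a valid boxed pointer or a valid unboxed int). */
int toy_list_make_external(ToyList *list) {
    if (!list->is_internal) {
        return 0; /* already converted: the work happens at most once */
    }
    for (size_t i = 0; i < list->size; i++) {
        ToySlot s = list->items[i];
        if (toy_slot_is_unboxed(s)) {
            ToyBoxedInt *boxed = malloc(sizeof *boxed);
            if (boxed == NULL) {
                return -1;
            }
            boxed->value = toy_slot_to_int(s);
            /* malloc results are suitably aligned, so the low bit is 0. */
            list->items[i] = (ToySlot)(uintptr_t)boxed;
        }
    }
    list->is_internal = 0;
    return 0;
}
```

Calling `toy_list_make_external` a second time on the same list returns immediately, which mirrors the "only once" property; slots that already held boxed objects are left untouched.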

We can minimize the number of conversions required by relying on the specializing interpreter to tell us when we're calling out to unknown C functions.

To keep stackref conversion quick, we should reserve one tag value meaning "internal container". It seems 10 is still unused right now, and it's perfect for that: x1 means "skip the refcount", which we don't want here (we do want to refcount internal containers!). So the 10 tag can now mean "internal container" (a list, tuple, or dict). After conversion, the tag just becomes 00.
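The tag scheme above can be sketched as follows. This is a toy model of the two low bits only, with hypothetical names throughout; it assumes pointers are at least 4-byte aligned so both bits are free, and it is not CPython's actual stackref code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TAG_MASK      ((uintptr_t)3)
#define TOY_TAG_NORMAL    ((uintptr_t)0) /* 00: ordinary refcounted object */
#define TOY_TAG_NO_REFCNT ((uintptr_t)1) /* x1: skip the refcount */
#define TOY_TAG_INTERNAL  ((uintptr_t)2) /* 10: internal container */

typedef struct {
    uintptr_t bits;
} ToyStackRef;

/* Combine an aligned pointer with a 2-bit tag. */
ToyStackRef toy_ref_make(void *p, uintptr_t tag) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TOY_TAG_MASK) == 0); /* pointer must leave both bits free */
    assert(tag <= TOY_TAG_MASK);
    ToyStackRef r = { bits | tag };
    return r;
}

/* x1: any tag with the low bit set skips refcounting. */
int toy_ref_skips_refcount(ToyStackRef r) {
    return (r.bits & 1) != 0;
}

/* Exactly 10: an internal container, which is still refcounted. */
int toy_ref_is_internal_container(ToyStackRef r) {
    return (r.bits & TOY_TAG_MASK) == TOY_TAG_INTERNAL;
}

void *toy_ref_pointer(ToyStackRef r) {
    return (void *)(r.bits & ~TOY_TAG_MASK);
}

/* After the container has been converted, its tag becomes 00.
   References with any other tag are returned unchanged. */
ToyStackRef toy_ref_after_conversion(ToyStackRef r) {
    if (toy_ref_is_internal_container(r)) {
        r.bits &= ~TOY_TAG_MASK;
    }
    return r;
}
```

Because the check is a single mask-and-compare, asking "does this reference need conversion before leaving the interpreter?" costs no memory access, which is the point of reserving a tag for it.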
