Is there any interest in adding UUIDs? #45

@stevesimmons

I was looking for a compact and efficient way to store UUIDs in Pandas DataFrames. The easy way is as object columns of uuid.UUID instances (56 bytes each). Since a UUID can be represented in 128 bits (16 bytes), it would be nice for a column to be backed by a contiguous array of 16-byte values.
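
To make the storage idea concrete, here is a minimal sketch of packing UUIDs into a contiguous NumPy array of 16-byte records. It is not cyberpandas code; the helper names `pack_uuids` and `unpack_uuids` are hypothetical, and the choice of the raw `V16` dtype is an assumption made for illustration.

```python
import uuid

import numpy as np


def pack_uuids(values):
    """Pack an iterable of uuid.UUID into a contiguous array of 16-byte records."""
    raw = b"".join(u.bytes for u in values)
    # "V16" (raw 16-byte void) is used instead of "S16" because NumPy's "S"
    # dtype drops trailing NUL bytes when an item is read back, which would
    # corrupt any UUID whose last byte is 0x00.
    return np.frombuffer(raw, dtype="V16").copy()


def unpack_uuids(arr):
    """Convert a packed V16 array back into a list of uuid.UUID objects."""
    return [uuid.UUID(bytes=item.tobytes()) for item in arr]


ids = [uuid.uuid4() for _ in range(1000)]
packed = pack_uuids(ids)
restored = unpack_uuids(packed)
```

With this layout the payload is exactly 16 bytes per UUID (`packed.nbytes` is 16 times the number of rows), and the round trip back to `uuid.UUID` objects is lossless.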

Since the cyberpandas IPv6 extension array already stores 128-bit IP addresses, I was thinking of reusing the work done here for IPv6 to support UUIDs.

A potential future step would be an extension type that supports any numpy "Sn" fixed-width field, with efficient implementations of the low-level Pandas array operations, plus a mechanism to easily register high-level representations and accessor methods (e.g. IPv6, UUID, and so forth).
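
As a rough illustration of that registration idea, the sketch below keeps one generic fixed-width byte store and a small table of per-type converters. Everything here is hypothetical (the `CODECS` table and the `encode`/`decode` helpers are made up for this example), and it does not implement the Pandas extension array interface; it only shows how IPv6 and UUID could share the same 16-byte backing layout.

```python
import ipaddress
import uuid

import numpy as np

# Hypothetical registry: kind -> (width in bytes, object -> bytes, bytes -> object)
CODECS = {
    "uuid": (16, lambda u: u.bytes, lambda b: uuid.UUID(bytes=b)),
    "ipv6": (16, lambda a: a.packed, lambda b: ipaddress.IPv6Address(b)),
}


def encode(kind, values):
    """Store values of a registered kind as an (n, width) uint8 array."""
    width, to_bytes, _ = CODECS[kind]
    raw = b"".join(to_bytes(v) for v in values)
    return np.frombuffer(raw, dtype=np.uint8).reshape(-1, width).copy()


def decode(kind, arr):
    """Rebuild the high-level objects from an (n, width) uint8 array."""
    _, _, from_bytes = CODECS[kind]
    return [from_bytes(row.tobytes()) for row in arr]


uuids = [uuid.UUID(int=1), uuid.UUID(int=2)]
addrs = [ipaddress.IPv6Address("::1"), ipaddress.IPv6Address("2001:db8::1")]
uuid_store = encode("uuid", uuids)
addr_store = encode("ipv6", addrs)
```

Both stores have the same shape and dtype, so low-level operations such as take, concatenate, or equality could be written once against the byte array, with only the converters differing per type.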

Tom, could you say how you see this project evolving? Is it essentially "done" as it is today, with IPv4 and IPv6? Or do you see it as a place where similar extension arrays can be added, as semi-standard additions to the Pandas ecosystem?

Thanks
Stephen
