Deactivating extension types or making ObjectId mapped type polars-compliant? #236

@Bonnevie

I have a use case where I want to extract data as an Arrow table, save it as Parquet, and later load it with Polars.
My problem is that I cannot figure out how to avoid extension types, in particular for ObjectId. Currently the Parquet file is written with the extension type, which Polars cannot read. PyMongoArrow functions like find_polars_all somehow manage this cast, but can the same be achieved with find_arrow_all?

Is it possible to specify a data type for ObjectId fields in the PyMongoArrow schema so that they are recorded in the Arrow table and the Parquet file as a binary type that Polars can read out of the box?
