I have a use-case where I want to extract data as an arrow table, save as parquet, and then later load it with polars.
My problem is that I cannot figure out how to avoid extension types, in particular for ObjectId. Currently the parquet file gets stored with the extension type, which polars cannot read. Pymongoarrow functions like find_polars_all somehow manage this casting, but can the same be achieved with find_arrow_all?
Is it possible to specify a datatype for ObjectId fields in the pymongoarrow schema, so that it gets recorded in the arrow table and parquet file as a binary data type that polars can read out of the box?