Feature Request: Support Arrow PyCapsule Interface #858

@henryharbeck

Describe your feature request

The Arrow project defines the Arrow PyCapsule Interface, a protocol for sharing Arrow data between Python libraries. It enables safe Arrow data interchange without requiring PyArrow. Any library implementing the protocol can exchange data via PyCapsules (safe wrappers around C pointers), and the data producer and consumer need no prior knowledge of each other.

PyArrow is a very large dependency, and it is a shame to have to install it just to return a Polars DataFrame. Supporting the Arrow PyCapsule Interface would avoid this and would also let connector-x interoperate with any library that supports the interface, such as DuckDB, pandas, ibis, PyArrow, Polars, and many more.

Connector-x could add a new return_type to connectorx.read_sql whose result implements __arrow_c_stream__, exporting a C stream of Arrow data to Python. Other libraries could then construct their own objects directly from this result, with no dependencies beyond connector-x and the consuming library itself. The existing Polars and PyArrow return types could also take advantage of this (e.g., dropping the PyArrow requirement for Polars).
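
To make the shape of the proposal concrete, here is a minimal sketch of what such a return object might look like. The names ArrowStreamResult and supports_arrow_stream are hypothetical and not part of connector-x. A real implementation would build the PyCapsule in the Rust/C layer, whereas this sketch only forwards to some other object that already implements the protocol.

```python
from typing import Any, Optional


def supports_arrow_stream(obj: Any) -> bool:
    """Return True if obj exposes the Arrow PyCapsule stream method."""
    return callable(getattr(obj, "__arrow_c_stream__", None))


class ArrowStreamResult:
    """Hypothetical read_sql result (not a real connector-x class).

    It wraps any producer that implements __arrow_c_stream__ and
    re-exports the same stream, so consumers never need to know
    which library produced the data.
    """

    def __init__(self, producer: Any) -> None:
        if not supports_arrow_stream(producer):
            raise TypeError("producer must implement __arrow_c_stream__")
        self._producer = producer

    def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object:
        # Per the Arrow PyCapsule Interface, this should return a PyCapsule
        # named "arrow_array_stream". Here we simply delegate to the
        # wrapped producer instead of creating the capsule ourselves.
        return self._producer.__arrow_c_stream__(requested_schema)
```

A consumer that supports the interface should then be able to ingest the result directly. For example, recent versions of Polars and PyArrow are expected to accept such objects via polars.DataFrame(result) or pyarrow.table(result), though the exact minimum versions should be checked against each library's documentation.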

Thanks in advance for your consideration
