Description
Describe your feature request
The Arrow project created the Arrow PyCapsule Interface, a protocol for sharing Arrow data in Python. It enables safe Arrow data interchange without requiring PyArrow. Any library implementing this protocol can exchange data via PyCapsules (safe wrappers around C pointers), so the data producer and consumer need no prior knowledge of each other.
PyArrow is an extremely large dependency, and it is a shame to have to install it just to return a Polars DataFrame. Supporting the Arrow PyCapsule Interface would avoid this, and would also make the results usable by any library that supports the interface, such as DuckDB, pandas, ibis, PyArrow, Polars and many more.
ConnectorX could add a return_type to connectorx.read_sql whose result object implements __arrow_c_stream__ to export a C stream of Arrow data to Python. Other libraries could then instantiate their own objects natively from this result, with no dependencies beyond ConnectorX and themselves. The existing Polars and PyArrow return types could also take advantage of this (e.g., dropping the PyArrow requirement for Polars).
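To make the request concrete, here is a minimal sketch of what such a result object might look like on the Python side. The names `ArrowStreamResult` and `SupportsArrowCStream` are hypothetical and not part of the current connectorx API. The sketch simply forwards to an inner producer; in a real implementation the capsule (a PyCapsule named "arrow_array_stream") would presumably be created by the Rust side.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsArrowCStream(Protocol):
    """Structural type for objects exposing the Arrow PyCapsule stream method."""

    def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object:
        ...


class ArrowStreamResult:
    """Hypothetical return type for read_sql (illustrative name only).

    It wraps any producer that implements __arrow_c_stream__ and re-exports
    the stream, so consumers never need to know which library produced it.
    """

    def __init__(self, producer: SupportsArrowCStream) -> None:
        # Fail early if the wrapped object does not expose the protocol method.
        if not isinstance(producer, SupportsArrowCStream):
            raise TypeError("producer must implement __arrow_c_stream__")
        self._producer = producer

    def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object:
        # Forward the optional schema request unchanged; per the interface,
        # the producer returns a PyCapsule wrapping an ArrowArrayStream.
        return self._producer.__arrow_c_stream__(requested_schema)
```

With something like this in place, a consumer that supports the interface should be able to build its own object directly from the result. For example, recent PyArrow versions accept such objects in `pyarrow.table(result)`, with no PyArrow import needed inside ConnectorX itself.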
Thanks in advance for your consideration