PDEP-15: Reject PDEP-10 #58623
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,48 @@ | ||
# PDEP-10: PyArrow as a required dependency for default string inference implementation | ||
|
||
- Created: 17 April 2023 | ||
- Status: Accepted | ||
- Created: 17 April 2023 (updated May 8, 2024) | ||
- Status: Rejected | ||
- Discussion: [#52711](https://github.com/pandas-dev/pandas/pull/52711) | ||
[#52509](https://github.com/pandas-dev/pandas/issues/52509) | ||
- Author: [Matthew Roeschke](https://github.com/mroeschke) | ||
[Patrick Hoefler](https://github.com/phofl) | ||
- Revision: 1 | ||
- Revision: 2 | ||
|
||
# Note | ||
|
||
This PDEP was originally accepted on May 8, 2023. However, after reviewing feedback posted | ||
on the feedback issue [#54466](https://github.com/pandas-dev/pandas/issues/54466), we, the members of | ||
the core team, have decided against moving forward with this PDEP for pandas 3.0. | ||
|
||
|
||
The primary reasons for rejecting this PDEP are twofold: | ||
|
||
1) Requiring pyarrow as a dependency causes installation problems. | ||
- Pyarrow is difficult or impossible to fit into space-constrained environments | ||
such as AWS Lambda and WASM, due to its large size of around 40 MB for a compiled wheel | ||
(which is larger than pandas' own wheel sizes) | ||
- Installation of pyarrow is not possible on some platforms. We provide support for some | ||
less widely used platforms such as Alpine Linux (and there is third party support for pandas in | ||
pyodide, a WASM distribution of Python), for neither of which pyarrow provides wheels. | ||
|
||
While both of these reasons are mentioned in the drawbacks section of this PDEP, when the PDEP was | ||
written we underestimated the impact this would have on users and on downstream developers. | ||
|
||
2) Many of the benefits presented in this PDEP can be realized even with pyarrow as an optional dependency. | ||
|
||
|
||
For example, as detailed in PDEP-14, it is possible to create a new string data type with the same semantics | ||
as our current default object string data type, but that gives users faster performance and memory savings | ||
compared to the object strings. | ||
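
The optional-dependency idea in point 2 can be illustrated with a minimal sketch. This is not the pandas implementation: the helper names `pyarrow_available` and `choose_string_storage`, and the storage labels `"pyarrow"` and `"python"`, are hypothetical and chosen only for illustration. The sketch assumes that a library would prefer a pyarrow-backed code path when pyarrow is installed and fall back to a pure-Python one otherwise.

```python
import importlib.util


def pyarrow_available() -> bool:
    """Return True if a top-level ``pyarrow`` package can be found.

    Only the import machinery is queried; pyarrow itself is not imported.
    """
    return importlib.util.find_spec("pyarrow") is not None


def choose_string_storage() -> str:
    """Pick a (hypothetical) storage label for string data.

    Prefers the pyarrow-backed option when pyarrow is installed and
    falls back to a pure-Python option otherwise.
    """
    return "pyarrow" if pyarrow_available() else "python"


if __name__ == "__main__":
    print(f"string storage: {choose_string_storage()}")
```

With this pattern, installing pyarrow changes which code path is taken, while environments that cannot install it (for example, the space-constrained or unsupported platforms listed above) still work.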
|
||
While we've decided not to move forward with requiring pyarrow in pandas 3.0, the rejection of this PDEP | ||
does not mean that we are abandoning pyarrow support and integration in pandas. We, as the core team, still believe | ||
that adopting support for pyarrow arrays and data types in more of pandas will lead to greater interoperability with the | ||
ecosystem and better performance for users. Furthermore, many of the drawbacks, such as the large installation size of pyarrow | ||
and the lack of support for certain platforms, can be solved, and potential solutions have been proposed for them, which may allow us | ||
to revisit this decision in the future. | ||
|
||
However, at this point in time, it is clear that we are not ready to require pyarrow | ||
as a dependency in pandas. | ||
|
||
|
||
## Abstract | ||
|
||
|
@@ -210,6 +246,7 @@ before releasing a new pandas version. | |
|
||
- 17 April 2023: Initial version | ||
- 8 May 2023: Changed proposal to make pyarrow required in pandas 3.0 instead of 2.1 | ||
- 8 May 2024: Changed status to rejected | ||
|
||
[^1] <https://pandas.pydata.org/docs/development/roadmap.html#apache-arrow-interoperability> | ||
[^2] <https://arrow.apache.org/powered_by/> |