BUG: DataFrame.rank does not return EA types when original type was an EADtype #62189
In the axis=1 case you need to transpose the whole DataFrame, not block by block.
Hello! Any other changes I should make to this?
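To make the whole-frame transpose concrete, here is a minimal sketch using only the public API. It is not the PR's implementation; it only illustrates that, for a small numeric frame (the data is made up), ranking along axis=1 gives the same result as transposing the whole frame, ranking along axis=0, and transposing back.

```python
import pandas as pd

# Made-up example data: two rows, three columns.
df = pd.DataFrame([[3, 1, 2], [10, 30, 20]], columns=["a", "b", "c"])

# Rank within each row directly.
row_ranks = df.rank(axis=1)

# Same thing via a single transpose of the whole frame:
# rank down the columns of df.T, then transpose back.
via_transpose = df.T.rank(axis=0).T

pd.testing.assert_frame_equal(row_ranks, via_transpose)
```

The point of the review comment is that this transpose happens once on the frame, rather than separately on each block's values.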
pandas/core/generic.py
Outdated
ranks = values._rank(
axis=axis_int,
def ranker(blk_values):
if isinstance(blk_values, ExtensionArray) and blk_values.ndim == 1:
what about 2D EAs? in particular DatetimeArray, TimedeltaArray
I removed the raise.
See Series.rank.__doc__.
"""
if axis != 0:
raise NotImplementedError
do we have cases that get here?
Besides the two 2D array tests I just added, test_rank2() raises that NotImplementedError when that line is there:
=============================================== short test summary info ================================================
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_rank2 - NotImplementedError
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_2d_extension_array_datetime - NotImplementedError
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_2d_extension_array_timedelta - NotImplementedError
============================================ 3 failed, 131 passed in 5.58s =============================================
For the third "sub test" in test_rank2:
df = DataFrame([["b", "c", "a"], ["a", "c", "b"]])
expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]])
result = df.rank(1, numeric_only=False)
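For anyone who wants to try that case outside the test suite, a self-contained version might look like the sketch below. It assumes a pandas build where this sub-test passes; the assertion at the end stands in for the comparison made in the test file.

```python
import pandas as pd

# The third sub-test from test_rank2, written out with explicit imports.
df = pd.DataFrame([["b", "c", "a"], ["a", "c", "b"]])
expected = pd.DataFrame([[2.0, 3.0, 1.0], [1.0, 3.0, 2.0]])

# Strings are ranked lexicographically within each row (axis=1).
result = df.rank(axis=1, numeric_only=False)

pd.testing.assert_frame_equal(result, expected)
```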
ranks = values._rank(
def ranker(blk_values):
if axis_int == 0:
blk_values = blk_values.T
I think we can avoid transposing at the block level and only transpose at the DataFrame level (L9317), then pass axis=blk_values.ndim-1 (which will require updating the EA method).
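A rough NumPy-only sketch of the axis=ndim-1 idea follows. The helper rank_last_axis is hypothetical (it is not a pandas function) and only computes ordinal ranks with ties broken by position. It assumes 2D block values keep the frame's rows along their last axis, so the same call handles 1D and 2D values without a per-block transpose.

```python
import numpy as np


def rank_last_axis(values: np.ndarray) -> np.ndarray:
    """Hypothetical helper: 1-based ordinal ranks along the last axis."""
    axis = values.ndim - 1  # 0 for 1D values, 1 for 2D values
    # Stable sort so that ties are broken by original position.
    order = np.argsort(values, axis=axis, kind="stable")
    ranks = np.empty(values.shape, dtype=np.float64)
    positions = np.arange(1, values.shape[axis] + 1, dtype=np.float64)
    # Scatter rank k to the location of the k-th smallest element.
    np.put_along_axis(ranks, order, positions, axis=axis)
    return ranks


one_d = rank_last_axis(np.array([3, 1, 2]))
two_d = rank_last_axis(np.array([[3, 1, 2], [10, 30, 20]]))
```

Because the axis is derived from ndim, the caller never has to branch on dimensionality, which is what makes the block-level transpose unnecessary.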
ranks_obj = self._constructor(ranks, **data._construct_axes_dict())
return ranks_obj.__finalize__(self, method="rank")
if axis_int == 0:
ranks = ranks.T
this transpose should also be avoidable
@sharkipelago can you address comments
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
I rewrote the method using _mgr.apply but am still struggling to figure out why 2 of the test_rank.py test cases are failing. Any help or tips are appreciated! They are as follows: