BUG: DataFrame.rank does not return EA types when original type was an EADtype #62189

sharkipelago · 2025-08-25T17:01:21Z

closes BUG: DataFrame.rank does not return EA types when original type was an EADtype #52829
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

I rewrote the method using _mgr.apply, but I'm still struggling to figure out why two of the test_rank.py test cases are failing. Any help or tips are appreciated! The failures are as follows:

self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84fae490>

    def test_rank2(self):
        df = DataFrame([[1, 3, 2], [1, 2, 3]])
        expected = DataFrame([[1.0, 3.0, 2.0], [1, 2, 3]]) / 3.0
        result = df.rank(1, pct=True)
        tm.assert_frame_equal(result, expected)

        df = DataFrame([[1, 3, 2], [1, 2, 3]])
        expected = df.rank(0) / 2.0
        result = df.rank(0, pct=True)
        tm.assert_frame_equal(result, expected)

        df = DataFrame([["b", "c", "a"], ["a", "c", "b"]])
        expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]])
        result = df.rank(1, numeric_only=False)
>       tm.assert_frame_equal(result, expected)

pandas/tests/frame/methods/test_rank.py:84:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/_libs/testing.pyx:53: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise_assert_detail(
E   AssertionError: DataFrame.iloc[:, 1] (column name="1") are different
E
E   DataFrame.iloc[:, 1] (column name="1") values are different (100.0 %)
E   [index]: [0, 1]
E   [left]:  [1.5, 1.5]
E   [right]: [3.0, 3.0]
E   At positional index 0, first diff: 1.5 != 3.0

pandas/_libs/testing.pyx:171: AssertionError

self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84faf290>
float_string_frame =                A         B         C         D  foo                   datetime       timedelta
foo_0   0.189053 -0.522...5.169587 1 days 00:00:01
foo_29 -0.967681  1.678419  0.765355  0.045808  bar 2025-08-25 12:57:35.169587 1 days 00:00:01

    def test_rank_mixed_frame(self, float_string_frame):
        float_string_frame["datetime"] = datetime.now()
        float_string_frame["timedelta"] = timedelta(days=1, seconds=1)

        float_string_frame.rank(numeric_only=False)
>       with pytest.raises(TypeError, match="not supported between instances of"):
E       Failed: DID NOT RAISE <class 'TypeError'>

jbrockmendel · 2025-08-25T18:15:16Z

In the axis=1 case you need to transpose the whole DataFrame, not block-by-block.
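A minimal sketch of why the row-wise case needs the whole frame, using only the public API. The column groups below are a stand-in for dtype blocks and are an assumption for illustration; this is not the code in the PR.

```python
import pandas as pd

# Mixed dtypes: the int columns ("a", "c") and the float column ("b")
# are assumed to live in separate blocks internally.
df = pd.DataFrame({"a": [1, 4], "b": [2.5, 0.5], "c": [3, 2]})

# Row-wise ranking compares the values of every column within a row.
expected = df.rank(axis=1)

# Ranking each dtype group on its own loses the comparison between
# the int and float columns, so the ranks come out wrong.
per_group = pd.concat(
    [df[["a", "c"]].rank(axis=1), df[["b"]].rank(axis=1)],
    axis=1,
)[["a", "b", "c"]]

# Transposing the whole frame turns the row-wise problem into a
# column-wise one, which can then be ranked column by column.
via_transpose = df.T.rank(axis=0).T

print(expected)
print(per_group)
print(via_transpose)
```

Here via_transpose matches expected, while per_group does not, because "b" is never compared against "a" and "c".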

sharkipelago · 2025-09-02T20:38:58Z

Hello! Any other changes I should make to this?

jbrockmendel · 2025-09-03T15:39:11Z

pandas/core/generic.py

-                ranks = values._rank(
-                    axis=axis_int,
+        def ranker(blk_values):
+            if isinstance(blk_values, ExtensionArray) and blk_values.ndim == 1:


What about 2D EAs, in particular DatetimeArray and TimedeltaArray?
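To illustrate the concern, a hedged sketch: DatetimeArray and TimedeltaArray are ExtensionArrays that can be 2D, so a check of the form "isinstance(..., ExtensionArray) and ndim == 1" does not catch them. The helper name takes_1d_ea_branch is hypothetical, and the reshape call only mimics a 2D block; how the block manager actually stores these columns is an assumption here.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray

dta = pd.date_range("2025-01-01", periods=3).array   # DatetimeArray, 1D
tda = pd.timedelta_range("1 day", periods=3).array   # TimedeltaArray, 1D


def takes_1d_ea_branch(values):
    # Hypothetical helper mirroring the condition quoted in the diff above.
    return isinstance(values, ExtensionArray) and values.ndim == 1


# Mimic a 2D block holding a single column of three rows.
dta_2d = dta.reshape(1, -1)
tda_2d = tda.reshape(1, -1)

print(takes_1d_ea_branch(dta), takes_1d_ea_branch(dta_2d))
print(takes_1d_ea_branch(tda), takes_1d_ea_branch(tda_2d))
```

The 2D arrays are still ExtensionArrays, so they need a code path of their own rather than falling through to the plain ndarray handling.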

sharkipelago · 2025-09-09T01:12:02Z

I removed the axis != 0 check (raise NotImplementedError) in arrays/base.py because I thought it might be safe now with the other changes, but I'm not sure whether there are other things to consider.

jbrockmendel · 2025-09-10T21:23:46Z

pandas/core/arrays/base.py

        See Series.rank.__doc__.
        """
-        if axis != 0:
-            raise NotImplementedError


Do we have cases that get here?

Besides the two 2D array tests I just added, test_rank2() raises that NotImplementedError when that line is there:

=============================================== short test summary info ================================================
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_rank2 - NotImplementedError
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_2d_extension_array_datetime - NotImplementedError
FAILED pandas/tests/frame/methods/test_rank.py::TestRank::test_2d_extension_array_timedelta - NotImplementedError
============================================ 3 failed, 131 passed in 5.58s =============================================

For the third "sub test" in test_rank2:

    df = DataFrame([["b", "c", "a"], ["a", "c", "b"]])
    expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]])
    result = df.rank(1, numeric_only=False)

jbrockmendel · 2025-09-19T19:27:11Z

pandas/core/generic.py

-                ranks = values._rank(
+        def ranker(blk_values):
+            if axis_int == 0:
+                blk_values = blk_values.T


I think we can avoid transposing at the block level and only transpose at the DataFrame level (L9317), then pass axis=blk_values.ndim-1 (which will require updating the EA method).
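A rough sketch of the idea, assuming a 2D block holds its values as (n_columns, n_rows), so that ranking down a DataFrame column is ranking along the block's last axis. The helper rank_last_axis is hypothetical and uses a simple ordinal rank without tie handling; it is not the ranking routine pandas uses.

```python
import numpy as np
import pandas as pd


def rank_last_axis(values):
    """Ordinal ranks along the last axis (no tie handling; illustration only)."""
    axis = values.ndim - 1          # 0 for 1D values, 1 for 2D values
    order = np.argsort(values, axis=axis)
    return np.argsort(order, axis=axis) + 1.0


df = pd.DataFrame({"a": [3.0, 1.0, 2.0], "b": [10.0, 30.0, 20.0]})

# Assumption: mimic the (n_columns, n_rows) layout of a 2D block.
blk_values = df.to_numpy().T
ranks_2d = rank_last_axis(blk_values)

# The same call works unchanged on 1D values, such as a 1D extension array.
ranks_1d = rank_last_axis(df["a"].to_numpy())

print(ranks_2d)
print(ranks_1d)
```

Because the axis is derived from ndim, the function itself never transposes; ranks_2d is already in the block layout and agrees with df.rank(axis=0) once viewed column by column.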

jbrockmendel · 2025-09-19T19:27:24Z

pandas/core/generic.py

-            ranks_obj = self._constructor(ranks, **data._construct_axes_dict())
-            return ranks_obj.__finalize__(self, method="rank")
+            if axis_int == 0:
+                ranks = ranks.T


This transpose should also be avoidable.

jbrockmendel · 2025-10-16T21:24:08Z

@sharkipelago can you address the comments?

jbrockmendel · 2025-10-29T17:08:37Z

@sharkipelago can you address the comments?

sharkipelago added 3 commits August 25, 2025 12:31

using _mgr apply with 2 failing tests

98a70df

transposed blocks to keep axis_int parameter intact

af9f3de

added rst

00ef2ea

jbrockmendel reviewed Aug 25, 2025

doc/source/whatsnew/v3.0.0.rst (outdated)

sharkipelago added 6 commits August 25, 2025 16:53

updated rst

f78ffa5

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

5124513

dataframe level transpose

178f4e3

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

3a94c7f

removed redundant ndim checks

94893f0

added pytest skips if no pyarrow module

7ee79ef

jbrockmendel reviewed Aug 26, 2025

doc/source/whatsnew/v3.0.0.rst (outdated)

sharkipelago added 2 commits August 26, 2025 13:59

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

601dc39

corrected to dtype_backend

2e7ca27

jbrockmendel reviewed Sep 3, 2025

pandas/tests/frame/methods/test_rank.py (outdated)

sharkipelago added 3 commits September 8, 2025 15:11

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

011c293

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

15922e8

2d extension arrays

fce155a

jbrockmendel reviewed Sep 10, 2025

jbrockmendel reviewed Sep 19, 2025
