Merged
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.3.0.rst
@@ -118,7 +118,7 @@ Interval

Indexing
^^^^^^^^
-
- Fixed bug in :meth:`Index.get_indexer` round-tripping through string dtype when ``infer_string`` is enabled (:issue:`55834`)
-

Missing
3 changes: 3 additions & 0 deletions pandas/core/indexes/base.py
@@ -6556,6 +6556,9 @@ def _maybe_cast_listlike_indexer(self, target) -> Index:
        """
        Analogue to maybe_cast_indexer for get_indexer instead of get_loc.
        """
        if not hasattr(target, "dtype") and self.dtype == object:
            # Avoid inference for object since we are casting back later anyway
            return Index(target, dtype=self.dtype)
Member

Hmm, this might prevent us from doing intentional inference in _maybe_downcast_for_indexing.

Member

@jbrockmendel could you think of a concrete situation where this might happen?

(the tests are all passing, so they don't seem to cover that?)

Member

Looking at _maybe_downcast_for_indexing more closely, we are downcasting object dtype index to the other dtype, because this can give a performance improvement, right?

This might actually also be true for the object/string combo, but I think correctness is more important, and if missing values are handled differently by downcasting, we should avoid that for now for this combo, until we can solve that better.

I will just make the check a bit more specific for object dtype then (even more ugly, but at least fixing this issue)
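
To make the missing-value concern concrete, here is a toy model in plain Python. It is not pandas' implementation: `roundtrip_through_string` and `toy_get_indexer` are hypothetical helpers, and the sketch assumes that inferring a string dtype and casting back to object swaps `None` for `NaN` as the missing marker.

```python
NAN = float("nan")


def roundtrip_through_string(values):
    # Hypothetical stand-in for "infer string dtype, then cast back to object".
    # Assumption: the round trip replaces None with NaN as the missing marker.
    return [NAN if value is None else value for value in values]


def toy_get_indexer(index_values, target):
    # Map each label to its first position; -1 marks "not found".
    positions = {}
    for position, value in enumerate(index_values):
        positions.setdefault(value, position)
    return [positions.get(value, -1) for value in target]


index_values = ["a", "b", None]
target = [None, "x"]

# Target kept as plain objects: None still matches the None in the index.
kept_as_object = toy_get_indexer(index_values, target)

# Target round-tripped first: None has become NaN and no longer matches.
after_roundtrip = toy_get_indexer(index_values, roundtrip_through_string(target))

print(kept_as_object)   # [2, -1]
print(after_roundtrip)  # [-1, -1]
```

Under that assumption, skipping inference for an object-dtype index keeps the `None` lookup working, which is the behavior the new test checks.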

        return ensure_index(target)

    @final
9 changes: 9 additions & 0 deletions pandas/tests/indexes/object/test_indexing.py
@@ -62,6 +62,15 @@ def test_get_indexer_with_NA_values(
        expected = np.array([0, 1, -1], dtype=np.intp)
        tm.assert_numpy_array_equal(result, expected)

    def test_get_indexer_infer_string_missing_values(self):
        # ensure the passed list is not cast to string but to object so that
        # the None value is matched in the index
        # https://github.com/pandas-dev/pandas/issues/55834
        idx = Index(["a", "b", None], dtype="object")
        result = idx.get_indexer([None, "x"])
        expected = np.array([2, -1], dtype=np.intp)
        tm.assert_numpy_array_equal(result, expected)


class TestGetIndexerNonUnique:
    def test_get_indexer_non_unique_nas(self, nulls_fixture):