Commit 1627e2b

DOC: clarify Series.map behavior for categorical dtype (#62338)
1 parent ac82414 commit 1627e2b

2 files changed: 44 additions, 0 deletions

pandas/core/arrays/categorical.py

Lines changed: 16 additions & 0 deletions
@@ -1585,6 +1585,22 @@ def map(
 
         >>> cat.map({"a": "first", "b": "second"}, na_action=None)
         Index(['first', 'second', nan], dtype='str')
+
+        The mapping function is applied to categories, not to each value. It is
+        therefore only called once per unique category, and the result reused for
+        all occurrences:
+
+        >>> cat = pd.Categorical(["a", "a", "b"])
+        >>> calls = []
+        >>> def f(x):
+        ...     calls.append(x)
+        ...     return x.upper()
+        >>> result = cat.map(f)
+        >>> result
+        ['A', 'A', 'B']
+        Categories (2, str): ['A', 'B']
+        >>> calls
+        ['a', 'b']
         """
         assert callable(mapper) or is_dict_like(mapper)
 
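The behavior described in the added docstring text can be pictured with a small pure-Python model. This is only an illustrative sketch, not the pandas implementation: the helper `map_via_categories` is hypothetical, and it assumes a categorical is stored as a list of unique categories plus integer codes, with `-1` marking a missing value as pandas does.

```python
def map_via_categories(codes, categories, func):
    """Apply func once per category, then expand the results through the codes.

    Hypothetical helper for illustration; not part of pandas.
    """
    # One call per unique category, regardless of how often it occurs.
    mapped = [func(category) for category in categories]
    # Reuse the mapped value for every occurrence; -1 is treated as missing.
    return [mapped[code] if code != -1 else None for code in codes]


calls = []


def f(x):
    calls.append(x)
    return x.upper()


categories = ["a", "b"]
codes = [0, 0, 1]  # encodes the values ["a", "a", "b"]

result = map_via_categories(codes, categories, f)
print(result)  # ['A', 'A', 'B']
print(calls)   # ['a', 'b']
```

The function runs twice even though there are three values, which matches the `calls` list shown in the docstring example.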

pandas/core/series.py

Lines changed: 28 additions & 0 deletions
@@ -4419,6 +4419,34 @@ def map(
         2              NaN
         3    I am a rabbit
         dtype: object
+
+        For categorical data, the function is only applied to the categories:
+
+        >>> s = pd.Series(list("cabaa"))
+        >>> s.map(print)
+        c
+        a
+        b
+        a
+        a
+        0    None
+        1    None
+        2    None
+        3    None
+        4    None
+        dtype: object
+
+        >>> s_cat = s.astype("category")
+        >>> s_cat.map(print)  # function called once per unique category
+        a
+        b
+        c
+        0    None
+        1    None
+        2    None
+        3    None
+        4    None
+        dtype: object
         """
         if func is None:
             if "arg" in kwargs:
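One practical consequence of the documented behavior is that a function with internal state can return different results for the same data depending on the dtype. The sketch below is an illustration under assumptions, not part of the commit: `make_counter` is a hypothetical helper, and the expected outputs assume the categories are sorted as `['a', 'b', 'c']` and that no values are missing.

```python
import pandas as pd


def make_counter():
    """Return a function that ignores its input and returns 1, 2, 3, ...

    Hypothetical helper used only to make the number of calls visible.
    """
    state = {"n": 0}

    def counter(_):
        state["n"] += 1
        return state["n"]

    return counter


s = pd.Series(list("cabaa"))

# Non-categorical data: the function is called once per element.
plain = s.map(make_counter()).tolist()

# Categorical data: the function is called once per category (a, b, c),
# and each result is reused for every occurrence of that category.
categorical = s.astype("category").map(make_counter()).tolist()

print(plain)        # [1, 2, 3, 4, 5]
print(categorical)  # [3, 1, 2, 1, 1]
```

For pure functions the two results agree, so the difference only matters when the mapped function has side effects or depends on call order.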
