Commit 504d7f8

Merge branch '2.3.x' into ci/wheels/39
2 parents f06f0cc + 3c86d66 commit 504d7f8

47 files changed: 693 additions, 208 deletions

ci/deps/actions-310-minimum_versions.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ dependencies:
   - pytest-xdist>=2.2.0
   - pytest-localserver>=0.7.1
   - pytest-qt>=4.2.0
-  - boto3
+  - boto3=1.24.59

   # required dependencies
   - python-dateutil=2.8.2

ci/deps/actions-310.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ dependencies:
   - pytest-cov
   - pytest-xdist>=2.2.0
   - pytest-qt>=4.2.0
-  - boto3
+  - boto3=1.37.3

   # required dependencies
   - python-dateutil

ci/deps/actions-311-downstream_compat.yaml

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ dependencies:
   - pytest-xdist>=2.2.0
   - pytest-localserver>=0.7.1
   - pytest-qt>=4.2.0
-  - boto3
+  - boto3=1.37.3

   # required dependencies
   - python-dateutil

ci/deps/actions-311.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ dependencies:
   - pytest-cov
   - pytest-xdist>=2.2.0
   - pytest-qt>=4.2.0
-  - boto3
+  - boto3=1.37.3

   # required dependencies
   - python-dateutil

ci/deps/actions-312.yaml

Lines changed: 2 additions & 2 deletions
@@ -15,11 +15,11 @@ dependencies:
   - pytest-cov
   - pytest-xdist>=2.2.0
   - pytest-qt>=4.2.0
-  - boto3
+  - boto3=1.37.3

   # required dependencies
   - python-dateutil
-  - numpy
+  - numpy=2.2
   # pytz 2024.2 timezones cause wrong results
   - pytz<2024.2

doc/source/whatsnew/index.rst

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Version 2.3
 .. toctree::
    :maxdepth: 2

+   v2.3.1
    v2.3.0

 Version 2.2

doc/source/whatsnew/v2.3.0.rst

Lines changed: 0 additions & 36 deletions
@@ -30,40 +30,6 @@ Other enhancements
 - The :meth:`~Series.cumsum`, :meth:`~Series.cummin`, and :meth:`~Series.cummax` reductions are now implemented for :class:`StringDtype` columns (:issue:`60633`)
 - The :meth:`~Series.sum` reduction is now implemented for :class:`StringDtype` columns (:issue:`59853`)

-.. ---------------------------------------------------------------------------
-.. _whatsnew_230.notable_bug_fixes:
-
-Notable bug fixes
-~~~~~~~~~~~~~~~~~
-
-These are bug fixes that might have notable behavior changes.
-
-.. _whatsnew_230.notable_bug_fixes.notable_bug_fix1:
-
-notable_bug_fix1
-^^^^^^^^^^^^^^^^
-
-In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
-
-Increased minimum version for Python
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-in determining the result dtype when there are different string dtypes compared. Some examples:
-
-- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
-- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
-- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
-
-.. _whatsnew_230.api_changes:
-
-API changes
-~~~~~~~~~~~
-
-- When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
-  union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
-  empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
-  Index (:issue:`60797`)
-
 .. ---------------------------------------------------------------------------
 .. _whatsnew_230.deprecations:

@@ -86,8 +52,6 @@ Numeric

 Strings
 ^^^^^^^
-- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
-- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
 - Bug in :meth:`Series.__pos__` and :meth:`DataFrame.__pos__` where an ``Exception`` was not raised for :class:`StringDtype` with ``storage="pyarrow"`` (:issue:`60710`)
 - Bug in :meth:`Series.rank` for :class:`StringDtype` with ``storage="pyarrow"`` that incorrectly returned integer results with ``method="average"`` and raised an error if it would truncate results (:issue:`59768`)
 - Bug in :meth:`Series.replace` with :class:`StringDtype` when replacing with a non-string value was not upcasting to ``object`` dtype (:issue:`60282`)

doc/source/whatsnew/v2.3.1.rst

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+.. _whatsnew_231:
+
+What's new in 2.3.1 (Month XX, 2025)
+------------------------------------
+
+These are the changes in pandas 2.3.1. See :ref:`release` for a full changelog
+including other versions of pandas.
+
+{{ header }}
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_231.string_fixes:
+
+Improvements and fixes for the StringDtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. _whatsnew_231.string_fixes.string_comparisons:
+
+Comparisons between different string dtypes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
+
+  object < (python, NaN) < (pyarrow, NaN) < (python, NA) < (pyarrow, NA)
+
+in determining the result dtype when there are different string dtypes compared. Some examples:
+
+- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
+- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
+- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
+
+.. _whatsnew_231.string_fixes.ignore_empty:
+
+Index set operations ignore empty RangeIndex and object dtype Index
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
+union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
+empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
+Index (:issue:`60797`).
+
+This ensures that combining such empty Index with strings will infer the string dtype
+correctly, rather than defaulting to ``object`` dtype. For example:
+
+.. code-block:: python
+
+   >>> pd.options.future.infer_string = True
+   >>> df = pd.DataFrame()
+   >>> df.columns.dtype
+   dtype('int64')  # default RangeIndex for empty columns
+   >>> df["a"] = [1, 2, 3]
+   >>> df.columns.dtype
+   <StringDtype(na_value=nan)>  # new columns use string dtype instead of object dtype
+
+.. _whatsnew_231.string_fixes.bugs:
+
+Bug fixes
+^^^^^^^^^
+- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
+- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
+- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
+
+
+.. _whatsnew_231.regressions:
+
+Fixed regressions
+~~~~~~~~~~~~~~~~~
+-
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_231.bug_fixes:
+
+Bug fixes
+~~~~~~~~~
+-
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_231.other:
+
+Other
+~~~~~
+-
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_231.contributors:
+
+Contributors
+~~~~~~~~~~~~
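
The string dtype hierarchy described in the 2.3.1 notes above can be read as a priority ordering: when two different string dtypes are compared, the operand that ranks higher determines the result dtype. The following is a minimal, pandas-free sketch of that reading. The names `STRING_DTYPE_RANK` and `dominant_string_dtype` are hypothetical and only illustrate the ordering; they are not part of pandas and do not reflect how pandas implements it.

```python
# Hypothetical illustration of the ordering
#   object < (python, NaN) < (pyarrow, NaN) < (python, NA) < (pyarrow, NA)
# Each string dtype is modelled as a (storage, na_value) tuple; plain object
# dtype is modelled as the string "object".
STRING_DTYPE_RANK = {
    "object": 0,
    ("python", "NaN"): 1,
    ("pyarrow", "NaN"): 2,
    ("python", "NA"): 3,
    ("pyarrow", "NA"): 4,
}


def dominant_string_dtype(left, right):
    """Return whichever operand dtype ranks higher in the hierarchy."""
    return max(left, right, key=STRING_DTYPE_RANK.__getitem__)


# (pyarrow, NA) outranks every other string dtype.
print(dominant_string_dtype(("pyarrow", "NA"), ("python", "NaN")))
# (python, NA) outranks both NaN-based variants.
print(dominant_string_dtype(("python", "NA"), ("pyarrow", "NaN")))
```

Per the release notes, a `(pyarrow, NA)` winner corresponds to a ``boolean[pyarrow]`` result and a `(python, NA)` winner to the NumPy-backed nullable ``boolean`` result.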

pandas/core/arrays/arrow/array.py

Lines changed: 4 additions & 5 deletions
@@ -37,7 +37,6 @@
     infer_dtype_from_scalar,
 )
 from pandas.core.dtypes.common import (
-    CategoricalDtype,
     is_array_like,
     is_bool_dtype,
     is_float_dtype,
@@ -725,9 +724,7 @@ def __setstate__(self, state) -> None:

     def _cmp_method(self, other, op):
         pc_func = ARROW_CMP_FUNCS[op.__name__]
-        if isinstance(
-            other, (ArrowExtensionArray, np.ndarray, list, BaseMaskedArray)
-        ) or isinstance(getattr(other, "dtype", None), CategoricalDtype):
+        if isinstance(other, (ExtensionArray, np.ndarray, list)):
             try:
                 result = pc_func(self._pa_array, self._box_pa(other))
             except pa.ArrowNotImplementedError:
@@ -1926,7 +1923,9 @@ def _explode(self):
         """
         # child class explode method supports only list types; return
         # default implementation for non list types.
-        if not pa.types.is_list(self.dtype.pyarrow_dtype):
+        if not hasattr(self.dtype, "pyarrow_dtype") or (
+            not pa.types.is_list(self.dtype.pyarrow_dtype)
+        ):
             return super()._explode()
         values = self
         counts = pa.compute.list_value_length(values._pa_array)

pandas/core/arrays/base.py

Lines changed: 8 additions & 1 deletion
@@ -2386,7 +2386,14 @@ def _groupby_op(
             if op.how not in ["any", "all"]:
                 # Fail early to avoid conversion to object
                 op._get_cython_function(op.kind, op.how, np.dtype(object), False)
-            npvalues = self.to_numpy(object, na_value=np.nan)
+
+            arr = self
+            if op.how == "sum":
+                # https://github.com/pandas-dev/pandas/issues/60229
+                # All NA should result in the empty string.
+                if min_count == 0:
+                    arr = arr.fillna("")
+            npvalues = arr.to_numpy(object, na_value=np.nan)
         else:
             raise NotImplementedError(
                 f"function is not implemented for this dtype: {self.dtype}"
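
The `_groupby_op` change above fills NA values with the empty string before a grouped `sum` when `min_count == 0`, so that an all-NA group yields `""` instead of `0`. The sketch below illustrates that intent in plain Python, using `None` as a stand-in for NA. The function `string_group_sum` is hypothetical, and its handling of `min_count > 0` is an assumption about the usual `min_count` semantics rather than something shown in the diff.

```python
def string_group_sum(values, min_count=0):
    """Concatenate the strings in one group, treating None as NA.

    Hypothetical stand-in for a grouped string sum; not pandas code.
    """
    if min_count == 0:
        # Mirror the patched branch: replace NA with "" before summing,
        # so an all-NA group yields "" rather than 0.
        values = ["" if v is None else v for v in values]
    valid = [v for v in values if v is not None]
    if len(valid) < min_count:
        # Assumed behaviour: too few valid values keeps the result NA.
        return None
    return "".join(valid)


print(repr(string_group_sum([None, None])))               # all-NA group
print(repr(string_group_sum(["a", None, "b"])))           # mixed group
print(repr(string_group_sum([None, None], min_count=1)))  # stays NA
```

Filling before the reduction keeps the special case in one place: the summation itself does not need to know about NA when `min_count == 0`.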
