Commit 89d5c7c

Merge branch 'pandas-dev:main' into main
2 parents d4f39d5 + e4a03b6 commit 89d5c7c

25 files changed: +263 -322 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ details, see the commit logs at https://github.com/pandas-dev/pandas.
 ## Dependencies
 - [NumPy - Adds support for large, multi-dimensional arrays, matrices and high-level mathematical functions to operate on these arrays](https://www.numpy.org)
 - [python-dateutil - Provides powerful extensions to the standard datetime module](https://dateutil.readthedocs.io/en/stable/index.html)
-- [pytz - Brings the Olson tz database into Python which allows accurate and cross platform timezone calculations](https://github.com/stub42/pytz)
+- [tzdata - Provides an IANA time zone database](https://tzdata.readthedocs.io/en/latest/)

 See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies) for minimum supported versions of required, recommended and optional dependencies.

ci/code_checks.sh

Lines changed: 1 addition & 3 deletions
@@ -58,9 +58,7 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then

 MSG='Python and Cython Doctests' ; echo "$MSG"
 python -c 'import pandas as pd; pd.test(run_doctests=True)'
-# TEMP don't let doctests fail the build until all string dtype changes are fixed
-# RET=$(($RET + $?)) ; echo "$MSG" "DONE"
-echo "$MSG" "DONE"
+RET=$(($RET + $?)) ; echo "$MSG" "DONE"

 fi

ci/deps/actions-311-downstream_compat.yaml

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -50,8 +50,7 @@ dependencies:
5050
- pytz>=2023.4
5151
- pyxlsb>=1.0.10
5252
- s3fs>=2023.12.2
53-
# TEMP upper pin for scipy (https://github.com/statsmodels/statsmodels/issues/9584)
54-
- scipy>=1.12.0,<1.16
53+
- scipy>=1.12.0
5554
- sqlalchemy>=2.0.0
5655
- tabulate>=0.9.0
5756
- xarray>=2024.1.1

doc/source/user_guide/migration-3-strings.rst

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -118,12 +118,17 @@ through the ``str`` accessor will work the same:
118118
Overview of behavior differences and how to address them
119119
---------------------------------------------------------
120120

121-
The dtype is no longer object dtype
122-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
121+
The dtype is no longer a numpy "object" dtype
122+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
123123

124124
When inferring or reading string data, the data type of the resulting DataFrame
125125
column or Series will silently start being the new ``"str"`` dtype instead of
126-
``"object"`` dtype, and this can have some impact on your code.
126+
the numpy ``"object"`` dtype, and this can have some impact on your code.
127+
128+
The new string dtype is a pandas data type ("extension dtype"), and no longer a
129+
numpy ``np.dtype`` instance. Therefore, passing the dtype of a string column to
130+
numpy functions will no longer work (e.g. passing it to a ``dtype=`` argument
131+
of a numpy function, or using ``np.issubdtype`` to check the dtype).
127132

128133
Checking the dtype
129134
^^^^^^^^^^^^^^^^^^
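
Review note: the paragraph added in this hunk says that the dtype of a string column is no longer an `np.dtype` instance. The following is a minimal sketch of what that can look like in practice, assuming pandas and NumPy are installed. The helper name `numpy_accepts` is hypothetical and not part of either library, and the dtype is built explicitly with `pd.StringDtype()` so that the sketch does not depend on version-specific inference behavior.

```python
import numpy as np
import pandas as pd


def numpy_accepts(dtype) -> bool:
    """Hypothetical helper: report whether NumPy can interpret ``dtype``."""
    try:
        np.dtype(dtype)
    except TypeError:
        return False
    return True


# Construct the string dtype explicitly rather than relying on inference.
ser = pd.Series(["a", "b", "c"], dtype=pd.StringDtype())

# The column dtype is a pandas extension dtype, not a NumPy dtype.
is_numpy_dtype = isinstance(ser.dtype, np.dtype)
is_extension_dtype = isinstance(ser.dtype, pd.api.extensions.ExtensionDtype)

# NumPy cannot interpret the extension dtype; a plain NumPy dtype is fine.
string_dtype_ok = numpy_accepts(ser.dtype)
int_dtype_ok = numpy_accepts(np.dtype("int64"))

# pandas' own type-checking helper does not need a NumPy dtype.
is_string = pd.api.types.is_string_dtype(ser.dtype)

print(is_numpy_dtype, is_extension_dtype, string_dtype_ok, int_dtype_ok, is_string)
```

Code that currently hands `ser.dtype` to NumPy may be easier to migrate by switching to a pandas-level check such as `pd.api.types.is_string_dtype`, which is the check used in the sketch.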

doc/source/whatsnew/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@ Version 2.3
2424
.. toctree::
2525
:maxdepth: 2
2626

27+
v2.3.2
2728
v2.3.1
2829
v2.3.0
2930

doc/source/whatsnew/v2.3.2.rst

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
.. _whatsnew_232:
2+
3+
What's new in 2.3.2 (August XX, 2025)
4+
-------------------------------------
5+
6+
These are the changes in pandas 2.3.2. See :ref:`release` for a full changelog
7+
including other versions of pandas.
8+
9+
{{ header }}
10+
11+
.. ---------------------------------------------------------------------------
12+
.. _whatsnew_232.string_fixes:
13+
14+
Improvements and fixes for the StringDtype
15+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
16+
17+
Most changes in this release are related to :class:`StringDtype` which will
18+
become the default string dtype in pandas 3.0. See
19+
:ref:`whatsnew_230.upcoming_changes` for more details.
20+
21+
.. _whatsnew_232.string_fixes.bugs:
22+
23+
Bug fixes
24+
^^^^^^^^^
25+
- Fix :meth:`~DataFrame.to_json` with ``orient="table"`` to correctly use the
26+
"string" type in the JSON Table Schema for :class:`StringDtype` columns
27+
(:issue:`61889`)
28+
29+
30+
.. ---------------------------------------------------------------------------
31+
.. _whatsnew_232.contributors:
32+
33+
Contributors
34+
~~~~~~~~~~~~

pandas/core/algorithms.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -391,11 +391,11 @@ def unique(values):
391391
392392
>>> pd.unique(pd.Series(pd.Categorical(list("baabc"))))
393393
['b', 'a', 'c']
394-
Categories (3, object): ['a', 'b', 'c']
394+
Categories (3, str): ['a', 'b', 'c']
395395
396396
>>> pd.unique(pd.Series(pd.Categorical(list("baabc"), categories=list("abc"))))
397397
['b', 'a', 'c']
398-
Categories (3, object): ['a', 'b', 'c']
398+
Categories (3, str): ['a', 'b', 'c']
399399
400400
An ordered Categorical preserves the category ordering.
401401
@@ -405,7 +405,7 @@ def unique(values):
405405
... )
406406
... )
407407
['b', 'a', 'c']
408-
Categories (3, object): ['a' < 'b' < 'c']
408+
Categories (3, str): ['a' < 'b' < 'c']
409409
410410
An array of tuples
411411
@@ -751,7 +751,7 @@ def factorize(
751751
array([0, 0, 1])
752752
>>> uniques
753753
['a', 'c']
754-
Categories (3, str): [a, b, c]
754+
Categories (3, str): ['a', 'b', 'c']
755755
756756
Notice that ``'b'`` is in ``uniques.categories``, despite not being
757757
present in ``cat.values``.

pandas/core/arrays/base.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1688,13 +1688,13 @@ def factorize(
16881688
>>> cat = pd.Categorical(['a', 'b', 'c'])
16891689
>>> cat
16901690
['a', 'b', 'c']
1691-
Categories (3, object): ['a', 'b', 'c']
1691+
Categories (3, str): ['a', 'b', 'c']
16921692
>>> cat.repeat(2)
16931693
['a', 'a', 'b', 'b', 'c', 'c']
1694-
Categories (3, object): ['a', 'b', 'c']
1694+
Categories (3, str): ['a', 'b', 'c']
16951695
>>> cat.repeat([1, 2, 3])
16961696
['a', 'b', 'b', 'c', 'c', 'c']
1697-
Categories (3, object): ['a', 'b', 'c']
1697+
Categories (3, str): ['a', 'b', 'c']
16981698
"""
16991699

17001700
@Substitution(klass="ExtensionArray")
