Commit 871b88a

Merge branch 'main' of https://github.com/pandas-dev/pandas into implement_pdep17
2 parents ee3f4f2 + e4a03b6 commit 871b88a

34 files changed, +762 -488 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ details, see the commit logs at https://github.com/pandas-dev/pandas.
 ## Dependencies
 - [NumPy - Adds support for large, multi-dimensional arrays, matrices and high-level mathematical functions to operate on these arrays](https://www.numpy.org)
 - [python-dateutil - Provides powerful extensions to the standard datetime module](https://dateutil.readthedocs.io/en/stable/index.html)
-- [pytz - Brings the Olson tz database into Python which allows accurate and cross platform timezone calculations](https://github.com/stub42/pytz)
+- [tzdata - Provides an IANA time zone database](https://tzdata.readthedocs.io/en/latest/)
 
 See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies) for minimum supported versions of required, recommended and optional dependencies.
 

ci/code_checks.sh

Lines changed: 1 addition & 3 deletions
@@ -58,9 +58,7 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
 
     MSG='Python and Cython Doctests' ; echo "$MSG"
     python -c 'import pandas as pd; pd.test(run_doctests=True)'
-    # TEMP don't let doctests fail the build until all string dtype changes are fixed
-    # RET=$(($RET + $?)) ; echo "$MSG" "DONE"
-    echo "$MSG" "DONE"
+    RET=$(($RET + $?)) ; echo "$MSG" "DONE"
 
 fi
 

ci/deps/actions-311-downstream_compat.yaml

Lines changed: 1 addition & 2 deletions
@@ -50,8 +50,7 @@ dependencies:
   - pytz>=2023.4
   - pyxlsb>=1.0.10
   - s3fs>=2023.12.2
-  # TEMP upper pin for scipy (https://github.com/statsmodels/statsmodels/issues/9584)
-  - scipy>=1.12.0,<1.16
+  - scipy>=1.12.0
   - sqlalchemy>=2.0.0
   - tabulate>=0.9.0
   - xarray>=2024.1.1

doc/source/development/maintaining.rst

Lines changed: 19 additions & 6 deletions
@@ -388,8 +388,11 @@ Pre-release
 
 3. Make sure the CI is green for the last commit of the branch being released.
 
-4. If not a release candidate, make sure all backporting pull requests to the branch
-   being released are merged.
+4. If not a release candidate, make sure all backporting pull requests to the
+   branch being released are merged, and no merged pull requests are missing a
+   backport (check the
+   ["Still Needs Manual Backport"](https://github.com/pandas-dev/pandas/labels/Still%20Needs%20Manual%20Backport)
+   label for this).
 
 5. Create a new issue and milestone for the version after the one being released.
    If the release was a release candidate, we would usually want to create issues and
@@ -435,6 +438,9 @@ which will be triggered when the tag is pushed.
 
     scripts/download_wheels.sh <VERSION>
 
+   ATTENTION: this is currently not downloading *all* wheels, and you have to
+   manually download the remainings wheels and sdist!
+
 4. Create a `new GitHub release <https://github.com/pandas-dev/pandas/releases/new>`_:
 
    - Tag: ``<version>``
@@ -462,15 +468,22 @@ Post-Release
 ````````````
 
 1. Update symlinks to stable documentation by logging in to our web server, and
-   editing ``/var/www/html/pandas-docs/stable`` to point to ``version/<latest-version>``
-   for major and minor releases, or ``version/<minor>`` to ``version/<patch>`` for
+   editing ``/var/www/html/pandas-docs/stable`` to point to ``version/<X.Y>``
+   for major and minor releases, or ``version/<X.Y>`` to ``version/<patch>`` for
    patch releases. The exact instructions are (replace the example version numbers by
    the appropriate ones for the version you are releasing):
 
    - Log in to the server and use the correct user.
    - ``cd /var/www/html/pandas-docs/``
-   - ``ln -sfn version/2.1 stable`` (for a major or minor release)
-   - ``ln -sfn version/2.0.3 version/2.0`` (for a patch release)
+   - For a major or minor release (assuming the ``/version/2.1.0/`` docs have been uploaded to the server):
+
+     - Create a new X.Y symlink to X.Y.Z: ``cd version; ln -sfn 2.1.0 2.1``
+     - Update stable symlink to point to X.Y: ``ln -sfn version/2.1 stable``
+
+   - For a patch release (assuming the ``/version/2.1.3/`` docs have been uploaded to the server):
+
+     - Update the X.Y symlink to the new X.Y.Z patch version: ``cd version; ln -sfn 2.1.3 2.1``
+     - (the stable symlink should already be pointing to the correct X.Y version)
 
 2. If releasing a major or minor release, open a PR in our source code to update
    ``web/pandas/versions.json``, to have the desired versions in the documentation
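
Note on the symlink steps in the hunk above: the two-level layout (``stable`` points to ``version/X.Y``, which points to ``X.Y.Z``) can be tried out locally before touching the server. The following is a minimal Python sketch under stated assumptions: a temporary directory stands in for ``/var/www/html/pandas-docs/``, ``repoint_symlink`` is a hypothetical helper that mimics ``ln -sfn`` on a POSIX system, and the version numbers are the example values from the docs, not a real release.

```python
import os
import tempfile


def repoint_symlink(link_path: str, target: str) -> None:
    """Point link_path at target, replacing any existing symlink (like `ln -sfn`)."""
    tmp_link = link_path + ".tmp"
    # Clean up a leftover temporary link from an earlier, interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # Renaming over the old link swaps it in a single step on POSIX systems.
    os.replace(tmp_link, link_path)


# Scratch directory standing in for /var/www/html/pandas-docs/ (an assumption).
docs_root = tempfile.mkdtemp()
os.makedirs(os.path.join(docs_root, "version", "2.1.0"))
os.makedirs(os.path.join(docs_root, "version", "2.1.3"))

# Major or minor release: version/2.1 -> 2.1.0, then stable -> version/2.1.
repoint_symlink(os.path.join(docs_root, "version", "2.1"), "2.1.0")
repoint_symlink(os.path.join(docs_root, "stable"), os.path.join("version", "2.1"))

# Later patch release: only the X.Y link moves; stable is left untouched.
repoint_symlink(os.path.join(docs_root, "version", "2.1"), "2.1.3")

resolved = os.path.realpath(os.path.join(docs_root, "stable"))
print(resolved)
```

Because ``stable`` only ever points at the X.Y link, a patch release needs a single link update, which matches the last bullet of the new instructions.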

doc/source/user_guide/migration-3-strings.rst

Lines changed: 8 additions & 3 deletions
@@ -118,12 +118,17 @@ through the ``str`` accessor will work the same:
 Overview of behavior differences and how to address them
 ---------------------------------------------------------
 
-The dtype is no longer object dtype
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The dtype is no longer a numpy "object" dtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 When inferring or reading string data, the data type of the resulting DataFrame
 column or Series will silently start being the new ``"str"`` dtype instead of
-``"object"`` dtype, and this can have some impact on your code.
+the numpy ``"object"`` dtype, and this can have some impact on your code.
+
+The new string dtype is a pandas data type ("extension dtype"), and no longer a
+numpy ``np.dtype`` instance. Therefore, passing the dtype of a string column to
+numpy functions will no longer work (e.g. passing it to a ``dtype=`` argument
+of a numpy function, or using ``np.issubdtype`` to check the dtype).
 
 Checking the dtype
 ^^^^^^^^^^^^^^^^^^
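
Note on the paragraph added in the hunk above: the difference between a numpy dtype and a pandas extension dtype can be seen in a few lines. This is a small sketch, not part of the commit. It assumes the explicit ``"string"`` dtype as a stand-in, because which dtype is inferred by default depends on the installed pandas version; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Request a pandas string extension dtype explicitly so that the example
# does not depend on what the installed pandas version infers by default.
ser = pd.Series(["a", "b", None], dtype="string")

# The dtype is a pandas extension dtype, not an np.dtype instance.
is_numpy_dtype = isinstance(ser.dtype, np.dtype)
is_extension_dtype = isinstance(ser.dtype, pd.api.extensions.ExtensionDtype)

# A pandas-level check accepts the extension dtype.
is_string = pd.api.types.is_string_dtype(ser.dtype)

# A numpy-level check cannot interpret the extension dtype.
try:
    np.issubdtype(ser.dtype, np.str_)
    numpy_check_failed = False
except TypeError:
    numpy_check_failed = True

print(is_numpy_dtype, is_extension_dtype, is_string, numpy_check_failed)
```

Code that branches on dtype is therefore more robust when it uses the helpers in ``pandas.api.types`` rather than numpy's dtype utilities.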

doc/source/whatsnew/index.rst

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ Version 2.3
 .. toctree::
    :maxdepth: 2
 
+   v2.3.2
    v2.3.1
    v2.3.0
 

doc/source/whatsnew/v2.3.2.rst

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+.. _whatsnew_232:
+
+What's new in 2.3.2 (August XX, 2025)
+-------------------------------------
+
+These are the changes in pandas 2.3.2. See :ref:`release` for a full changelog
+including other versions of pandas.
+
+{{ header }}
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_232.string_fixes:
+
+Improvements and fixes for the StringDtype
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Most changes in this release are related to :class:`StringDtype` which will
+become the default string dtype in pandas 3.0. See
+:ref:`whatsnew_230.upcoming_changes` for more details.
+
+.. _whatsnew_232.string_fixes.bugs:
+
+Bug fixes
+^^^^^^^^^
+- Fix :meth:`~DataFrame.to_json` with ``orient="table"`` to correctly use the
+  "string" type in the JSON Table Schema for :class:`StringDtype` columns
+  (:issue:`61889`)
+
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_232.contributors:
+
+Contributors
+~~~~~~~~~~~~

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 .. _whatsnew_300:
 
-What's new in 3.0.0 (Month XX, 2024)
+What's new in 3.0.0 (Month XX, 2025)
 ------------------------------------
 
 These are the changes in pandas 3.0.0. See :ref:`release` for a full changelog

pandas/_libs/groupby.pyi

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@ def group_sum(
     result_mask: np.ndarray | None = ...,
     min_count: int = ...,
     is_datetimelike: bool = ...,
+    initial: object = ...,
     skipna: bool = ...,
 ) -> None: ...
 def group_prod(

pandas/_libs/groupby.pyx

Lines changed: 10 additions & 3 deletions
@@ -707,6 +707,7 @@ def group_sum(
     uint8_t[:, ::1] result_mask=None,
     Py_ssize_t min_count=0,
     bint is_datetimelike=False,
+    object initial=0,
     bint skipna=True,
 ) -> None:
     """
@@ -725,9 +726,15 @@ def group_sum(
         raise ValueError("len(index) != len(labels)")
 
     nobs = np.zeros((<object>out).shape, dtype=np.int64)
-    # the below is equivalent to `np.zeros_like(out)` but faster
-    sumx = np.zeros((<object>out).shape, dtype=(<object>out).base.dtype)
-    compensation = np.zeros((<object>out).shape, dtype=(<object>out).base.dtype)
+    if initial == 0:
+        # the below is equivalent to `np.zeros_like(out)` but faster
+        sumx = np.zeros((<object>out).shape, dtype=(<object>out).base.dtype)
+        compensation = np.zeros((<object>out).shape, dtype=(<object>out).base.dtype)
+    else:
+        # in practice this path is only taken for strings to use empty string as initial
+        assert sum_t is object
+        sumx = np.full((<object>out).shape, initial, dtype=object)
+        # object code path does not use `compensation`
 
     N, K = (<object>values).shape
     if uses_mask:
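
Note on the ``initial`` parameter in the hunk above: the reason a string sum needs a non-zero starting value can be shown with a simplified pure-Python sketch. This is not the Cython implementation. ``group_sum_object`` is a hypothetical helper, it handles only a one-dimensional object array, and the convention that negative labels mark rows without a group is an assumption made for the example.

```python
import numpy as np


def group_sum_object(values, labels, ngroups, initial=0):
    """Sum object values per group label, starting each group from `initial`."""
    # Analogue of `np.full(shape, initial, dtype=object)` in the diff above.
    sumx = np.full(ngroups, initial, dtype=object)
    for val, lab in zip(values, labels):
        if lab < 0:
            # Assumed convention: negative labels belong to no group.
            continue
        sumx[lab] = sumx[lab] + val
    return sumx


values = np.array(["a", "b", "c"], dtype=object)
labels = np.array([0, 1, 0])

# Starting from the empty string, "summing" strings concatenates them.
concatenated = group_sum_object(values, labels, ngroups=2, initial="")
print(list(concatenated))

# Starting from integer 0 fails on the first string, since 0 + "a" is undefined.
try:
    group_sum_object(values, labels, ngroups=2, initial=0)
    zero_start_failed = False
except TypeError:
    zero_start_failed = True
print(zero_start_failed)
```

With ``initial=""`` every group starts from a value that supports string concatenation, which is the case the comment in the new ``else`` branch describes.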
