Liam3851
Contributor

@Liam3851 Liam3851 commented Jul 3, 2025

@Liam3851 Liam3851 marked this pull request as ready for review July 3, 2025 23:09
@Liam3851 Liam3851 changed the title from "Fix unpickling of string dtypes of legacy pandas versions" to "BUG: Fix unpickling of string dtypes of legacy pandas versions" Jul 4, 2025
@jorisvandenbossche jorisvandenbossche added the Bug, Strings (String extension data type and string data), and IO Pickle (read_pickle, to_pickle) labels Jul 4, 2025
@jorisvandenbossche jorisvandenbossche added this to the 2.3.1 milestone Jul 4, 2025
Member

@jorisvandenbossche jorisvandenbossche left a comment

@Liam3851 thanks a lot for the bug report and the fix!

Looks perfect, and thanks for adding legacy data (we should probably also add some data for 2.0-2.2).

Can you add a note in the doc/source/whatsnew/v2.3.1.rst file? We will want to backport this fix.

@Liam3851
Contributor Author

Liam3851 commented Jul 5, 2025

Thanks very much for the review @jorisvandenbossche, I've added pickles for 2.0-2.2 as extra checks and a whatsnew entry.

Member

@jorisvandenbossche jorisvandenbossche left a comment

Thanks!

@jorisvandenbossche jorisvandenbossche merged commit e5a1c10 into pandas-dev:main Jul 7, 2025
43 of 44 checks passed
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jul 7, 2025
@@ -59,6 +59,7 @@ Bug fixes
- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
- Fixed bug in unpickling objects pickled in pandas versions pre-2.3.0 that used :class:`StringDtype` (:issue:`61763`).
Member

Thanks @Liam3851 for the PR.

For future reference, by convention the trailing period is normally omitted. There is no need to do anything as a follow-up, since it will probably be changed when the release notes are tidied just prior to release.

jorisvandenbossche added a commit that referenced this pull request Jul 7, 2025
…pes of legacy pandas versions) (#61793)

Co-authored-by: David Krych <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Labels
Bug, IO Pickle (read_pickle, to_pickle), Strings (String extension data type and string data)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: StringDtype objects from pandas <2.3.0 cannot be reliably unpickled in 2.3.0.
3 participants
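
For readers who want to see the kind of check discussed in this thread, here is a minimal sketch of a pickle round-trip for a Series that uses the string extension dtype. It is an illustration only, not the test added in this PR: it pickles and unpickles within whichever pandas version is installed, so it does not reproduce the cross-version failure described in the linked issue. The helper name `roundtrip` is hypothetical and is not part of pandas.

```python
import pickle

import pandas as pd


def roundtrip(obj):
    """Pickle and unpickle obj in memory (hypothetical helper, not a pandas API)."""
    return pickle.loads(pickle.dumps(obj))


# A Series using the string extension dtype, with one missing value.
original = pd.Series(["a", None, "c"], dtype="string")
restored = roundtrip(original)

# Raises AssertionError if the dtype or the values differ after the round-trip.
pd.testing.assert_series_equal(original, restored)
print(restored.dtype)
```

A cross-version check would instead load a file written by an older pandas release with `pd.read_pickle` and compare the result against an expected object built in the current version; the location of such a file depends on the test setup and is not shown here.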