Commit cb4cf25

Merge remote-tracking branch 'upstream/main' into string-dtype-doc-migration-guide-np-dtype
2 parents 2f4e404 + 4f952b7 commit cb4cf25

14 files changed: +596 -387 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ details, see the commit logs at https://github.com/pandas-dev/pandas.
 ## Dependencies
 - [NumPy - Adds support for large, multi-dimensional arrays, matrices and high-level mathematical functions to operate on these arrays](https://www.numpy.org)
 - [python-dateutil - Provides powerful extensions to the standard datetime module](https://dateutil.readthedocs.io/en/stable/index.html)
-- [pytz - Brings the Olson tz database into Python which allows accurate and cross platform timezone calculations](https://github.com/stub42/pytz)
+- [tzdata - Provides an IANA time zone database](https://tzdata.readthedocs.io/en/latest/)

 See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies) for minimum supported versions of required, recommended and optional dependencies.

ci/deps/actions-311-downstream_compat.yaml

Lines changed: 1 addition & 2 deletions
@@ -50,8 +50,7 @@ dependencies:
   - pytz>=2023.4
   - pyxlsb>=1.0.10
   - s3fs>=2023.12.2
-  # TEMP upper pin for scipy (https://github.com/statsmodels/statsmodels/issues/9584)
-  - scipy>=1.12.0,<1.16
+  - scipy>=1.12.0
   - sqlalchemy>=2.0.0
   - tabulate>=0.9.0
   - xarray>=2024.1.1

doc/source/development/maintaining.rst

Lines changed: 19 additions & 6 deletions
@@ -388,8 +388,11 @@ Pre-release

 3. Make sure the CI is green for the last commit of the branch being released.

-4. If not a release candidate, make sure all backporting pull requests to the branch
-   being released are merged.
+4. If not a release candidate, make sure all backporting pull requests to the
+   branch being released are merged, and no merged pull requests are missing a
+   backport (check the
+   ["Still Needs Manual Backport"](https://github.com/pandas-dev/pandas/labels/Still%20Needs%20Manual%20Backport)
+   label for this).

 5. Create a new issue and milestone for the version after the one being released.
    If the release was a release candidate, we would usually want to create issues and
@@ -435,6 +438,9 @@ which will be triggered when the tag is pushed.

     scripts/download_wheels.sh <VERSION>

+   ATTENTION: this is currently not downloading *all* wheels, and you have to
+   manually download the remaining wheels and sdist!
+
 4. Create a `new GitHub release <https://github.com/pandas-dev/pandas/releases/new>`_:

    - Tag: ``<version>``
@@ -462,15 +468,22 @@ Post-Release
 ````````````

 1. Update symlinks to stable documentation by logging in to our web server, and
-   editing ``/var/www/html/pandas-docs/stable`` to point to ``version/<latest-version>``
-   for major and minor releases, or ``version/<minor>`` to ``version/<patch>`` for
+   editing ``/var/www/html/pandas-docs/stable`` to point to ``version/<X.Y>``
+   for major and minor releases, or ``version/<X.Y>`` to ``version/<patch>`` for
    patch releases. The exact instructions are (replace the example version numbers by
    the appropriate ones for the version you are releasing):

    - Log in to the server and use the correct user.
    - ``cd /var/www/html/pandas-docs/``
-   - ``ln -sfn version/2.1 stable`` (for a major or minor release)
-   - ``ln -sfn version/2.0.3 version/2.0`` (for a patch release)
+   - For a major or minor release (assuming the ``/version/2.1.0/`` docs have been uploaded to the server):
+
+     - Create a new X.Y symlink to X.Y.Z: ``cd version; ln -sfn 2.1.0 2.1``
+     - Update stable symlink to point to X.Y: ``ln -sfn version/2.1 stable``
+
+   - For a patch release (assuming the ``/version/2.1.3/`` docs have been uploaded to the server):
+
+     - Update the X.Y symlink to the new X.Y.Z patch version: ``cd version; ln -sfn 2.1.3 2.1``
+     - (the stable symlink should already be pointing to the correct X.Y version)

 2. If releasing a major or minor release, open a PR in our source code to update
    ``web/pandas/versions.json``, to have the desired versions in the documentation
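
Reviewer note on the Post-Release hunk: the two-level symlink layout it describes (``stable`` pointing to ``version/X.Y``, and ``version/X.Y`` pointing to ``X.Y.Z``) can be tried out locally before touching the server. The sketch below is not part of the commit. It is a minimal Python illustration that rebuilds the layout in a temporary directory, and the helper names ``force_symlink`` and ``update_docs_symlinks`` are hypothetical. It assumes a POSIX filesystem and that the ``X.Y.Z`` docs directory already exists, as the instructions do.

```python
import os
import tempfile
from pathlib import Path


def force_symlink(target: str, link: Path) -> None:
    """Point ``link`` at ``target``, replacing an existing symlink.

    Roughly what ``ln -sfn target link`` does. Hypothetical helper.
    """
    if link.is_symlink():
        link.unlink()  # removes the link itself, never the directory it points to
    os.symlink(target, link)  # a relative target is resolved from link's directory


def update_docs_symlinks(docs_root: Path, version: str, is_patch: bool) -> None:
    """Apply the symlink updates described in the hunk (hypothetical helper).

    ``version`` is the full X.Y.Z string, e.g. "2.1.3".
    """
    major_minor = ".".join(version.split(".")[:2])
    version_dir = docs_root / "version"
    if not (version_dir / version).is_dir():
        raise FileNotFoundError(f"docs for {version} have not been uploaded")

    # version/X.Y -> X.Y.Z (done for every kind of release)
    force_symlink(version, version_dir / major_minor)

    # stable -> version/X.Y (only needed for a major or minor release)
    if not is_patch:
        force_symlink(f"version/{major_minor}", docs_root / "stable")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "version" / "2.1.0").mkdir(parents=True)
        update_docs_symlinks(root, "2.1.0", is_patch=False)

        (root / "version" / "2.1.3").mkdir()
        update_docs_symlinks(root, "2.1.3", is_patch=True)

        print(os.readlink(root / "version" / "2.1"))
        print(os.readlink(root / "stable"))
```

Because ``stable`` goes through ``version/X.Y`` rather than naming a patch version directly, a patch release only has to repoint one link, which matches the note in the hunk that the stable symlink should already be correct.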

pandas/core/algorithms.py

Lines changed: 2 additions & 2 deletions
@@ -751,7 +751,7 @@ def factorize(
 array([0, 0, 1])
 >>> uniques
 ['a', 'c']
-Categories (3, object): ['a', 'b', 'c']
+Categories (3, str): [a, b, c]

 Notice that ``'b'`` is in ``uniques.categories``, despite not being
 present in ``cat.values``.
@@ -764,7 +764,7 @@ def factorize(
 >>> codes
 array([0, 0, 1])
 >>> uniques
-Index(['a', 'c'], dtype='object')
+Index(['a', 'c'], dtype='str')

 If NaN is in the values, and we want to include NaN in the uniques of the
 values, it can be achieved by setting ``use_na_sentinel=False``.

pandas/core/arrays/categorical.py

Lines changed: 9 additions & 1 deletion
@@ -2233,8 +2233,16 @@ def _repr_categories(self) -> list[str]:
 )
 from pandas.io.formats import format as fmt

+formatter = None
+if self.categories.dtype == "str":
+    # the extension array formatter defaults to boxed=True in format_array
+    # override here to boxed=False to be consistent with QUOTE_NONNUMERIC
+    formatter = cast(ExtensionArray, self.categories._values)._formatter(
+        boxed=False
+    )
+
 format_array = partial(
-    fmt.format_array, formatter=None, quoting=QUOTE_NONNUMERIC
+    fmt.format_array, formatter=formatter, quoting=QUOTE_NONNUMERIC
 )
 if len(self.categories) > max_categories:
     num = max_categories // 2

pandas/core/base.py

Lines changed: 10 additions & 10 deletions
@@ -323,12 +323,12 @@ def transpose(self, *args, **kwargs) -> Self:
 0 Ant
 1 Bear
 2 Cow
-dtype: object
+dtype: str
 >>> s.T
 0 Ant
 1 Bear
 2 Cow
-dtype: object
+dtype: str

 For Index:

@@ -383,7 +383,7 @@ def ndim(self) -> int:
 0 Ant
 1 Bear
 2 Cow
-dtype: object
+dtype: str
 >>> s.ndim
 1

@@ -452,9 +452,9 @@ def nbytes(self) -> int:
 0 Ant
 1 Bear
 2 Cow
-dtype: object
+dtype: str
 >>> s.nbytes
-24
+34

 For Index:

@@ -487,7 +487,7 @@ def size(self) -> int:
 0 Ant
 1 Bear
 2 Cow
-dtype: object
+dtype: str
 >>> s.size
 3

@@ -567,7 +567,7 @@ def array(self) -> ExtensionArray:
 >>> ser = pd.Series(pd.Categorical(["a", "b", "a"]))
 >>> ser.array
 ['a', 'b', 'a']
-Categories (2, object): ['a', 'b']
+Categories (2, str): [a, b]
 """
 raise AbstractMethodError(self)

@@ -1076,15 +1076,15 @@ def value_counts(

 >>> df.dtypes
 a category
-b object
+b str
 c category
 d category
 dtype: object

 >>> df.dtypes.value_counts()
 category 2
 category 1
-object 1
+str 1
 Name: count, dtype: int64
 """
 return algorithms.value_counts_internal(
@@ -1386,7 +1386,7 @@ def factorize(
 ... )
 >>> ser
 ['apple', 'bread', 'bread', 'cheese', 'milk']
-Categories (4, object): ['apple' < 'bread' < 'cheese' < 'milk']
+Categories (4, str): [apple < bread < cheese < milk]

 >>> ser.searchsorted('bread')
 np.int64(1)
