Commit 2b3d8d1

Merge branch 'main' into read-csv-from-directory
2 parents ca3f0fc + fcd2a5d commit 2b3d8d1

89 files changed: +531 -377 lines changed
.github/ISSUE_TEMPLATE/bug_report.yaml

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,8 @@ body:
 label: Reproducible Example
 description: >
 Please follow [this guide](https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports) on how to
-provide a minimal, copy-pastable example.
+provide a minimal, copy-pastable example. Reports without reproducible examples will generally be closed
+until they are provided.
 placeholder: >
 import pandas as pd

.github/ISSUE_TEMPLATE/documentation_improvement.yaml

Lines changed: 2 additions & 1 deletion
@@ -28,7 +28,8 @@ body:
 attributes:
 label: Documentation problem
 description: >
-Please provide a description of what documentation you believe needs to be fixed/improved
+Please provide a description of what documentation you believe needs to be fixed/improved.
+Reports without a clear, actionable request will generally be closed.
 validations:
 required: true
 - type: textarea

.github/ISSUE_TEMPLATE/feature_request.yaml

Lines changed: 2 additions & 1 deletion
@@ -21,7 +21,8 @@ body:
 attributes:
 label: Problem Description
 description: >
-Please describe what problem the feature would solve, e.g. "I wish I could use pandas to ..."
+Please describe what problem the feature would solve, e.g. "I wish I could use pandas to ...".
+Reports without a clear, actionable request will generally be closed.
 placeholder: >
 I wish I could use pandas to return a Series from a DataFrame when possible.
 validations:

.github/ISSUE_TEMPLATE/performance_issue.yaml

Lines changed: 3 additions & 1 deletion
@@ -25,7 +25,9 @@ body:
 description: >
 Please provide a minimal, copy-pastable example that quantifies
 [slow runtime](https://docs.python.org/3/library/timeit.html) or
-[memory](https://pypi.org/project/memory-profiler/) issues.
+[memory](https://pypi.org/project/memory-profiler/) issues. Reports
+without reproducible examples will generally be closed
+until they are provided.
 validations:
 required: true
 - type: textarea

.github/workflows/wheels.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -162,7 +162,7 @@ jobs:
162162
run: echo "sdist_name=$(cd ./dist && ls -d */)" >> "$GITHUB_ENV"
163163

164164
- name: Build wheels
165-
uses: pypa/[email protected].1
165+
uses: pypa/[email protected].3
166166
with:
167167
package-dir: ./dist/${{ startsWith(matrix.buildplat[1], 'macosx') && env.sdist_name || needs.build_sdist.outputs.sdist_file }}
168168
env:

.pre-commit-config.yaml

Lines changed: 3 additions & 3 deletions
@@ -19,7 +19,7 @@ ci:
 skip: [pyright, mypy]
 repos:
 - repo: https://github.com/astral-sh/ruff-pre-commit
-rev: v0.12.2
+rev: v0.12.7
 hooks:
 - id: ruff
 args: [--exit-non-zero-on-fix]
@@ -95,14 +95,14 @@ repos:
 - id: sphinx-lint
 args: ["--enable", "all", "--disable", "line-too-long"]
 - repo: https://github.com/pre-commit/mirrors-clang-format
-rev: v20.1.7
+rev: v20.1.8
 hooks:
 - id: clang-format
 files: ^pandas/_libs/src|^pandas/_libs/include
 args: [-i]
 types_or: [c, c++]
 - repo: https://github.com/trim21/pre-commit-mirror-meson
-rev: v1.8.2
+rev: v1.8.3
 hooks:
 - id: meson-fmt
 args: ['--inplace']

doc/source/development/contributing_documentation.rst

Lines changed: 5 additions & 0 deletions
@@ -157,6 +157,11 @@ If you want to do a full clean build, do::
 python make.py clean
 python make.py html

+.. tip::
+If ``python make.py html`` exits with an error status,
+try running the command ``python make.py html --num-jobs=1``
+to identify the cause of the error.
+
 You can tell ``make.py`` to compile only a single section of the docs, greatly
 reducing the turn-around time for checking your changes.

doc/source/whatsnew/v3.0.0.rst

Lines changed: 5 additions & 1 deletion
@@ -81,6 +81,7 @@ Other enhancements
 - :meth:`Rolling.agg`, :meth:`Expanding.agg` and :meth:`ExponentialMovingWindow.agg` now accept :class:`NamedAgg` aggregations through ``**kwargs`` (:issue:`28333`)
 - :meth:`Series.map` can now accept kwargs to pass on to func (:issue:`59814`)
 - :meth:`Series.map` now accepts an ``engine`` parameter to allow execution with a third-party execution engine (:issue:`61125`)
+- :meth:`Series.rank` and :meth:`DataFrame.rank` with numpy-nullable dtypes preserve ``NA`` values and return ``UInt64`` dtype where appropriate instead of casting ``NA`` to ``NaN`` with ``float64`` dtype (:issue:`62043`)
 - :meth:`Series.str.get_dummies` now accepts a ``dtype`` parameter to specify the dtype of the resulting DataFrame (:issue:`47872`)
 - :meth:`pandas.concat` will raise a ``ValueError`` when ``ignore_index=True`` and ``keys`` is not ``None`` (:issue:`59274`)
 - :py:class:`frozenset` elements in pandas objects are now natively printed (:issue:`60690`)
@@ -89,13 +90,15 @@ Other enhancements
 - Added support to read and write from and to Apache Iceberg tables with the new :func:`read_iceberg` and :meth:`DataFrame.to_iceberg` functions (:issue:`61383`)
 - Errors occurring during SQL I/O will now throw a generic :class:`.DatabaseError` instead of the raw Exception type from the underlying driver manager library (:issue:`60748`)
 - Implemented :meth:`Series.str.isascii` and :meth:`Series.str.isascii` (:issue:`59091`)
+- Improve the resulting dtypes in :meth:`DataFrame.where` and :meth:`DataFrame.mask` with :class:`ExtensionDtype` ``other`` (:issue:`62038`)
 - Improved deprecation message for offset aliases (:issue:`60820`)
 - Multiplying two :class:`DateOffset` objects will now raise a ``TypeError`` instead of a ``RecursionError`` (:issue:`59442`)
 - Restore support for reading Stata 104-format and enable reading 103-format dta files (:issue:`58554`)
 - Support passing a :class:`Iterable[Hashable]` input to :meth:`DataFrame.drop_duplicates` (:issue:`59237`)
 - Support reading Stata 102-format (Stata 1) dta files (:issue:`58978`)
 - Support reading Stata 110-format (Stata 7) dta files (:issue:`47176`)
 - Added support for reading from directories in :func:`pandas.read_csv`, including local folders and remote locations via ``fsspec``
+-

 .. ---------------------------------------------------------------------------
 .. _whatsnew_300.notable_bug_fixes:
@@ -416,6 +419,7 @@ Other API changes
 an empty ``RangeIndex`` or empty ``Index`` with object dtype when determining
 the dtype of the resulting Index (:issue:`60797`)
 - :class:`IncompatibleFrequency` now subclasses ``TypeError`` instead of ``ValueError``. As a result, joins with mismatched frequencies now cast to object like other non-comparable joins, and arithmetic with indexes with mismatched frequencies align (:issue:`55782`)
+- :meth:`ExtensionDtype.construct_array_type` is now a regular method instead of a ``classmethod`` (:issue:`58663`)
 - Comparison operations between :class:`Index` and :class:`Series` now consistently return :class:`Series` regardless of which object is on the left or right (:issue:`36759`)
 - Numpy functions like ``np.isinf`` that return a bool dtype when called on a :class:`Index` object now return a bool-dtype :class:`Index` instead of ``np.ndarray`` (:issue:`52676`)

@@ -505,7 +509,7 @@ Renamed the following offset aliases (:issue:`57986`):

 Other Removals
 ^^^^^^^^^^^^^^
-- :class:`.DataFrameGroupBy.idxmin`, :class:`.DataFrameGroupBy.idxmax`, :class:`.SeriesGroupBy.idxmin`, and :class:`.SeriesGroupBy.idxmax` will now raise a ``ValueError`` when used with ``skipna=False`` and an NA value is encountered (:issue:`10694`)
+- :class:`.DataFrameGroupBy.idxmin`, :class:`.DataFrameGroupBy.idxmax`, :class:`.SeriesGroupBy.idxmin`, and :class:`.SeriesGroupBy.idxmax` will now raise a ``ValueError`` when a group has all NA values, or when used with ``skipna=False`` and any NA value is encountered (:issue:`10694`, :issue:`57745`)
 - :func:`concat` no longer ignores empty objects when determining output dtypes (:issue:`39122`)
 - :func:`concat` with all-NA entries no longer ignores the dtype of those entries when determining the result dtype (:issue:`40893`)
 - :func:`read_excel`, :func:`read_json`, :func:`read_html`, and :func:`read_xml` no longer accept raw string or byte representation of the data. That type of data must be wrapped in a :py:class:`StringIO` or :py:class:`BytesIO` (:issue:`53767`)
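
Reviewer note: the whatsnew entry about reading from directories in :func:`pandas.read_csv` does not show what "reading a directory" means in practice. As a rough illustration of the idea only, the sketch below does the same thing by hand with the standard library: it reads every ``*.csv`` file in a folder and concatenates the rows. The helper name ``read_csv_directory``, the glob pattern, and the sort order are assumptions made for this sketch; it does not use pandas and does not claim to mirror how the branch implements the feature.

```python
import csv
import tempfile
from pathlib import Path


def read_csv_directory(directory):
    """Concatenate the rows of every *.csv file in `directory`.

    Hypothetical helper for illustration; files are visited in name order.
    """
    rows = []
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="") as handle:
            # Each row becomes a dict keyed by that file's header line.
            rows.extend(csv.DictReader(handle))
    return rows


# Build a throwaway directory with two small files that share a header.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("x,y\n1,2\n")
    Path(tmp, "b.csv").write_text("x,y\n3,4\n")
    combined = read_csv_directory(tmp)

print(combined)
```

Values come back as strings here because the ``csv`` module does no type inference; dtype handling is one of the things a real ``read_csv`` call would add.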

pandas/_libs/groupby.pyx

Lines changed: 2 additions & 3 deletions
@@ -2048,9 +2048,8 @@ def group_idxmin_idxmax(
 group_min_or_max = np.empty_like(out, dtype=values.dtype)
 seen = np.zeros_like(out, dtype=np.uint8)

-# When using transform, we need a valid value for take in the case
-# a category is not observed; these values will be dropped
-out[:] = 0
+# Sentinel for no valid values.
+out[:] = -1

 with nogil(numeric_object_t is not object):
 for i in range(N):
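
Reviewer note: the hunk above changes the initial fill of ``out`` from ``0`` to ``-1``. A minimal pure-Python sketch of why a ``-1`` sentinel helps is given below. It is not the Cython kernel: the function name ``group_idxmin``, the use of ``None`` for missing values, and the list-based inputs are assumptions made for illustration. The point is that position ``0`` is a legitimate answer, so only a value outside the valid range can mark "this group had no valid values".

```python
def group_idxmin(values, labels, ngroups):
    """Return, per group, the position of the smallest non-missing value.

    Groups with no valid value keep the sentinel -1 (hypothetical sketch).
    """
    out = [-1] * ngroups      # sentinel: nothing valid seen for this group
    best = [None] * ngroups   # running minimum per group
    for i, (val, lab) in enumerate(zip(values, labels)):
        if lab < 0 or val is None:
            continue          # skip unlabelled rows and missing values
        if best[lab] is None or val < best[lab]:
            best[lab] = val
            out[lab] = i
    return out


values = [3.0, None, 1.0, None, 2.0]
labels = [0, 1, 0, 1, 2]
result = group_idxmin(values, labels, ngroups=3)
print(result)  # [2, -1, 4]
```

With an initial fill of ``0``, group 1 (all missing) would report position 0, which is indistinguishable from a real minimum at the first row. With ``-1`` a caller can detect the case and, as the whatsnew entry in this commit describes, raise a ``ValueError``.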

pandas/_libs/index.pyx

Lines changed: 1 addition & 1 deletion
@@ -803,7 +803,7 @@ cdef class BaseMultiIndexCodesEngine:
 int_keys : 1-dimensional array of dtype uint64 or object
 Integers representing one combination each
 """
-level_codes = list(target._recode_for_new_levels(self.levels))
+level_codes = list(target._recode_for_new_levels(self.levels, copy=True))
 for i, codes in enumerate(level_codes):
 if self.levels[i].hasnans:
 na_index = self.levels[i].isna().nonzero()[0][0]
