Commit 8677f0d

Merge branch 'main' into fix-loc-dtype
2 parents 74eb356 + b69a2ae commit 8677f0d

56 files changed: 825 additions and 397 deletions

.github/CODEOWNERS

Lines changed: 0 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -10,10 +10,8 @@ doc/source/development @noatamir
1010

1111
# pandas
1212
pandas/_libs/ @WillAyd
13-
pandas/_libs/tslibs/* @MarcoGorelli
1413
pandas/_typing.py @Dr-Irv
1514
pandas/core/groupby/* @rhshadrach
16-
pandas/core/tools/datetimes.py @MarcoGorelli
1715
pandas/io/excel/* @rhshadrach
1816
pandas/io/formats/style.py @attack68
1917
pandas/io/formats/style_render.py @attack68

.github/workflows/wheels.yml

Lines changed: 1 addition & 1 deletion
@@ -153,7 +153,7 @@ jobs:
153153
run: echo "sdist_name=$(cd ./dist && ls -d */)" >> "$GITHUB_ENV"
154154

155155
- name: Build wheels
156-
uses: pypa/[email protected].0
156+
uses: pypa/[email protected].1
157157
with:
158158
package-dir: ./dist/${{ startsWith(matrix.buildplat[1], 'macosx') && env.sdist_name || needs.build_sdist.outputs.sdist_file }}
159159
env:

ci/code_checks.sh

Lines changed: 0 additions & 4 deletions
@@ -72,13 +72,9 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
7272
-i "pandas.Series.dt PR01" `# Accessors are implemented as classes, but we do not document the Parameters section` \
7373
-i "pandas.Period.freq GL08" \
7474
-i "pandas.Period.ordinal GL08" \
75-
-i "pandas.Timedelta.max PR02" \
76-
-i "pandas.Timedelta.min PR02" \
77-
-i "pandas.Timedelta.resolution PR02" \
7875
-i "pandas.Timestamp.max PR02" \
7976
-i "pandas.Timestamp.min PR02" \
8077
-i "pandas.Timestamp.resolution PR02" \
81-
-i "pandas.Timestamp.tzinfo GL08" \
8278
-i "pandas.core.groupby.DataFrameGroupBy.plot PR02" \
8379
-i "pandas.core.groupby.SeriesGroupBy.plot PR02" \
8480
-i "pandas.core.resample.Resampler.quantile PR01,PR07" \

ci/meta.yaml

Lines changed: 0 additions & 1 deletion
@@ -89,4 +89,3 @@ extra:
8989
- datapythonista
9090
- phofl
9191
- lithomas1
92-
- marcogorelli

doc/source/development/contributing_codebase.rst

Lines changed: 1 addition & 1 deletion
@@ -537,7 +537,7 @@ Preferred ``pytest`` idioms
537537
test and does not check if the test will fail. If this is the behavior you desire, use ``pytest.skip`` instead.
538538

539539
If a test is known to fail but the manner in which it fails
540-
is not meant to be captured, use ``pytest.mark.xfail`` It is common to use this method for a test that
540+
is not meant to be captured, use ``pytest.mark.xfail``. It is common to use this method for a test that
541541
exhibits buggy behavior or a non-implemented feature. If
542542
the failing test has flaky behavior, use the argument ``strict=False``. This
543543
will make it so pytest does not fail if the test happens to pass. Using ``strict=False`` is highly undesirable, please use it only as a last resort.

doc/source/development/debugging_extensions.rst

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ By default building pandas from source will generate a release build. To generat
2323

2424
.. note::
2525

26-
conda environments update CFLAGS/CPPFLAGS with flags that are geared towards generating releases. If using conda, you may need to set ``CFLAGS="$CFLAGS -O0"`` and ``CPPFLAGS="$CPPFLAGS -O0"`` to ensure optimizations are turned off for debugging
26+
conda environments update CFLAGS/CPPFLAGS with flags that are geared towards generating releases, and may work counter towards usage in a development environment. If using conda, you should unset these environment variables via ``export CFLAGS=`` and ``export CPPFLAGS=``
2727

2828
By specifying ``builddir="debug"`` all of the targets will be built and placed in the debug directory relative to the project root. This helps to keep your debug and release artifacts separate; you are of course able to choose a different directory name or omit altogether if you do not care to separate build types.
2929

doc/source/user_guide/basics.rst

Lines changed: 6 additions & 6 deletions
@@ -36,7 +36,7 @@ of elements to display is five, but you may pass a custom number.
3636
Attributes and underlying data
3737
------------------------------
3838

39-
pandas objects have a number of attributes enabling you to access the metadata
39+
pandas objects have a number of attributes enabling you to access the metadata.
4040

4141
* **shape**: gives the axis dimensions of the object, consistent with ndarray
4242
* Axis labels
@@ -59,7 +59,7 @@ NumPy's type system to add support for custom arrays
5959
(see :ref:`basics.dtypes`).
6060

6161
To get the actual data inside a :class:`Index` or :class:`Series`, use
62-
the ``.array`` property
62+
the ``.array`` property.
6363

6464
.. ipython:: python
6565
@@ -88,18 +88,18 @@ NumPy doesn't have a dtype to represent timezone-aware datetimes, so there
8888
are two possibly useful representations:
8989

9090
1. An object-dtype :class:`numpy.ndarray` with :class:`Timestamp` objects, each
91-
with the correct ``tz``
91+
with the correct ``tz``.
9292
2. A ``datetime64[ns]`` -dtype :class:`numpy.ndarray`, where the values have
93-
been converted to UTC and the timezone discarded
93+
been converted to UTC and the timezone discarded.
9494

95-
Timezones may be preserved with ``dtype=object``
95+
Timezones may be preserved with ``dtype=object``:
9696

9797
.. ipython:: python
9898
9999
ser = pd.Series(pd.date_range("2000", periods=2, tz="CET"))
100100
ser.to_numpy(dtype=object)
101101
102-
Or thrown away with ``dtype='datetime64[ns]'``
102+
Or thrown away with ``dtype='datetime64[ns]'``:
103103

104104
.. ipython:: python
105105

doc/source/whatsnew/v3.0.0.rst

Lines changed: 8 additions & 0 deletions
@@ -67,6 +67,7 @@ Other enhancements
6767
- :class:`Rolling` and :class:`Expanding` now support aggregations ``first`` and ``last`` (:issue:`33155`)
6868
- :func:`read_parquet` accepts ``to_pandas_kwargs`` which are forwarded to :meth:`pyarrow.Table.to_pandas` which enables passing additional keywords to customize the conversion to pandas, such as ``maps_as_pydicts`` to read the Parquet map data type as python dictionaries (:issue:`56842`)
6969
- :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
70+
- :meth:`DataFrame.apply` supports using third-party execution engines like the Bodo.ai JIT compiler (:issue:`60668`)
7071
- :meth:`DataFrameGroupBy.transform`, :meth:`SeriesGroupBy.transform`, :meth:`DataFrameGroupBy.agg`, :meth:`SeriesGroupBy.agg`, :meth:`RollingGroupby.apply`, :meth:`ExpandingGroupby.apply`, :meth:`Rolling.apply`, :meth:`Expanding.apply`, :meth:`DataFrame.apply` with ``engine="numba"`` now supports positional arguments passed as kwargs (:issue:`58995`)
7172
- :meth:`Rolling.agg`, :meth:`Expanding.agg` and :meth:`ExponentialMovingWindow.agg` now accept :class:`NamedAgg` aggregations through ``**kwargs`` (:issue:`28333`)
7273
- :meth:`Series.map` can now accept kwargs to pass on to func (:issue:`59814`)
@@ -635,6 +636,7 @@ Bug fixes
635636
Categorical
636637
^^^^^^^^^^^
637638
- Bug in :func:`Series.apply` where ``nan`` was ignored for :class:`CategoricalDtype` (:issue:`59938`)
639+
- Bug in :meth:`Series.convert_dtypes` with ``dtype_backend="pyarrow"`` where empty :class:`CategoricalDtype` :class:`Series` raised an error or got converted to ``null[pyarrow]`` (:issue:`59934`)
638640
-
639641

640642
Datetimelike
@@ -671,6 +673,7 @@ Timezones
671673

672674
Numeric
673675
^^^^^^^
676+
- Bug in :meth:`DataFrame.corr` where numerical precision errors resulted in correlations above ``1.0`` (:issue:`61120`)
674677
- Bug in :meth:`DataFrame.quantile` where the column type was not preserved when ``numeric_only=True`` with a list-like ``q`` produced an empty result (:issue:`59035`)
675678
- Bug in ``np.matmul`` with :class:`Index` inputs raising a ``TypeError`` (:issue:`57079`)
676679

@@ -702,6 +705,7 @@ Indexing
702705
- Bug in :meth:`Index.get_indexer` and similar methods when ``NaN`` is located at or after position 128 (:issue:`58924`)
703706
- Bug in :meth:`MultiIndex.insert` when a new value inserted to a datetime-like level gets cast to ``NaT`` and fails indexing (:issue:`60388`)
704707
- Bug in printing :attr:`Index.names` and :attr:`MultiIndex.levels` would not escape single quotes (:issue:`60190`)
708+
- Bug in reindexing of :class:`DataFrame` with :class:`PeriodDtype` columns in case of consolidated block (:issue:`60980`, :issue:`60273`)
705709

706710
Missing
707711
^^^^^^^
@@ -738,6 +742,7 @@ I/O
738742
- Bug in :meth:`read_csv` where the order of the ``na_values`` makes an inconsistency when ``na_values`` is a list non-string values. (:issue:`59303`)
739743
- Bug in :meth:`read_excel` raising ``ValueError`` when passing array of boolean values when ``dtype="boolean"``. (:issue:`58159`)
740744
- Bug in :meth:`read_html` where ``rowspan`` in header row causes incorrect conversion to ``DataFrame``. (:issue:`60210`)
745+
- Bug in :meth:`read_json` ignoring the given ``dtype`` when ``engine="pyarrow"`` (:issue:`59516`)
741746
- Bug in :meth:`read_json` not validating the ``typ`` argument to not be exactly ``"frame"`` or ``"series"`` (:issue:`59124`)
742747
- Bug in :meth:`read_json` where extreme value integers in string format were incorrectly parsed as a different integer number (:issue:`20608`)
743748
- Bug in :meth:`read_stata` raising ``KeyError`` when input file is stored in big-endian format and contains strL data. (:issue:`58638`)
@@ -768,6 +773,7 @@ Groupby/resample/rolling
768773
- Bug in :meth:`.DataFrameGroupBy.quantile` when ``interpolation="nearest"`` is inconsistent with :meth:`DataFrame.quantile` (:issue:`47942`)
769774
- Bug in :meth:`.Resampler.interpolate` on a :class:`DataFrame` with non-uniform sampling and/or indices not aligning with the resulting resampled index would result in wrong interpolation (:issue:`21351`)
770775
- Bug in :meth:`DataFrame.ewm` and :meth:`Series.ewm` when passed ``times`` and aggregation functions other than mean (:issue:`51695`)
776+
- Bug in :meth:`DataFrame.resample` changing index type to :class:`MultiIndex` when the dataframe is empty and using an upsample method (:issue:`55572`)
771777
- Bug in :meth:`DataFrameGroupBy.agg` that raises ``AttributeError`` when there is dictionary input and duplicated columns, instead of returning a DataFrame with the aggregation of all duplicate columns. (:issue:`55041`)
772778
- Bug in :meth:`DataFrameGroupBy.apply` and :meth:`SeriesGroupBy.apply` for empty data frame with ``group_keys=False`` still creating output index using group keys. (:issue:`60471`)
773779
- Bug in :meth:`DataFrameGroupBy.apply` that was returning a completely empty DataFrame when all return values of ``func`` were ``None`` instead of returning an empty DataFrame with the original columns and dtypes. (:issue:`57775`)
@@ -834,9 +840,11 @@ Other
834840
- Bug in :meth:`DataFrame.where` where using a non-bool type array in the function would return a ``ValueError`` instead of a ``TypeError`` (:issue:`56330`)
835841
- Bug in :meth:`Index.sort_values` when passing a key function that turns values into tuples, e.g. ``key=natsort.natsort_key``, would raise ``TypeError`` (:issue:`56081`)
836842
- Bug in :meth:`MultiIndex.fillna` error message was referring to ``isna`` instead of ``fillna`` (:issue:`60974`)
843+
- Bug in :meth:`Series.describe` where median percentile was always included when the ``percentiles`` argument was passed (:issue:`60550`).
837844
- Bug in :meth:`Series.diff` allowing non-integer values for the ``periods`` argument. (:issue:`56607`)
838845
- Bug in :meth:`Series.dt` methods in :class:`ArrowDtype` that were returning incorrect values. (:issue:`57355`)
839846
- Bug in :meth:`Series.isin` raising ``TypeError`` when series is large (>10**6) and ``values`` contains NA (:issue:`60678`)
847+
- Bug in :meth:`Series.mode` where an exception was raised when taking the mode with nullable types with no null values in the series. (:issue:`58926`)
840848
- Bug in :meth:`Series.rank` that doesn't preserve missing values for nullable integers when ``na_option='keep'``. (:issue:`56976`)
841849
- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` inconsistently replacing matching instances when ``regex=True`` and missing values are present. (:issue:`56599`)
842850
- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` throwing ``ValueError`` when ``regex=True`` and all NA values. (:issue:`60688`)

pandas/_libs/algos.pyx

Lines changed: 8 additions & 3 deletions
@@ -353,10 +353,9 @@ def nancorr(const float64_t[:, :] mat, bint cov=False, minp=None):
353353
float64_t[:, ::1] result
354354
uint8_t[:, :] mask
355355
int64_t nobs = 0
356-
float64_t vx, vy, dx, dy, meanx, meany, divisor, ssqdmx, ssqdmy, covxy
356+
float64_t vx, vy, dx, dy, meanx, meany, divisor, ssqdmx, ssqdmy, covxy, val
357357

358358
N, K = (<object>mat).shape
359-
360359
if minp is None:
361360
minpv = 1
362361
else:
@@ -389,8 +388,14 @@ def nancorr(const float64_t[:, :] mat, bint cov=False, minp=None):
389388
else:
390389
divisor = (nobs - 1.0) if cov else sqrt(ssqdmx * ssqdmy)
391390

391+
# clip `covxy / divisor` to ensure coeff is within bounds
392392
if divisor != 0:
393-
result[xi, yi] = result[yi, xi] = covxy / divisor
393+
val = covxy / divisor
394+
if val > 1.0:
395+
val = 1.0
396+
elif val < -1.0:
397+
val = -1.0
398+
result[xi, yi] = result[yi, xi] = val
394399
else:
395400
result[xi, yi] = result[yi, xi] = NaN
396401
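Note on the hunk above: the added lines clamp `covxy / divisor` so that floating-point rounding cannot yield a coefficient outside [-1, 1]. The following is a minimal pure-Python sketch of that idea for the correlation case only. It is not the Cython implementation; the function name `clipped_corr` is hypothetical, and NaN handling and the `minp`/`cov` options are deliberately left out.

```python
import math


def clipped_corr(xs, ys):
    """Pearson correlation of two equal-length sequences, clamped to [-1, 1].

    Hypothetical helper for illustration; variable names mirror the diff.
    """
    n = len(xs)
    meanx = sum(xs) / n
    meany = sum(ys) / n
    # Sum of cross products and sums of squared deviations from the means.
    covxy = sum((x - meanx) * (y - meany) for x, y in zip(xs, ys))
    ssqdmx = sum((x - meanx) ** 2 for x in xs)
    ssqdmy = sum((y - meany) ** 2 for y in ys)
    divisor = math.sqrt(ssqdmx * ssqdmy)
    if divisor == 0:
        # A constant column has no defined correlation.
        return math.nan
    val = covxy / divisor
    # Rounding can push the ratio slightly past the bounds; clamp it.
    if val > 1.0:
        val = 1.0
    elif val < -1.0:
        val = -1.0
    return val
```

Calling `clipped_corr([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])` returns a value that is guaranteed not to exceed `1.0`, whatever the rounding in the intermediate sums.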

pandas/_libs/hashtable_func_helper.pxi.in

Lines changed: 1 addition & 1 deletion
@@ -430,7 +430,7 @@ def mode(ndarray[htfunc_t] values, bint dropna, const uint8_t[:] mask=None):
430430

431431
if na_counter > 0:
432432
res_mask = np.zeros(j+1, dtype=np.bool_)
433-
res_mask[j] = True
433+
res_mask[j] = (na_counter == max_count)
434434
return modes[:j + 1], res_mask
435435

436436
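Note on the hunk above: the changed line marks the trailing result slot as missing only when the NA count equals the maximum count, instead of unconditionally. The sketch below illustrates that condition in plain Python. It is a simplified, hypothetical model (the name `masked_mode`, the list-based mask, and the use of `None` as the NA placeholder are assumptions), not a port of the templated Cython function.

```python
from collections import Counter


def masked_mode(values, mask):
    """Return (modes, res_mask); mask[i] is True where values[i] is missing.

    Hypothetical helper: NA is reported as a mode only when its count
    ties the maximum count, mirroring `na_counter == max_count`.
    """
    # Count only the non-missing values.
    counts = Counter(v for v, is_na in zip(values, mask) if not is_na)
    na_counter = sum(1 for is_na in mask if is_na)
    max_count = max(max(counts.values(), default=0), na_counter)
    modes = sorted(v for v, c in counts.items() if c == max_count)
    res_mask = [False] * len(modes)
    if na_counter > 0 and na_counter == max_count:
        # NA ties for the most frequent entry, so flag one masked slot.
        modes.append(None)
        res_mask.append(True)
    return modes, res_mask
```

With one missing entry and a value that occurs twice, for example `masked_mode([1, 1, 2, 0], [False, False, False, True])`, no slot is flagged as missing, which is the case the unconditional `res_mask[j] = True` got wrong.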
