Commit 1337ccb

Merge branch 'main' of https://github.com/pandas-dev/pandas into bug_groupby_all_na_min_max
2 parents ba0fba4 + b64f438 commit 1337ccb

16 files changed, with 64 additions and 213 deletions.

.github/CODEOWNERS

Lines changed: 0 additions & 2 deletions
@@ -10,10 +10,8 @@ doc/source/development @noatamir
 
 # pandas
 pandas/_libs/ @WillAyd
-pandas/_libs/tslibs/* @MarcoGorelli
 pandas/_typing.py @Dr-Irv
 pandas/core/groupby/* @rhshadrach
-pandas/core/tools/datetimes.py @MarcoGorelli
 pandas/io/excel/* @rhshadrach
 pandas/io/formats/style.py @attack68
 pandas/io/formats/style_render.py @attack68

ci/meta.yaml

Lines changed: 0 additions & 1 deletion
@@ -89,4 +89,3 @@ extra:
     - datapythonista
     - phofl
     - lithomas1
-    - marcogorelli

doc/source/user_guide/basics.rst

Lines changed: 6 additions & 6 deletions
@@ -36,7 +36,7 @@ of elements to display is five, but you may pass a custom number.
 Attributes and underlying data
 ------------------------------
 
-pandas objects have a number of attributes enabling you to access the metadata
+pandas objects have a number of attributes enabling you to access the metadata.
 
 * **shape**: gives the axis dimensions of the object, consistent with ndarray
 * Axis labels
@@ -59,7 +59,7 @@ NumPy's type system to add support for custom arrays
 (see :ref:`basics.dtypes`).
 
 To get the actual data inside a :class:`Index` or :class:`Series`, use
-the ``.array`` property
+the ``.array`` property.
 
 .. ipython:: python
 
@@ -88,18 +88,18 @@ NumPy doesn't have a dtype to represent timezone-aware datetimes, so there
 are two possibly useful representations:
 
 1. An object-dtype :class:`numpy.ndarray` with :class:`Timestamp` objects, each
-   with the correct ``tz``
+   with the correct ``tz``.
 2. A ``datetime64[ns]`` -dtype :class:`numpy.ndarray`, where the values have
-   been converted to UTC and the timezone discarded
+   been converted to UTC and the timezone discarded.
 
-Timezones may be preserved with ``dtype=object``
+Timezones may be preserved with ``dtype=object``:
 
 .. ipython:: python
 
    ser = pd.Series(pd.date_range("2000", periods=2, tz="CET"))
    ser.to_numpy(dtype=object)
 
-Or thrown away with ``dtype='datetime64[ns]'``
+Or thrown away with ``dtype='datetime64[ns]'``:
 
 .. ipython:: python
 

doc/source/whatsnew/v3.0.0.rst

Lines changed: 2 additions & 0 deletions
@@ -673,6 +673,7 @@ Timezones
 
 Numeric
 ^^^^^^^
+- Bug in :meth:`DataFrame.corr` where numerical precision errors resulted in correlations above ``1.0`` (:issue:`61120`)
 - Bug in :meth:`DataFrame.quantile` where the column type was not preserved when ``numeric_only=True`` with a list-like ``q`` produced an empty result (:issue:`59035`)
 - Bug in ``np.matmul`` with :class:`Index` inputs raising a ``TypeError`` (:issue:`57079`)
 
@@ -772,6 +773,7 @@ Groupby/resample/rolling
 - Bug in :meth:`.DataFrameGroupBy.quantile` when ``interpolation="nearest"`` is inconsistent with :meth:`DataFrame.quantile` (:issue:`47942`)
 - Bug in :meth:`.Resampler.interpolate` on a :class:`DataFrame` with non-uniform sampling and/or indices not aligning with the resulting resampled index would result in wrong interpolation (:issue:`21351`)
 - Bug in :meth:`DataFrame.ewm` and :meth:`Series.ewm` when passed ``times`` and aggregation functions other than mean (:issue:`51695`)
+- Bug in :meth:`DataFrame.resample` changing index type to :class:`MultiIndex` when the dataframe is empty and using an upsample method (:issue:`55572`)
 - Bug in :meth:`DataFrameGroupBy.agg` that raises ``AttributeError`` when there is dictionary input and duplicated columns, instead of returning a DataFrame with the aggregation of all duplicate columns. (:issue:`55041`)
 - Bug in :meth:`DataFrameGroupBy.apply` and :meth:`SeriesGroupBy.apply` for empty data frame with ``group_keys=False`` still creating output index using group keys. (:issue:`60471`)
 - Bug in :meth:`DataFrameGroupBy.apply` that was returning a completely empty DataFrame when all return values of ``func`` were ``None`` instead of returning an empty DataFrame with the original columns and dtypes. (:issue:`57775`)

pandas/_libs/algos.pyx

Lines changed: 8 additions & 3 deletions
@@ -353,10 +353,9 @@ def nancorr(const float64_t[:, :] mat, bint cov=False, minp=None):
         float64_t[:, ::1] result
         uint8_t[:, :] mask
         int64_t nobs = 0
-        float64_t vx, vy, dx, dy, meanx, meany, divisor, ssqdmx, ssqdmy, covxy
+        float64_t vx, vy, dx, dy, meanx, meany, divisor, ssqdmx, ssqdmy, covxy, val
 
     N, K = (<object>mat).shape
-
     if minp is None:
         minpv = 1
     else:
@@ -389,8 +388,14 @@ def nancorr(const float64_t[:, :] mat, bint cov=False, minp=None):
                 else:
                     divisor = (nobs - 1.0) if cov else sqrt(ssqdmx * ssqdmy)
 
+                    # clip `covxy / divisor` to ensure coeff is within bounds
                     if divisor != 0:
-                        result[xi, yi] = result[yi, xi] = covxy / divisor
+                        val = covxy / divisor
+                        if val > 1.0:
+                            val = 1.0
+                        elif val < -1.0:
+                            val = -1.0
+                        result[xi, yi] = result[yi, xi] = val
                     else:
                         result[xi, yi] = result[yi, xi] = NaN
 
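
The clamping step in the hunk above can be sketched in plain Python. This is an illustration under stated assumptions, not the Cython code from `nancorr`: the function name `clamped_corr` is hypothetical, the one-pass running-mean update is inferred from the variable names declared in the hunk (`dx`, `meanx`, `ssqdmx`, `covxy`, and so on), and NaN masking and the `minp` check are left out. It covers only the correlation case (`cov=False`).

```python
import math


def clamped_corr(xs, ys):
    """Pearson correlation via a one-pass update, clamped to [-1.0, 1.0].

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    nobs = 0
    meanx = meany = ssqdmx = ssqdmy = covxy = 0.0
    for vx, vy in zip(xs, ys):
        nobs += 1
        # Deviations from the running means *before* this observation.
        dx = vx - meanx
        dy = vy - meany
        meanx += dx / nobs
        meany += dy / nobs
        # Accumulate sums of squared deviations and the co-moment.
        ssqdmx += (vx - meanx) * dx
        ssqdmy += (vy - meany) * dy
        covxy += (vx - meanx) * dy

    divisor = math.sqrt(ssqdmx * ssqdmy)
    if divisor == 0:
        # Mirrors the NaN branch of the hunk: no variance, no coefficient.
        return math.nan

    val = covxy / divisor
    # Rounding error can push the ratio slightly outside [-1, 1]; clamp it.
    return max(-1.0, min(1.0, val))
```

Called with the inputs from the new test, `clamped_corr([0, 1], [1.35951, 1.3595100000000007])`, the result is guaranteed to lie within the closed interval [-1.0, 1.0], whatever rounding occurs in the running sums.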

pandas/_libs/tslibs/period.pyx

Lines changed: 2 additions & 5 deletions
@@ -1752,9 +1752,6 @@ cdef class _Period(PeriodMixin):
     def __cinit__(self, int64_t ordinal, BaseOffset freq):
         self.ordinal = ordinal
         self.freq = freq
-        # Note: this is more performant than PeriodDtype.from_date_offset(freq)
-        #  because from_date_offset cannot be made a cdef method (until cython
-        #  supported cdef classmethods)
         self._dtype = PeriodDtypeBase(freq._period_dtype_code, freq.n)
 
     @classmethod
@@ -1913,7 +1910,7 @@ cdef class _Period(PeriodMixin):
 
         Parameters
         ----------
-        freq : str, BaseOffset
+        freq : str, DateOffset
             The target frequency to convert the Period object to.
             If a string is provided,
             it must be a valid :ref:`period alias <timeseries.period_aliases>`.
@@ -2599,7 +2596,7 @@ cdef class _Period(PeriodMixin):
 
         Parameters
         ----------
-        freq : str, BaseOffset
+        freq : str, DateOffset
             Frequency to use for the returned period.
 
         See Also

pandas/core/resample.py

Lines changed: 12 additions & 11 deletions
@@ -507,22 +507,12 @@ def _wrap_result(self, result):
         """
         Potentially wrap any results.
         """
-        # GH 47705
-        obj = self.obj
-        if (
-            isinstance(result, ABCDataFrame)
-            and len(result) == 0
-            and not isinstance(result.index, PeriodIndex)
-        ):
-            result = result.set_index(
-                _asfreq_compat(obj.index[:0], freq=self.freq), append=True
-            )
-
         if isinstance(result, ABCSeries) and self._selection is not None:
             result.name = self._selection
 
         if isinstance(result, ABCSeries) and result.empty:
             # When index is all NaT, result is empty but index is not
+            obj = self.obj
             result.index = _asfreq_compat(obj.index[:0], freq=self.freq)
             result.name = getattr(obj, "name", None)
 
@@ -1756,6 +1746,17 @@ def func(x):
             return x.apply(f, *args, **kwargs)
 
         result = self._groupby.apply(func)
+
+        # GH 47705
+        if (
+            isinstance(result, ABCDataFrame)
+            and len(result) == 0
+            and not isinstance(result.index, PeriodIndex)
+        ):
+            result = result.set_index(
+                _asfreq_compat(self.obj.index[:0], freq=self.freq), append=True
+            )
+
         return self._wrap_result(result)
 
     _upsample = _apply

pandas/tests/frame/methods/test_cov_corr.py

Lines changed: 12 additions & 0 deletions
@@ -485,3 +485,15 @@ def test_corrwith_min_periods_boolean(self):
         result = df_bool.corrwith(ser_bool, min_periods=3)
         expected = Series([0.57735, 0.57735], index=["A", "B"])
         tm.assert_series_equal(result, expected)
+
+    def test_corr_within_bounds(self):
+        df1 = DataFrame({"x": [0, 1], "y": [1.35951, 1.3595100000000007]})
+        result1 = df1.corr().max().max()
+        expected1 = 1.0
+        tm.assert_equal(result1, expected1)
+
+        rng = np.random.default_rng(seed=42)
+        df2 = DataFrame(rng.random((100, 4)))
+        corr_matrix = df2.corr()
+        assert corr_matrix.min().min() >= -1.0
+        assert corr_matrix.max().max() <= 1.0

pandas/tests/resample/test_base.py

Lines changed: 18 additions & 0 deletions
@@ -438,6 +438,24 @@ def test_resample_size_empty_dataframe(freq, index):
     tm.assert_series_equal(result, expected)
 
 
+@pytest.mark.parametrize("index", [DatetimeIndex([]), TimedeltaIndex([])])
+@pytest.mark.parametrize("freq", ["D", "h"])
+@pytest.mark.parametrize(
+    "method", ["ffill", "bfill", "nearest", "asfreq", "interpolate", "mean"]
+)
+def test_resample_apply_empty_dataframe(index, freq, method):
+    # GH#55572
+    empty_frame_dti = DataFrame(index=index)
+
+    rs = empty_frame_dti.resample(freq)
+    result = rs.apply(getattr(rs, method))
+
+    expected_index = _asfreq_compat(empty_frame_dti.index, freq)
+    expected = DataFrame([], index=expected_index)
+
+    tm.assert_frame_equal(result, expected)
+
+
 @pytest.mark.parametrize(
     "index",
     [

web/pandas/config.yml

Lines changed: 2 additions & 27 deletions
@@ -84,7 +84,6 @@ maintainers:
     - alimcmaster1
     - bashtage
     - Dr-Irv
-    - MarcoGorelli
     - rhshadrach
     - phofl
     - attack68
@@ -108,6 +107,7 @@ maintainers:
     - gfyoung
     - mzeitlin11
     - twoertwein
+    - MarcoGorelli
 workgroups:
   coc:
     name: Code of Conduct
@@ -139,53 +139,28 @@ workgroups:
 
     responsibilities: "Share relevant information with the broader community, mainly via our social networks, as well as being the main point of contact between NumFOCUS and the core team."
     members:
-      - Marco Gorelli
+      - datapythonista
 sponsors:
   active:
     - name: "NumFOCUS"
       url: https://numfocus.org/
       logo: static/img/partners/numfocus.svg
       kind: numfocus
-    - name: "Two Sigma"
-      url: https://www.twosigma.com/
-      logo: static/img/partners/two_sigma.svg
-      kind: partner
-      description: "Jeff Reback"
-    - name: "Voltron Data"
-      url: https://voltrondata.com/
-      logo: static/img/partners/voltron_data.svg
-      kind: partner
-      description: "Joris Van den Bossche"
     - name: "Coiled"
       url: https://www.coiled.io
       logo: static/img/partners/coiled.svg
       kind: partner
       description: "Patrick Hoefler"
-    - name: "Quansight"
-      url: https://quansight.com/
-      logo: static/img/partners/quansight_labs.svg
-      kind: partner
-      description: "Marco Gorelli"
     - name: "Nvidia"
       url: https://www.nvidia.com
       logo: static/img/partners/nvidia.svg
       kind: partner
       description: "Matthew Roeschke"
-    - name: "Intel"
-      url: https://www.intel.com/
-      logo: /static/img/partners/intel.svg
-      kind: partner
-      description: "Brock Mendel"
     - name: "Tidelift"
       url: https://tidelift.com
       logo: static/img/partners/tidelift.svg
       kind: regular
       description: "<i>pandas</i> is part of the <a href=\"https://tidelift.com/subscription/pkg/pypi-pandas?utm_source=pypi-pandas&utm_medium=referral&utm_campaign=readme\">Tidelift subscription</a>. You can support pandas by becoming a Tidelift subscriber."
-    - name: "Chan Zuckerberg Initiative"
-      url: https://chanzuckerberg.com/
-      logo: static/img/partners/czi.svg
-      kind: regular
-      description: "<i>pandas</i> is funded by the Essential Open Source Software for Science program of the Chan Zuckerberg Initiative. The funding is used for general maintenance, improve extension types, and a efficient string type."
     - name: "Bodo"
       url: https://www.bodo.ai/
       logo: static/img/partners/bodo.svg
