Commit d3be415

Merge branch 'main' into bugfix-spss-kwargs
2 parents d0b6abc + e158765 commit d3be415

25 files changed: +230 -121 lines changed

ci/code_checks.sh

Lines changed: 0 additions & 4 deletions
@@ -80,10 +80,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 pandas.io.formats.style.Styler.to_latex \
 pandas.read_parquet \
 pandas.DataFrame.to_sql \
-pandas.io.formats.style.Styler.map \
-pandas.io.formats.style.Styler.apply_index \
-pandas.io.formats.style.Styler.map_index \
-pandas.io.formats.style.Styler.format \
 RET=$(($RET + $?)) ; echo $MSG "DONE"

 fi

ci/deps/actions-39-minimum_versions.yaml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ dependencies:

 # required dependencies
 - python-dateutil=2.8.2
-- numpy=1.22.4
+- numpy=1.23.5
 - pytz=2020.1

 # optional dependencies

doc/source/getting_started/install.rst

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ Instructions for installing :ref:`from source <install.source>`,
 Python version support
 ----------------------

-Officially Python 3.9, 3.10 and 3.11.
+Officially Python 3.9, 3.10, 3.11 and 3.12.

 Installing pandas
 -----------------
@@ -203,7 +203,7 @@ pandas requires the following dependencies.
 ================================================================ ==========================
 Package                                                          Minimum supported version
 ================================================================ ==========================
-`NumPy <https://numpy.org>`__                                    1.22.4
+`NumPy <https://numpy.org>`__                                    1.23.5
 `python-dateutil <https://dateutil.readthedocs.io/en/stable/>`__ 2.8.2
 `pytz <https://pypi.org/project/pytz/>`__                        2020.1
 `tzdata <https://pypi.org/project/tzdata/>`__                    2022.7

doc/source/whatsnew/v2.1.4.rst

Lines changed: 1 addition & 1 deletion
@@ -42,4 +42,4 @@ Bug fixes
 Contributors
 ~~~~~~~~~~~~

-.. contributors:: v2.1.3..v2.1.4|HEAD
+.. contributors:: v2.1.3..v2.1.4

doc/source/whatsnew/v2.2.0.rst

Lines changed: 10 additions & 14 deletions
@@ -1,7 +1,7 @@
 .. _whatsnew_220:

-What's new in 2.2.0 (Month XX, 2024)
-------------------------------------
+What's new in 2.2.0 (January 19, 2024)
+--------------------------------------

 These are the changes in pandas 2.2.0. See :ref:`release` for a full changelog
 including other versions of pandas.
@@ -436,12 +436,6 @@ index levels when joining on two indexes with different levels (:issue:`34133`).

 result

-.. ---------------------------------------------------------------------------
-.. _whatsnew_220.api_breaking:
-
-Backwards incompatible API changes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
 .. _whatsnew_220.api_breaking.deps:

 Increased minimum versions for dependencies
@@ -820,7 +814,7 @@ Conversion
 - Bug in :meth:`DataFrame.astype` when called with ``str`` on unpickled array - the array might change in-place (:issue:`54654`)
 - Bug in :meth:`DataFrame.astype` where ``errors="ignore"`` had no effect for extension types (:issue:`54654`)
 - Bug in :meth:`Series.convert_dtypes` not converting all NA column to ``null[pyarrow]`` (:issue:`55346`)
-- Bug in ``DataFrame.loc`` was not throwing "incompatible dtype warning" (see `PDEP6 <https://pandas.pydata.org/pdeps/0006-ban-upcasting.html>`_) when assigning a ``Series`` with a different dtype using a full column setter (e.g. ``df.loc[:, 'a'] = incompatible_value``) (:issue:`39584`)
+- Bug in :meth:``DataFrame.loc`` was not throwing "incompatible dtype warning" (see `PDEP6 <https://pandas.pydata.org/pdeps/0006-ban-upcasting.html>`_) when assigning a ``Series`` with a different dtype using a full column setter (e.g. ``df.loc[:, 'a'] = incompatible_value``) (:issue:`39584`)

 Strings
 ^^^^^^^
@@ -830,10 +824,10 @@ Strings
 - Bug in :meth:`Index.str.cat` always casting result to object dtype (:issue:`56157`)
 - Bug in :meth:`Series.__mul__` for :class:`ArrowDtype` with ``pyarrow.string`` dtype and ``string[pyarrow]`` for the pyarrow backend (:issue:`51970`)
 - Bug in :meth:`Series.str.find` when ``start < 0`` for :class:`ArrowDtype` with ``pyarrow.string`` (:issue:`56411`)
+- Bug in :meth:`Series.str.fullmatch` when ``dtype=pandas.ArrowDtype(pyarrow.string()))`` allows partial matches when regex ends in literal //$ (:issue:`56652`)
 - Bug in :meth:`Series.str.replace` when ``n < 0`` for :class:`ArrowDtype` with ``pyarrow.string`` (:issue:`56404`)
 - Bug in :meth:`Series.str.startswith` and :meth:`Series.str.endswith` with arguments of type ``tuple[str, ...]`` for :class:`ArrowDtype` with ``pyarrow.string`` dtype (:issue:`56579`)
 - Bug in :meth:`Series.str.startswith` and :meth:`Series.str.endswith` with arguments of type ``tuple[str, ...]`` for ``string[pyarrow]`` (:issue:`54942`)
-- Bug in :meth:`str.fullmatch` when ``dtype=pandas.ArrowDtype(pyarrow.string()))`` allows partial matches when regex ends in literal //$ (:issue:`56652`)
 - Bug in comparison operations for ``dtype="string[pyarrow_numpy]"`` raising if dtypes can't be compared (:issue:`56008`)

 Interval
@@ -893,7 +887,6 @@ Plotting

 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
-- Bug in :class:`.Rolling` where duplicate datetimelike indexes are treated as consecutive rather than equal with ``closed='left'`` and ``closed='neither'`` (:issue:`20712`)
 - Bug in :meth:`.DataFrameGroupBy.idxmin`, :meth:`.DataFrameGroupBy.idxmax`, :meth:`.SeriesGroupBy.idxmin`, and :meth:`.SeriesGroupBy.idxmax` would not retain :class:`.Categorical` dtype when the index was a :class:`.CategoricalIndex` that contained NA values (:issue:`54234`)
 - Bug in :meth:`.DataFrameGroupBy.transform` and :meth:`.SeriesGroupBy.transform` when ``observed=False`` and ``f="idxmin"`` or ``f="idxmax"`` would incorrectly raise on unobserved categories (:issue:`54234`)
 - Bug in :meth:`.DataFrameGroupBy.value_counts` and :meth:`.SeriesGroupBy.value_counts` could result in incorrect sorting if the columns of the DataFrame or name of the Series are integers (:issue:`55951`)
@@ -907,6 +900,7 @@ Groupby/resample/rolling
 - Bug in :meth:`DataFrame.resample` when resampling on a :class:`ArrowDtype` of ``pyarrow.timestamp`` or ``pyarrow.duration`` type (:issue:`55989`)
 - Bug in :meth:`DataFrame.resample` where bin edges were not correct for :class:`~pandas.tseries.offsets.BusinessDay` (:issue:`55281`)
 - Bug in :meth:`DataFrame.resample` where bin edges were not correct for :class:`~pandas.tseries.offsets.MonthBegin` (:issue:`55271`)
+- Bug in :meth:`DataFrame.rolling` and :meth:`Series.rolling` where duplicate datetimelike indexes are treated as consecutive rather than equal with ``closed='left'`` and ``closed='neither'`` (:issue:`20712`)
 - Bug in :meth:`DataFrame.rolling` and :meth:`Series.rolling` where either the ``index`` or ``on`` column was :class:`ArrowDtype` with ``pyarrow.timestamp`` type (:issue:`55849`)

 Reshaping
@@ -928,27 +922,29 @@ Reshaping

 Sparse
 ^^^^^^
-- Bug in :meth:`SparseArray.take` when using a different fill value than the array's fill value (:issue:`55181`)
+- Bug in :meth:`arrays.SparseArray.take` when using a different fill value than the array's fill value (:issue:`55181`)

 Other
 ^^^^^
 - :meth:`DataFrame.__dataframe__` did not support pyarrow large strings (:issue:`56702`)
 - Bug in :func:`DataFrame.describe` when formatting percentiles in the resulting percentile 99.999% is rounded to 100% (:issue:`55765`)
+- Bug in :func:`api.interchange.from_dataframe` where it raised ``NotImplementedError`` when handling empty string columns (:issue:`56703`)
 - Bug in :func:`cut` and :func:`qcut` with ``datetime64`` dtype values with non-nanosecond units incorrectly returning nanosecond-unit bins (:issue:`56101`)
 - Bug in :func:`cut` incorrectly allowing cutting of timezone-aware datetimes with timezone-naive bins (:issue:`54964`)
 - Bug in :func:`infer_freq` and :meth:`DatetimeIndex.inferred_freq` with weekly frequencies and non-nanosecond resolutions (:issue:`55609`)
-- Bug in :func:`pd.api.interchange.from_dataframe` where it raised ``NotImplementedError`` when handling empty string columns (:issue:`56703`)
 - Bug in :meth:`DataFrame.apply` where passing ``raw=True`` ignored ``args`` passed to the applied function (:issue:`55009`)
 - Bug in :meth:`DataFrame.from_dict` which would always sort the rows of the created :class:`DataFrame`. (:issue:`55683`)
 - Bug in :meth:`DataFrame.sort_index` when passing ``axis="columns"`` and ``ignore_index=True`` raising a ``ValueError`` (:issue:`56478`)
 - Bug in rendering ``inf`` values inside a :class:`DataFrame` with the ``use_inf_as_na`` option enabled (:issue:`55483`)
 - Bug in rendering a :class:`Series` with a :class:`MultiIndex` when one of the index level's names is 0 not having that name displayed (:issue:`55415`)
 - Bug in the error message when assigning an empty :class:`DataFrame` to a column (:issue:`55956`)
 - Bug when time-like strings were being cast to :class:`ArrowDtype` with ``pyarrow.time64`` type (:issue:`56463`)
-- Fixed a spurious deprecation warning from ``numba`` >= 0.58.0 when passing a numpy ufunc in :class:`pandas.core.window.Rolling.apply` with ``engine="numba"`` (:issue:`55247`)
+- Fixed a spurious deprecation warning from ``numba`` >= 0.58.0 when passing a numpy ufunc in :class:`core.window.Rolling.apply` with ``engine="numba"`` (:issue:`55247`)

 .. ---------------------------------------------------------------------------
 .. _whatsnew_220.contributors:

 Contributors
 ~~~~~~~~~~~~
+
+.. contributors:: v2.1.4..v2.2.0|HEAD

doc/source/whatsnew/v2.3.0.rst

Lines changed: 3 additions & 2 deletions
@@ -65,7 +65,7 @@ If installed, we now require:
 +-----------------+-----------------+----------+---------+
 | Package         | Minimum Version | Required | Changed |
 +=================+=================+==========+=========+
-|                 |                 |    X     |    X    |
+| numpy           | 1.23.5          |    X     |    X    |
 +-----------------+-----------------+----------+---------+

 For `optional libraries <https://pandas.pydata.org/docs/getting_started/install.html>`_ the general recommendation is to use the latest version.
@@ -103,6 +103,7 @@ Performance improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~
 - Performance improvement in :meth:`DataFrame.join` for sorted but non-unique indexes (:issue:`56941`)
 - Performance improvement in :meth:`DataFrame.join` when left and/or right are non-unique and ``how`` is ``"left"``, ``"right"``, or ``"inner"`` (:issue:`56817`)
+- Performance improvement in :meth:`DataFrame.join` with ``how="left"`` or ``how="right"`` and ``sort=True`` (:issue:`56919`)
 - Performance improvement in :meth:`DataFrameGroupBy.ffill`, :meth:`DataFrameGroupBy.bfill`, :meth:`SeriesGroupBy.ffill`, and :meth:`SeriesGroupBy.bfill` (:issue:`56902`)
 - Performance improvement in :meth:`Index.take` when ``indices`` is a full range indexer from zero to length of index (:issue:`56806`)
 -
@@ -187,7 +188,7 @@ Plotting

 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
--
+- Bug in :meth:`.DataFrameGroupBy.quantile` when ``interpolation="nearest"`` is inconsistent with :meth:`DataFrame.quantile` (:issue:`47942`)
 -

 Reshaping

pandas/_libs/groupby.pyx

Lines changed: 3 additions & 1 deletion
@@ -1286,7 +1286,9 @@ def group_quantile(
 elif interp == INTERPOLATION_MIDPOINT:
     out[i, k] = (val + next_val) / 2.0
 elif interp == INTERPOLATION_NEAREST:
-    if frac > .5 or (frac == .5 and q_val > .5):  # Always OK?
+    if frac > .5 or (frac == .5 and idx % 2 == 1):
+        # If quantile lies in the middle of two indexes,
+        # take the even index, as np.quantile.
         out[i, k] = next_val
     else:
         out[i, k] = val
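The tie-breaking rule in the patched branch can be sketched in plain Python. This is only an illustration under stated assumptions: `nearest_quantile` is a hypothetical helper, it takes an already sorted list, and it ignores the grouping and NA handling that the Cython `group_quantile` performs. It shows an exact tie (`frac == 0.5`) resolving to the even index, which is the behaviour the patch comment attributes to `np.quantile`.

```python
import math


def nearest_quantile(sorted_values, q):
    """Pick the 'nearest' quantile from pre-sorted values.

    Hypothetical helper: a pure-Python sketch of the tie-breaking rule
    in the patched branch, not the Cython implementation itself.
    """
    if not sorted_values:
        raise ValueError("need at least one value")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")

    pos = q * (len(sorted_values) - 1)  # fractional position of the quantile
    idx = math.floor(pos)               # index just below that position
    frac = pos - idx                    # distance past that index

    # Step up when clearly closer to the next value, or on an exact tie
    # when the lower index is odd (so the even index wins the tie).
    if frac > 0.5 or (frac == 0.5 and idx % 2 == 1):
        return sorted_values[idx + 1]
    return sorted_values[idx]


# Position 1.5 lies between indexes 1 and 2; the even index 2 is chosen.
print(nearest_quantile([1, 2, 3, 4], 0.5))
# Position 0.5 lies between indexes 0 and 1; the even index 0 is chosen.
print(nearest_quantile([1, 2], 0.5))
```

Under the removed condition (`q_val > .5`), a tie at `q = 0.5` always kept the lower value, so the first call above would have stayed at index 1 instead of moving to index 2.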

pandas/compat/_optional.py

Lines changed: 3 additions & 2 deletions
@@ -147,9 +147,8 @@ def import_optional_dependency(
         The imported module, when found and the version is correct.
         None is returned when the package is not found and `errors`
         is False, or when the package's version is too old and `errors`
-        is ``'warn'``.
+        is ``'warn'`` or ``'ignore'``.
     """
-
     assert errors in {"warn", "raise", "ignore"}

     package_name = INSTALL_MAPPING.get(name)
@@ -190,5 +189,7 @@ def import_optional_dependency(
                 return None
             elif errors == "raise":
                 raise ImportError(msg)
+            else:
+                return None

     return module
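The three `errors` modes touched by this hunk can be illustrated with a simplified stand-in. Everything here is an assumption made for illustration: `import_optional` and `_as_tuple` are hypothetical names, and the version parsing is a naive integer split rather than the `Version` class pandas uses. The point is the added `else` branch: a too-old module now yields `None` under `errors="ignore"`, where the old if/elif chain fell through to `return module`.

```python
import importlib
import warnings


def _as_tuple(version):
    """Parse 'X.Y.Z' into a tuple of ints (simplified; real versions can be messier)."""
    return tuple(int(part) for part in version.split("."))


def import_optional(name, min_version=None, errors="raise"):
    """Hypothetical, simplified stand-in for import_optional_dependency."""
    assert errors in {"warn", "raise", "ignore"}

    try:
        module = importlib.import_module(name)
    except ImportError:
        if errors == "raise":
            raise
        return None  # missing package: "warn" and "ignore" both give None

    found = getattr(module, "__version__", None)
    if min_version and found and _as_tuple(found) < _as_tuple(min_version):
        msg = f"{name} >= {min_version} is required (found {found})"
        if errors == "warn":
            warnings.warn(msg, UserWarning)
            return None
        elif errors == "raise":
            raise ImportError(msg)
        else:
            # errors == "ignore": a too-old version is silently treated as missing
            return None

    return module
```

A caller that passes `errors="ignore"` can then rely on a single `is None` check to cover both a missing package and one that is installed but too old.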

pandas/compat/numpy/__init__.py

Lines changed: 1 addition & 2 deletions
@@ -8,13 +8,12 @@
 # numpy versioning
 _np_version = np.__version__
 _nlv = Version(_np_version)
-np_version_lt1p23 = _nlv < Version("1.23")
 np_version_gte1p24 = _nlv >= Version("1.24")
 np_version_gte1p24p3 = _nlv >= Version("1.24.3")
 np_version_gte1p25 = _nlv >= Version("1.25")
 np_version_gt2 = _nlv >= Version("2.0.0.dev0")
 is_numpy_dev = _nlv.dev is not None
-_min_numpy_ver = "1.22.4"
+_min_numpy_ver = "1.23.5"


 if _nlv < Version(_min_numpy_ver):
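A short sketch of why `np_version_lt1p23` can be dropped once the floor moves to 1.23.5. The names `parse`, `MIN_NUMPY`, and `check_numpy` are hypothetical, the parser is a naive integer split rather than the `Version` class used above, and the hunk is cut off before the body of the guard, so raising `ImportError` there is an assumption.

```python
def parse(version):
    """Simplified parser: 'X.Y.Z' -> tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in version.split("."))


MIN_NUMPY = parse("1.23.5")  # the new floor from this commit


def check_numpy(installed):
    """Reject versions below the floor, then compute feature flags.

    Assumes the import-time guard raises ImportError; the hunk does not
    show its body.
    """
    if parse(installed) < MIN_NUMPY:
        raise ImportError(f"numpy >= 1.23.5 is required, found {installed}")
    # Any version that passes the floor can never be below 1.23,
    # so a 'less than 1.23' flag would always be False from here on.
    return {
        "lt1p23": parse(installed) < parse("1.23"),
        "gte1p24": parse(installed) >= parse("1.24"),
    }
```

Every version that survives the guard reports `lt1p23` as `False`, which makes the removed flag dead code.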

pandas/core/frame.py

Lines changed: 27 additions & 0 deletions
@@ -988,6 +988,33 @@ def __dataframe_consortium_standard__(
         )
         return convert_to_standard_compliant_dataframe(self, api_version=api_version)

+    def __arrow_c_stream__(self, requested_schema=None):
+        """
+        Export the pandas DataFrame as an Arrow C stream PyCapsule.
+
+        This relies on pyarrow to convert the pandas DataFrame to the Arrow
+        format (and follows the default behaviour of ``pyarrow.Table.from_pandas``
+        in its handling of the index, i.e. store the index as a column except
+        for RangeIndex).
+        This conversion is not necessarily zero-copy.
+
+        Parameters
+        ----------
+        requested_schema : PyCapsule, default None
+            The schema to which the dataframe should be casted, passed as a
+            PyCapsule containing a C ArrowSchema representation of the
+            requested schema.
+
+        Returns
+        -------
+        PyCapsule
+        """
+        pa = import_optional_dependency("pyarrow", min_version="14.0.0")
+        if requested_schema is not None:
+            requested_schema = pa.Schema._import_from_c_capsule(requested_schema)
+        table = pa.Table.from_pandas(self, schema=requested_schema)
+        return table.__arrow_c_stream__()
+
     # ----------------------------------------------------------------------

     @property
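A possible way to exercise the new method from the consumer side is sketched below. It assumes pandas with this change and pyarrow 14 or later are installed, and that `pyarrow.table` accepts any object exposing `__arrow_c_stream__`. `CapsuleOnly` and `arrow_columns_via_capsule` are hypothetical names introduced here so that pyarrow cannot take its usual DataFrame path. The function returns `None` when the dependencies are unavailable.

```python
def arrow_columns_via_capsule():
    """Round-trip a small DataFrame through __arrow_c_stream__.

    Returns the Arrow column names, or None when the optional
    dependencies (or new enough versions of them) are unavailable.
    """
    try:
        import pandas as pd
        import pyarrow as pa
    except ImportError:
        return None
    if int(pa.__version__.split(".")[0]) < 14:
        return None
    if not hasattr(pd.DataFrame, "__arrow_c_stream__"):
        return None

    class CapsuleOnly:
        """Hypothetical wrapper exposing nothing but the stream protocol."""

        def __init__(self, frame):
            self._frame = frame

        def __arrow_c_stream__(self, requested_schema=None):
            # Delegate to the DataFrame method added in this commit.
            return self._frame.__arrow_c_stream__(requested_schema)

    df = pd.DataFrame({"a": [1, 2, 3]})
    table = pa.table(CapsuleOnly(df))  # consumes the PyCapsule stream
    return table.column_names


print(arrow_columns_via_capsule())
```

Because the frame has a default RangeIndex, the docstring's rule means the index should not appear as an extra column in the resulting table.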
