Commit 2ce3d3d

Merge branch 'main' into fix-string-frame-methods
2 parents d2d1774 + a68048e commit 2ce3d3d

57 files changed: +966 -446 lines changed

.circleci/config.yml

Lines changed: 0 additions & 155 deletions
This file was deleted.

.gitattributes

Lines changed: 0 additions & 1 deletion
@@ -61,7 +61,6 @@ pandas/_version.py export-subst
 *.pxi export-ignore

 # Ignoring stuff from the top level
-.circleci export-ignore
 .github export-ignore
 asv_bench export-ignore
 ci export-ignore

ci/code_checks.sh

Lines changed: 0 additions & 4 deletions
@@ -82,7 +82,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 -i "pandas.core.groupby.DataFrameGroupBy.plot PR02" \
 -i "pandas.core.groupby.SeriesGroupBy.plot PR02" \
 -i "pandas.core.resample.Resampler.quantile PR01,PR07" \
--i "pandas.core.resample.Resampler.transform PR01,RT03,SA01" \
 -i "pandas.tseries.offsets.BDay PR02,SA01" \
 -i "pandas.tseries.offsets.BQuarterBegin.is_on_offset GL08" \
 -i "pandas.tseries.offsets.BQuarterBegin.n GL08" \
@@ -146,15 +145,13 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin PR02" \
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin.calendar GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin.holidays GL08" \
--i "pandas.tseries.offsets.CustomBusinessMonthBegin.is_on_offset SA01" \
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin.m_offset GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin.n GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin.normalize GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthBegin.weekmask GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthEnd PR02" \
 -i "pandas.tseries.offsets.CustomBusinessMonthEnd.calendar GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthEnd.holidays GL08" \
--i "pandas.tseries.offsets.CustomBusinessMonthEnd.is_on_offset SA01" \
 -i "pandas.tseries.offsets.CustomBusinessMonthEnd.m_offset GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthEnd.n GL08" \
 -i "pandas.tseries.offsets.CustomBusinessMonthEnd.normalize GL08" \
@@ -191,7 +188,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 -i "pandas.tseries.offsets.Hour.is_on_offset GL08" \
 -i "pandas.tseries.offsets.Hour.n GL08" \
 -i "pandas.tseries.offsets.Hour.normalize GL08" \
--i "pandas.tseries.offsets.LastWeekOfMonth SA01" \
 -i "pandas.tseries.offsets.LastWeekOfMonth.is_on_offset GL08" \
 -i "pandas.tseries.offsets.LastWeekOfMonth.n GL08" \
 -i "pandas.tseries.offsets.LastWeekOfMonth.normalize GL08" \

ci/deps/circle-311-arm64.yaml

Lines changed: 0 additions & 61 deletions
This file was deleted.

doc/source/getting_started/tutorials.rst

Lines changed: 1 addition & 1 deletion
@@ -112,7 +112,7 @@ Various tutorials

 * `Wes McKinney's (pandas BDFL) blog <https://wesmckinney.com/archives.html>`_
 * `Statistical analysis made easy in Python with SciPy and pandas DataFrames, by Randal Olson <http://www.randalolson.com/2012/08/06/statistical-analysis-made-easy-in-python/>`_
-* `Statistical Data Analysis in Python, tutorial videos, by Christopher Fonnesbeck from SciPy 2013 <https://conference.scipy.org/scipy2013/tutorial_detail.php?id=109>`_
+* `Statistical Data Analysis in Python, tutorial by Christopher Fonnesbeck from SciPy 2013 <https://github.com/fonnesbeck/statistical-analysis-python-tutorial>`_
 * `Financial analysis in Python, by Thomas Wiecki <https://nbviewer.org/github/twiecki/financial-analysis-python-tutorial/blob/master/1.%20Pandas%20Basics.ipynb>`_
 * `Intro to pandas data structures, by Greg Reda <http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/>`_
 * `Pandas DataFrames Tutorial, by Karlijn Willems <https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python>`_

doc/source/reference/style.rst

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,7 @@ Styler properties
 Styler.template_html_style
 Styler.template_html_table
 Styler.template_latex
+Styler.template_typst
 Styler.template_string
 Styler.loader

@@ -77,6 +78,7 @@ Style export and import

 Styler.to_html
 Styler.to_latex
+Styler.to_typst
 Styler.to_excel
 Styler.to_string
 Styler.export

doc/source/user_guide/gotchas.rst

Lines changed: 1 addition & 1 deletion
@@ -372,5 +372,5 @@ constructors using something similar to the following:
 s = pd.Series(newx)

 See `the NumPy documentation on byte order
-<https://numpy.org/doc/stable/user/basics.byteswapping.html>`__ for more
+<https://numpy.org/doc/stable/user/byteswapping.html>`__ for more
 details.

doc/source/whatsnew/v2.3.0.rst

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ Other enhancements
 - The semantics for the ``copy`` keyword in ``__array__`` methods (i.e. called
 when using ``np.array()`` or ``np.asarray()`` on pandas objects) has been
 updated to work correctly with NumPy >= 2 (:issue:`57739`)
+- :meth:`Series.str.decode` result now has ``StringDtype`` when ``future.infer_string`` is True (:issue:`60709`)
+- :meth:`~Series.to_hdf` and :meth:`~DataFrame.to_hdf` now round-trip with ``StringDtype`` (:issue:`60663`)
 - The :meth:`~Series.cumsum`, :meth:`~Series.cummin`, and :meth:`~Series.cummax` reductions are now implemented for ``StringDtype`` columns when backed by PyArrow (:issue:`60633`)
 - The :meth:`~Series.sum` reduction is now implemented for ``StringDtype`` columns (:issue:`59853`)

doc/source/whatsnew/v3.0.0.rst

Lines changed: 5 additions & 2 deletions
@@ -31,6 +31,7 @@ Other enhancements
 - :class:`pandas.api.typing.FrozenList` is available for typing the outputs of :attr:`MultiIndex.names`, :attr:`MultiIndex.codes` and :attr:`MultiIndex.levels` (:issue:`58237`)
 - :class:`pandas.api.typing.SASReader` is available for typing the output of :func:`read_sas` (:issue:`55689`)
 - :meth:`pandas.api.interchange.from_dataframe` now uses the `PyCapsule Interface <https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html>`_ if available, only falling back to the Dataframe Interchange Protocol if that fails (:issue:`60739`)
+- Added :meth:`.Styler.to_typst` to write Styler objects to file, buffer or string in Typst format (:issue:`57617`)
 - :class:`pandas.api.typing.NoDefault` is available for typing ``no_default``
 - :func:`DataFrame.to_excel` now raises an ``UserWarning`` when the character count in a cell exceeds Excel's limitation of 32767 characters (:issue:`56954`)
 - :func:`pandas.merge` now validates the ``how`` parameter input (merge type) (:issue:`59435`)
@@ -58,15 +59,15 @@ Other enhancements
 - :meth:`Series.cummin` and :meth:`Series.cummax` now supports :class:`CategoricalDtype` (:issue:`52335`)
 - :meth:`Series.plot` now correctly handle the ``ylabel`` parameter for pie charts, allowing for explicit control over the y-axis label (:issue:`58239`)
 - :meth:`DataFrame.plot.scatter` argument ``c`` now accepts a column of strings, where rows with the same string are colored identically (:issue:`16827` and :issue:`16485`)
+- :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` methods ``sum``, ``mean``, ``median``, ``prod``, ``min``, ``max``, ``std``, ``var`` and ``sem`` now accept ``skipna`` parameter (:issue:`15675`)
 - :class:`Rolling` and :class:`Expanding` now support aggregations ``first`` and ``last`` (:issue:`33155`)
 - :func:`read_parquet` accepts ``to_pandas_kwargs`` which are forwarded to :meth:`pyarrow.Table.to_pandas` which enables passing additional keywords to customize the conversion to pandas, such as ``maps_as_pydicts`` to read the Parquet map data type as python dictionaries (:issue:`56842`)
-- :meth:`.DataFrameGroupBy.mean`, :meth:`.DataFrameGroupBy.sum`, :meth:`.SeriesGroupBy.mean` and :meth:`.SeriesGroupBy.sum` now accept ``skipna`` parameter (:issue:`15675`)
 - :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
 - :meth:`DataFrameGroupBy.transform`, :meth:`SeriesGroupBy.transform`, :meth:`DataFrameGroupBy.agg`, :meth:`SeriesGroupBy.agg`, :meth:`RollingGroupby.apply`, :meth:`ExpandingGroupby.apply`, :meth:`Rolling.apply`, :meth:`Expanding.apply`, :meth:`DataFrame.apply` with ``engine="numba"`` now supports positional arguments passed as kwargs (:issue:`58995`)
 - :meth:`Rolling.agg`, :meth:`Expanding.agg` and :meth:`ExponentialMovingWindow.agg` now accept :class:`NamedAgg` aggregations through ``**kwargs`` (:issue:`28333`)
 - :meth:`Series.map` can now accept kwargs to pass on to func (:issue:`59814`)
+- :meth:`Series.str.get_dummies` now accepts a ``dtype`` parameter to specify the dtype of the resulting DataFrame (:issue:`47872`)
 - :meth:`pandas.concat` will raise a ``ValueError`` when ``ignore_index=True`` and ``keys`` is not ``None`` (:issue:`59274`)
-- :meth:`str.get_dummies` now accepts a ``dtype`` parameter to specify the dtype of the resulting DataFrame (:issue:`47872`)
 - Implemented :meth:`Series.str.isascii` and :meth:`Series.str.isascii` (:issue:`59091`)
 - Multiplying two :class:`DateOffset` objects will now raise a ``TypeError`` instead of a ``RecursionError`` (:issue:`59442`)
 - Restore support for reading Stata 104-format and enable reading 103-format dta files (:issue:`58554`)
@@ -758,6 +759,7 @@ Groupby/resample/rolling
 Reshaping
 ^^^^^^^^^
 - Bug in :func:`qcut` where values at the quantile boundaries could be incorrectly assigned (:issue:`59355`)
+- Bug in :meth:`DataFrame.combine_first` not preserving the column order (:issue:`60427`)
 - Bug in :meth:`DataFrame.join` inconsistently setting result index name (:issue:`55815`)
 - Bug in :meth:`DataFrame.join` when a :class:`DataFrame` with a :class:`MultiIndex` would raise an ``AssertionError`` when :attr:`MultiIndex.names` contained ``None``. (:issue:`58721`)
 - Bug in :meth:`DataFrame.merge` where merging on a column containing only ``NaN`` values resulted in an out-of-bounds array access (:issue:`59421`)
@@ -804,6 +806,7 @@ Other
 - Bug in :meth:`Index.sort_values` when passing a key function that turns values into tuples, e.g. ``key=natsort.natsort_key``, would raise ``TypeError`` (:issue:`56081`)
 - Bug in :meth:`Series.diff` allowing non-integer values for the ``periods`` argument. (:issue:`56607`)
 - Bug in :meth:`Series.dt` methods in :class:`ArrowDtype` that were returning incorrect values. (:issue:`57355`)
+- Bug in :meth:`Series.isin` raising ``TypeError`` when series is large (>10**6) and ``values`` contains NA (:issue:`60678`)
 - Bug in :meth:`Series.rank` that doesn't preserve missing values for nullable integers when ``na_option='keep'``. (:issue:`56976`)
 - Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` inconsistently replacing matching instances when ``regex=True`` and missing values are present. (:issue:`56599`)
 - Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` throwing ``ValueError`` when ``regex=True`` and all NA values. (:issue:`60688`)
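
Reviewer note on the ``skipna`` entry above: the changelog says the groupby reductions now accept ``skipna``, but not what the flag changes. The following is a pure-Python sketch of the intended semantics for a grouped sum, assuming the conventional meaning of the flag (``True`` ignores NaN, ``False`` lets a NaN propagate to its group's result). The function name `grouped_sum` is hypothetical and this is not pandas' implementation.

```python
import math
from collections import defaultdict


def grouped_sum(keys, values, skipna=True):
    """Sum `values` per key (hypothetical helper, not a pandas API).

    skipna=True  -> NaN entries are ignored.
    skipna=False -> a single NaN makes that group's total NaN.
    """
    totals = defaultdict(float)
    for key, value in zip(keys, values):
        current = totals[key]  # registers the key even if every value is NaN
        if math.isnan(value):
            if skipna:
                continue
            totals[key] = math.nan
        else:
            # If the total is already NaN, NaN + value stays NaN.
            totals[key] = current + value
    return dict(totals)


keys = ["a", "a", "b", "b"]
values = [1.0, float("nan"), 2.0, 3.0]

with_skip = grouped_sum(keys, values, skipna=True)
without_skip = grouped_sum(keys, values, skipna=False)
```

In this sketch only group ``"a"`` differs between the two calls, since it is the only group that contains a NaN.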

pandas/_libs/groupby.pyi

Lines changed: 5 additions & 0 deletions
@@ -13,6 +13,7 @@ def group_median_float64(
 mask: np.ndarray | None = ...,
 result_mask: np.ndarray | None = ...,
 is_datetimelike: bool = ..., # bint
+skipna: bool = ...,
 ) -> None: ...
 def group_cumprod(
 out: np.ndarray, # float64_t[:, ::1]
@@ -76,6 +77,7 @@ def group_prod(
 mask: np.ndarray | None,
 result_mask: np.ndarray | None = ...,
 min_count: int = ...,
+skipna: bool = ...,
 ) -> None: ...
 def group_var(
 out: np.ndarray, # floating[:, ::1]
@@ -88,6 +90,7 @@ def group_var(
 result_mask: np.ndarray | None = ...,
 is_datetimelike: bool = ...,
 name: str = ...,
+skipna: bool = ...,
 ) -> None: ...
 def group_skew(
 out: np.ndarray, # float64_t[:, ::1]
@@ -183,6 +186,7 @@ def group_max(
 is_datetimelike: bool = ...,
 mask: np.ndarray | None = ...,
 result_mask: np.ndarray | None = ...,
+skipna: bool = ...,
 ) -> None: ...
 def group_min(
 out: np.ndarray, # groupby_t[:, ::1]
@@ -193,6 +197,7 @@ def group_min(
 is_datetimelike: bool = ...,
 mask: np.ndarray | None = ...,
 result_mask: np.ndarray | None = ...,
+skipna: bool = ...,
 ) -> None: ...
 def group_idxmin_idxmax(
 out: npt.NDArray[np.intp],