Commit 4d5e72f
Merge branch 'pandas-dev:main' into bug
2 parents: d24da1b + ee0902a

21 files changed: +276 additions, -42 deletions

ci/code_checks.sh

Lines changed: 0 additions & 7 deletions
@@ -73,7 +73,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.Period.freq GL08" \
         -i "pandas.Period.ordinal GL08" \
         -i "pandas.RangeIndex.from_range PR01,SA01" \
-        -i "pandas.Series.dt.freq GL08" \
         -i "pandas.Series.dt.unit GL08" \
         -i "pandas.Series.pad PR01,SA01" \
         -i "pandas.Timedelta.max PR02" \
@@ -92,15 +91,11 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.core.groupby.DataFrameGroupBy.boxplot PR07,RT03,SA01" \
         -i "pandas.core.groupby.DataFrameGroupBy.get_group RT03,SA01" \
         -i "pandas.core.groupby.DataFrameGroupBy.indices SA01" \
-        -i "pandas.core.groupby.DataFrameGroupBy.nth PR02" \
         -i "pandas.core.groupby.DataFrameGroupBy.nunique SA01" \
         -i "pandas.core.groupby.DataFrameGroupBy.plot PR02" \
         -i "pandas.core.groupby.DataFrameGroupBy.sem SA01" \
         -i "pandas.core.groupby.SeriesGroupBy.get_group RT03,SA01" \
         -i "pandas.core.groupby.SeriesGroupBy.indices SA01" \
-        -i "pandas.core.groupby.SeriesGroupBy.is_monotonic_decreasing SA01" \
-        -i "pandas.core.groupby.SeriesGroupBy.is_monotonic_increasing SA01" \
-        -i "pandas.core.groupby.SeriesGroupBy.nth PR02" \
         -i "pandas.core.groupby.SeriesGroupBy.plot PR02" \
         -i "pandas.core.groupby.SeriesGroupBy.sem SA01" \
         -i "pandas.core.resample.Resampler.get_group RT03,SA01" \
@@ -114,8 +109,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.core.resample.Resampler.std SA01" \
         -i "pandas.core.resample.Resampler.transform PR01,RT03,SA01" \
         -i "pandas.core.resample.Resampler.var SA01" \
-        -i "pandas.errors.AttributeConflictWarning SA01" \
-        -i "pandas.errors.ChainedAssignmentError SA01" \
         -i "pandas.errors.DuplicateLabelError SA01" \
         -i "pandas.errors.IntCastingNaNError SA01" \
         -i "pandas.errors.InvalidIndexError SA01" \

doc/source/whatsnew/v3.0.0.rst

Lines changed: 4 additions & 1 deletion
@@ -54,6 +54,7 @@ Other enhancements
 - :meth:`Series.cummin` and :meth:`Series.cummax` now supports :class:`CategoricalDtype` (:issue:`52335`)
 - :meth:`Series.plot` now correctly handle the ``ylabel`` parameter for pie charts, allowing for explicit control over the y-axis label (:issue:`58239`)
 - :meth:`DataFrame.plot.scatter` argument ``c`` now accepts a column of strings, where rows with the same string are colored identically (:issue:`16827` and :issue:`16485`)
+- :func:`read_parquet` accepts ``to_pandas_kwargs`` which are forwarded to :meth:`pyarrow.Table.to_pandas` which enables passing additional keywords to customize the conversion to pandas, such as ``maps_as_pydicts`` to read the Parquet map data type as python dictionaries (:issue:`56842`)
 - :meth:`DataFrameGroupBy.transform`, :meth:`SeriesGroupBy.transform`, :meth:`DataFrameGroupBy.agg`, :meth:`SeriesGroupBy.agg`, :meth:`RollingGroupby.apply`, :meth:`ExpandingGroupby.apply`, :meth:`Rolling.apply`, :meth:`Expanding.apply`, :meth:`DataFrame.apply` with ``engine="numba"`` now supports positional arguments passed as kwargs (:issue:`58995`)
 - :meth:`Series.map` can now accept kwargs to pass on to func (:issue:`59814`)
 - :meth:`pandas.concat` will raise a ``ValueError`` when ``ignore_index=True`` and ``keys`` is not ``None`` (:issue:`59274`)
@@ -626,6 +627,7 @@ Datetimelike
 - Bug in :meth:`Series.dt.microsecond` producing incorrect results for pyarrow backed :class:`Series`. (:issue:`59154`)
 - Bug in :meth:`to_datetime` not respecting dayfirst if an uncommon date string was passed. (:issue:`58859`)
 - Bug in :meth:`to_datetime` reports incorrect index in case of any failure scenario. (:issue:`58298`)
+- Bug in :meth:`to_datetime` wrongly converts when ``arg`` is a ``np.datetime64`` object with unit of ``ps``. (:issue:`60341`)
 - Bug in setting scalar values with mismatched resolution into arrays with non-nanosecond ``datetime64``, ``timedelta64`` or :class:`DatetimeTZDtype` incorrectly truncating those scalars (:issue:`56410`)
 
 Timedelta
@@ -688,6 +690,7 @@ I/O
 - Bug in :meth:`DataFrame.from_records` where ``columns`` parameter with numpy structured array was not reordering and filtering out the columns (:issue:`59717`)
 - Bug in :meth:`DataFrame.to_dict` raises unnecessary ``UserWarning`` when columns are not unique and ``orient='tight'``. (:issue:`58281`)
 - Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
+- Bug in :meth:`DataFrame.to_excel` where the :class:`MultiIndex` index with a period level was not a date (:issue:`60099`)
 - Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)
 - Bug in :meth:`DataFrame.to_stata` when writing more than 32,000 value labels. (:issue:`60107`)
 - Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
@@ -762,7 +765,7 @@ ExtensionArray
 
 Styler
 ^^^^^^
--
+- Bug in :meth:`Styler.to_latex` where styling column headers when combined with a hidden index or hidden index-levels is fixed.
 
 Other
 ^^^^^

pandas/_libs/src/vendored/numpy/datetime/np_datetime.c

Lines changed: 6 additions & 5 deletions
@@ -660,11 +660,12 @@ void pandas_datetime_to_datetimestruct(npy_datetime dt, NPY_DATETIMEUNIT base,
     perday = 24LL * 60 * 60 * 1000 * 1000 * 1000 * 1000;

    set_datetimestruct_days(extract_unit(&dt, perday), out);
-    out->hour = (npy_int32)extract_unit(&dt, 1000LL * 1000 * 1000 * 60 * 60);
-    out->min = (npy_int32)extract_unit(&dt, 1000LL * 1000 * 1000 * 60);
-    out->sec = (npy_int32)extract_unit(&dt, 1000LL * 1000 * 1000);
-    out->us = (npy_int32)extract_unit(&dt, 1000LL);
-    out->ps = (npy_int32)(dt * 1000);
+    out->hour =
+        (npy_int32)extract_unit(&dt, 1000LL * 1000 * 1000 * 1000 * 60 * 60);
+    out->min = (npy_int32)extract_unit(&dt, 1000LL * 1000 * 1000 * 1000 * 60);
+    out->sec = (npy_int32)extract_unit(&dt, 1000LL * 1000 * 1000 * 1000);
+    out->us = (npy_int32)extract_unit(&dt, 1000LL * 1000);
+    out->ps = (npy_int32)(dt);
     break;

   case NPY_FR_fs:
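The fix above switches the divisors in the picosecond (``NPY_FR_ps``) branch from nanosecond-scale to picosecond-scale units, which is what caused the wrong ``to_datetime`` conversions in GH 60341. A pure-Python sketch of the corrected arithmetic (names here are illustrative; the C ``extract_unit`` helper also handles negative inputs, which this sketch ignores):

```python
# Picosecond-count -> datetime-struct fields, mirroring the corrected C branch.
PS_PER_DAY = 24 * 60 * 60 * 10**12  # picoseconds in a day

def ps_to_struct(dt: int) -> dict:
    days, dt = divmod(dt, PS_PER_DAY)           # set_datetimestruct_days(...)
    hour, dt = divmod(dt, 10**12 * 60 * 60)     # ps per hour
    minute, dt = divmod(dt, 10**12 * 60)        # ps per minute
    sec, dt = divmod(dt, 10**12)                # ps per second
    us, ps = divmod(dt, 10**6)                  # ps per microsecond
    return {"days": days, "hour": hour, "min": minute,
            "sec": sec, "us": us, "ps": ps}
```

The buggy version divided by nanosecond-sized constants (e.g. ``10**9 * 60 * 60`` for the hour field), so every field after the day count came out a factor of 1000 off.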

pandas/core/frame.py

Lines changed: 2 additions & 1 deletion
@@ -4742,7 +4742,8 @@ def eval(self, expr: str, *, inplace: bool = False, **kwargs) -> Any | None:
         3   4    4   7   8  0
         4   5    2   6   7  3
 
-        For columns with spaces in their name, you can use backtick quoting.
+        For columns with spaces or other disallowed characters in their name, you can
+        use backtick quoting.
 
         >>> df.eval("B * `C&C`")
         0    100
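The backtick quoting described by the updated docstring can be exercised directly. A column name with a space works on current pandas releases; support for further disallowed characters (like the ``&`` in the docstring's ``C&C``) is what this change documents:

```python
import pandas as pd

# Backtick-quote a column whose name is not a valid Python identifier.
df = pd.DataFrame({"B": [10, 2], "C C": [3, 4]})
result = df.eval("B * `C C`")
print(result.tolist())  # [30, 8]
```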

pandas/core/groupby/generic.py

Lines changed: 10 additions & 0 deletions
@@ -1443,6 +1443,11 @@ def is_monotonic_increasing(self) -> Series:
         -------
         Series
 
+        See Also
+        --------
+        SeriesGroupBy.is_monotonic_decreasing : Return whether each group's values
+            are monotonically decreasing.
+
         Examples
         --------
         >>> s = pd.Series([2, 1, 3, 4], index=["Falcon", "Falcon", "Parrot", "Parrot"])
@@ -1462,6 +1467,11 @@ def is_monotonic_decreasing(self) -> Series:
         -------
         Series
 
+        See Also
+        --------
+        SeriesGroupBy.is_monotonic_increasing : Return whether each group's values
+            are monotonically increasing.
+
         Examples
         --------
         >>> s = pd.Series([2, 1, 3, 4], index=["Falcon", "Falcon", "Parrot", "Parrot"])
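The two properties cross-referenced by the new See Also entries are existing ``SeriesGroupBy`` API; the docstrings' own example runs as-is:

```python
import pandas as pd

# Per-group monotonicity: Falcon holds [2, 1], Parrot holds [3, 4].
s = pd.Series([2, 1, 3, 4], index=["Falcon", "Falcon", "Parrot", "Parrot"])
inc = s.groupby(level=0).is_monotonic_increasing
dec = s.groupby(level=0).is_monotonic_decreasing
print(inc.tolist())  # [False, True]
print(dec.tolist())  # [True, False]
```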

pandas/core/groupby/groupby.py

Lines changed: 0 additions & 13 deletions
@@ -3983,19 +3983,6 @@ def nth(self) -> GroupByNthSelector:
         'all' or 'any'; this is equivalent to calling dropna(how=dropna)
         before the groupby.
 
-        Parameters
-        ----------
-        n : int, slice or list of ints and slices
-            A single nth value for the row or a list of nth values or slices.
-
-            .. versionchanged:: 1.4.0
-                Added slice and lists containing slices.
-                Added index notation.
-
-        dropna : {'any', 'all', None}, default None
-            Apply the specified dropna operation before counting which row is
-            the nth row. Only supported if n is an int.
-
         Returns
         -------
         Series or DataFrame
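The removed Parameters block described behavior that still applies: since pandas 2.0, ``nth`` acts as a filtration, returning the selected rows with their original index, and the index notation added in 1.4.0 accepts slices. For example:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 2, 1, 2], "B": [1, 2, 3, 4, 5]})

# Take the second row (n=1) of each group; original index is preserved.
second = df.groupby("A").nth(1)
print(second["B"].tolist())  # [2, 5]

# Index notation with a slice: the first two rows of each group.
first_two = df.groupby("A").nth[:2]
print(first_two["B"].tolist())  # [1, 2, 3, 5]
```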

pandas/core/indexes/accessors.py

Lines changed: 22 additions & 0 deletions
@@ -373,6 +373,28 @@ def to_pydatetime(self) -> Series:
 
     @property
     def freq(self):
+        """
+        Tries to return a string representing a frequency generated by infer_freq.
+
+        Returns None if it can't autodetect the frequency.
+
+        See Also
+        --------
+        Series.dt.to_period : Cast to PeriodArray/PeriodIndex at a particular
+            frequency.
+
+        Examples
+        --------
+        >>> ser = pd.Series(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
+        >>> ser = pd.to_datetime(ser)
+        >>> ser.dt.freq
+        'D'
+
+        >>> ser = pd.Series(["2022-01-01", "2024-01-01", "2026-01-01", "2028-01-01"])
+        >>> ser = pd.to_datetime(ser)
+        >>> ser.dt.freq
+        '2YS-JAN'
+        """
         return self._get_values().inferred_freq
 
     def isocalendar(self) -> DataFrame:
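Since ``freq`` simply returns ``inferred_freq``, inference needs at least three regularly spaced points and falls back to ``None`` otherwise, as a quick check shows:

```python
import pandas as pd

# A regular daily series infers 'D'; an irregular one infers nothing.
ser = pd.to_datetime(pd.Series(["2024-01-01", "2024-01-02", "2024-01-03"]))
print(ser.dt.freq)  # 'D'

irregular = pd.to_datetime(pd.Series(["2024-01-01", "2024-01-05", "2024-02-01"]))
print(irregular.dt.freq)  # None
```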

pandas/core/series.py

Lines changed: 4 additions & 4 deletions
@@ -567,7 +567,7 @@ def __arrow_c_stream__(self, requested_schema=None):
         Export the pandas Series as an Arrow C stream PyCapsule.
 
         This relies on pyarrow to convert the pandas Series to the Arrow
-        format (and follows the default behaviour of ``pyarrow.Array.from_pandas``
+        format (and follows the default behavior of ``pyarrow.Array.from_pandas``
         in its handling of the index, i.e. to ignore it).
         This conversion is not necessarily zero-copy.
@@ -2226,7 +2226,7 @@ def drop_duplicates(
         5     hippo
         Name: animal, dtype: object
 
-        With the 'keep' parameter, the selection behaviour of duplicated values
+        With the 'keep' parameter, the selection behavior of duplicated values
         can be changed. The value 'first' keeps the first occurrence for each
         set of duplicated entries. The default value of keep is 'first'.
@@ -3451,7 +3451,7 @@ def sort_values(
         4    5.0
         dtype: float64
 
-        Sort values ascending order (default behaviour)
+        Sort values ascending order (default behavior)
 
         >>> s.sort_values(ascending=True)
         1    1.0
@@ -4098,7 +4098,7 @@ def swaplevel(
 
         In the following example, we will swap the levels of the indices.
         Here, we will swap the levels column-wise, but levels can be swapped row-wise
-        in a similar manner. Note that column-wise is the default behaviour.
+        in a similar manner. Note that column-wise is the default behavior.
         By not supplying any arguments for i and j, we swap the last and second to
        last indices.
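These hunks are pure British-to-American spelling fixes in docstrings. The ``keep`` behavior the ``drop_duplicates`` docstring describes is the existing one:

```python
import pandas as pd

# 'keep' selects which duplicate survives: the first occurrence (default)
# or the last.
s = pd.Series(["cow", "lama", "lama", "beetle", "lama", "hippo"], name="animal")
first = s.drop_duplicates()             # keep="first" is the default
last = s.drop_duplicates(keep="last")
print(first.tolist())  # ['cow', 'lama', 'beetle', 'hippo']
print(last.tolist())   # ['cow', 'beetle', 'lama', 'hippo']
```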

pandas/errors/__init__.py

Lines changed: 11 additions & 0 deletions
@@ -487,6 +487,11 @@ class ChainedAssignmentError(Warning):
     For more information on Copy-on-Write,
     see :ref:`the user guide<copy_on_write>`.
 
+    See Also
+    --------
+    options.mode.copy_on_write : Global setting for enabling or disabling
+        Copy-on-Write behavior.
+
     Examples
     --------
     >>> pd.options.mode.copy_on_write = True
@@ -672,6 +677,12 @@ class AttributeConflictWarning(Warning):
     name than the existing index on an HDFStore or attempting to append an index with a
     different frequency than the existing index on an HDFStore.
 
+    See Also
+    --------
+    HDFStore : Dict-like IO interface for storing pandas objects in PyTables.
+    DataFrame.to_hdf : Write the contained data to an HDF5 file using HDFStore.
+    read_hdf : Read from an HDF5 file into a DataFrame.
+
     Examples
     --------
     >>> idx1 = pd.Index(["a", "b"], name="name1")
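``ChainedAssignmentError`` is a ``Warning`` subclass raised under Copy-on-Write when a chained assignment cannot take effect. A defensive sketch that works whether or not Copy-on-Write is active (on older non-CoW pandas the chained write simply succeeds; under CoW the warning is escalated, caught, and the supported ``.loc`` spelling used instead):

```python
import warnings
import pandas as pd
from pandas.errors import ChainedAssignmentError

df = pd.DataFrame({"a": [1, 2, 3]})
with warnings.catch_warnings():
    # Escalate only this warning category so we can branch on it.
    warnings.simplefilter("error", ChainedAssignmentError)
    try:
        df["a"][0] = 10  # chained assignment; a no-op under Copy-on-Write
    except ChainedAssignmentError:
        df.loc[0, "a"] = 10  # the supported, un-chained spelling
```

Either path leaves ``df.loc[0, "a"]`` equal to 10.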

pandas/io/_util.py

Lines changed: 4 additions & 1 deletion
@@ -60,9 +60,12 @@ def arrow_table_to_pandas(
     table: pyarrow.Table,
     dtype_backend: DtypeBackend | Literal["numpy"] | lib.NoDefault = lib.no_default,
     null_to_int64: bool = False,
+    to_pandas_kwargs: dict | None = None,
 ) -> pd.DataFrame:
     pa = import_optional_dependency("pyarrow")
 
+    to_pandas_kwargs = {} if to_pandas_kwargs is None else to_pandas_kwargs
+
     types_mapper: type[pd.ArrowDtype] | None | Callable
     if dtype_backend == "numpy_nullable":
         mapping = _arrow_dtype_mapping()
@@ -80,5 +83,5 @@ def arrow_table_to_pandas(
     else:
         raise NotImplementedError
 
-    df = table.to_pandas(types_mapper=types_mapper)
+    df = table.to_pandas(types_mapper=types_mapper, **to_pandas_kwargs)
     return df
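The pattern this diff adds — normalize an optional kwargs dict, then splat it into ``to_pandas`` — can be sketched with a stub standing in for ``pyarrow.Table`` (``StubTable`` and the trimmed signature below are illustrative, not pandas API):

```python
# Stub mimicking pyarrow.Table.to_pandas: it just records what it received.
class StubTable:
    def to_pandas(self, types_mapper=None, **kwargs):
        return {"types_mapper": types_mapper, **kwargs}

def arrow_table_to_pandas(table, to_pandas_kwargs=None):
    # Normalize the optional dict so callers may omit it entirely.
    to_pandas_kwargs = {} if to_pandas_kwargs is None else to_pandas_kwargs
    return table.to_pandas(types_mapper=None, **to_pandas_kwargs)

result = arrow_table_to_pandas(StubTable(), {"maps_as_pydicts": "strict"})
print(result)  # {'types_mapper': None, 'maps_as_pydicts': 'strict'}
```

Mutable-default pitfalls are avoided by defaulting to ``None`` and normalizing inside the function, matching the real change.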
