Commit cc476fb

Merge branch 'main' into daydst2
2 parents e2c6172 + 5411cc4 commit cc476fb

64 files changed, +871 -514 lines changed

ci/code_checks.sh

Lines changed: 3 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -71,8 +71,8 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
7171
-i ES01 `# For now it is ok if docstrings are missing the extended summary` \
7272
-i "pandas.Series.dt PR01" `# Accessors are implemented as classes, but we do not document the Parameters section` \
7373
-i "pandas.DataFrame.max RT03" \
74-
-i "pandas.DataFrame.mean RT03,SA01" \
75-
-i "pandas.DataFrame.median RT03,SA01" \
74+
-i "pandas.DataFrame.mean RT03" \
75+
-i "pandas.DataFrame.median RT03" \
7676
-i "pandas.DataFrame.min RT03" \
7777
-i "pandas.DataFrame.plot PR02" \
7878
-i "pandas.Grouper PR02" \
@@ -471,7 +471,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
471471
-i "pandas.plotting.andrews_curves RT03,SA01" \
472472
-i "pandas.plotting.lag_plot RT03,SA01" \
473473
-i "pandas.plotting.scatter_matrix PR07,SA01" \
474-
-i "pandas.qcut PR07,SA01" \
475474
-i "pandas.set_eng_float_format RT03,SA01" \
476475
-i "pandas.testing.assert_extension_array_equal SA01" \
477476
-i "pandas.tseries.offsets.BDay PR02,SA01" \
@@ -596,7 +595,7 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
596595
-i "pandas.tseries.offsets.Day.freqstr SA01" \
597596
-i "pandas.tseries.offsets.Day.is_on_offset GL08" \
598597
-i "pandas.tseries.offsets.Day.n GL08" \
599-
-i "pandas.tseries.offsets.Day.nanos SA01" \
598+
-i "pandas.tseries.offsets.Day.nanos GL08" \
600599
-i "pandas.tseries.offsets.Day.normalize GL08" \
601600
-i "pandas.tseries.offsets.Day.rule_code GL08" \
602601
-i "pandas.tseries.offsets.Easter PR02" \

doc/source/getting_started/intro_tutorials/09_timeseries.rst

Lines changed: 1 addition & 1 deletion
@@ -295,7 +295,7 @@ Aggregate the current hourly time series values to the monthly maximum value in
295295

296296
.. ipython:: python
297297
298-
monthly_max = no_2.resample("ME").max()
298+
monthly_max = no_2.resample("MS").max()
299299
monthly_max
300300
301301
A very powerful method on time series data with a datetime index, is the

doc/source/user_guide/10min.rst

Lines changed: 1 addition & 1 deletion
@@ -101,7 +101,7 @@ truncated for brevity.
101101
Viewing data
102102
------------
103103

104-
See the :ref:`Essentially basics functionality section <basics>`.
104+
See the :ref:`Essential basic functionality section <basics>`.
105105

106106
Use :meth:`DataFrame.head` and :meth:`DataFrame.tail` to view the top and bottom rows of the frame
107107
respectively:

doc/source/user_guide/boolean.rst

Lines changed: 13 additions & 0 deletions
@@ -37,6 +37,19 @@ If you would prefer to keep the ``NA`` values you can manually fill them with ``
3737
3838
s[mask.fillna(True)]
3939
40+
If you create a column of ``NA`` values (for example to fill them later)
41+
with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
42+
new column. The performance on this column will be worse than with
43+
the appropriate type. It's better to use
44+
``df['new_col'] = pd.Series(pd.NA, dtype="boolean")``
45+
(or another ``dtype`` that supports ``NA``).
46+
47+
.. ipython:: python
48+
49+
df = pd.DataFrame()
50+
df['objects'] = pd.NA
51+
df.dtypes
52+
4053
.. _boolean.kleene:
4154

4255
Kleene logical operations
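
The paragraph added to boolean.rst above can be illustrated with a minimal sketch. The frame index and the column names `as_object` and `as_boolean` are hypothetical, chosen only for this example; the `index=df.index` argument is an addition here so that the typed Series aligns with a non-empty frame.

```python
import pandas as pd

# Hypothetical frame; three rows so the dtypes are easy to inspect.
df = pd.DataFrame(index=range(3))

# Assigning the scalar directly: the new column ends up as object dtype.
df["as_object"] = pd.NA

# Building the column from a typed Series keeps a nullable dtype.
df["as_boolean"] = pd.Series(pd.NA, index=df.index, dtype="boolean")

print(df.dtypes)
```

Both columns hold only missing values, but only the second can later be filled without going through object dtype. The same pattern should apply with `dtype="Int64"`, as the matching paragraph in integer_na.rst suggests.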

doc/source/user_guide/integer_na.rst

Lines changed: 13 additions & 0 deletions
@@ -84,6 +84,19 @@ with the dtype.
8484
In the future, we may provide an option for :class:`Series` to infer a
8585
nullable-integer dtype.
8686

87+
If you create a column of ``NA`` values (for example to fill them later)
88+
with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
89+
new column. The performance on this column will be worse than with
90+
the appropriate type. It's better to use
91+
``df['new_col'] = pd.Series(pd.NA, dtype="Int64")``
92+
(or another ``dtype`` that supports ``NA``).
93+
94+
.. ipython:: python
95+
96+
df = pd.DataFrame()
97+
df['objects'] = pd.NA
98+
df.dtypes
99+
87100
Operations
88101
----------
89102

doc/source/user_guide/timeseries.rst

Lines changed: 2 additions & 2 deletions
@@ -1864,15 +1864,15 @@ to resample based on datetimelike column in the frame, it can passed to the
18641864
),
18651865
)
18661866
df
1867-
df.resample("ME", on="date")[["a"]].sum()
1867+
df.resample("MS", on="date")[["a"]].sum()
18681868
18691869
Similarly, if you instead want to resample by a datetimelike
18701870
level of ``MultiIndex``, its name or location can be passed to the
18711871
``level`` keyword.
18721872

18731873
.. ipython:: python
18741874
1875-
df.resample("ME", level="d")[["a"]].sum()
1875+
df.resample("MS", level="d")[["a"]].sum()
18761876
18771877
.. _timeseries.iterating-label:
18781878

doc/source/whatsnew/v3.0.0.rst

Lines changed: 33 additions & 1 deletion
@@ -42,6 +42,7 @@ Other enhancements
4242
- :meth:`DataFrame.corrwith` now accepts ``min_periods`` as optional arguments, as in :meth:`DataFrame.corr` and :meth:`Series.corr` (:issue:`9490`)
4343
- :meth:`DataFrame.cummin`, :meth:`DataFrame.cummax`, :meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods now have a ``numeric_only`` parameter (:issue:`53072`)
4444
- :meth:`DataFrame.fillna` and :meth:`Series.fillna` can now accept ``value=None``; for non-object dtype the corresponding NA value will be used (:issue:`57723`)
45+
- :meth:`DataFrame.pivot_table` and :func:`pivot_table` now allow the passing of keyword arguments to ``aggfunc`` through ``**kwargs`` (:issue:`57884`)
4546
- :meth:`Series.cummin` and :meth:`Series.cummax` now supports :class:`CategoricalDtype` (:issue:`52335`)
4647
- :meth:`Series.plot` now correctly handle the ``ylabel`` parameter for pie charts, allowing for explicit control over the y-axis label (:issue:`58239`)
4748
- Restore support for reading Stata 104-format and enable reading 103-format dta files (:issue:`58554`)
@@ -279,6 +280,34 @@ Other Deprecations
279280

280281
Removal of prior version deprecations/changes
281282
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
283+
284+
Enforced deprecation of aliases ``M``, ``Q``, ``Y``, etc. in favour of ``ME``, ``QE``, ``YE``, etc. for offsets
285+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
286+
287+
Renamed the following offset aliases (:issue:`57986`):
288+
289+
+-------------------------------+------------------+------------------+
290+
| offset | removed alias | new alias |
291+
+===============================+==================+==================+
292+
|:class:`MonthEnd` | ``M`` | ``ME`` |
293+
+-------------------------------+------------------+------------------+
294+
|:class:`BusinessMonthEnd` | ``BM`` | ``BME`` |
295+
+-------------------------------+------------------+------------------+
296+
|:class:`SemiMonthEnd` | ``SM`` | ``SME`` |
297+
+-------------------------------+------------------+------------------+
298+
|:class:`CustomBusinessMonthEnd`| ``CBM`` | ``CBME`` |
299+
+-------------------------------+------------------+------------------+
300+
|:class:`QuarterEnd` | ``Q`` | ``QE`` |
301+
+-------------------------------+------------------+------------------+
302+
|:class:`BQuarterEnd` | ``BQ`` | ``BQE`` |
303+
+-------------------------------+------------------+------------------+
304+
|:class:`YearEnd` | ``Y`` | ``YE`` |
305+
+-------------------------------+------------------+------------------+
306+
|:class:`BYearEnd` | ``BY`` | ``BYE`` |
307+
+-------------------------------+------------------+------------------+
308+
309+
Other Removals
310+
^^^^^^^^^^^^^^
282311
- :class:`.DataFrameGroupBy.idxmin`, :class:`.DataFrameGroupBy.idxmax`, :class:`.SeriesGroupBy.idxmin`, and :class:`.SeriesGroupBy.idxmax` will now raise a ``ValueError`` when used with ``skipna=False`` and an NA value is encountered (:issue:`10694`)
283312
- :func:`concat` no longer ignores empty objects when determining output dtypes (:issue:`39122`)
284313
- :func:`concat` with all-NA entries no longer ignores the dtype of those entries when determining the result dtype (:issue:`40893`)
@@ -342,7 +371,7 @@ Removal of prior version deprecations/changes
342371
- Enforced deprecation of string ``A`` denoting frequency in :class:`YearEnd` and strings ``A-DEC``, ``A-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`57699`)
343372
- Enforced deprecation of string ``BAS`` denoting frequency in :class:`BYearBegin` and strings ``BAS-DEC``, ``BAS-JAN``, etc. denoting annual frequencies with various fiscal year starts (:issue:`57793`)
344373
- Enforced deprecation of string ``BA`` denoting frequency in :class:`BYearEnd` and strings ``BA-DEC``, ``BA-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`57793`)
345-
- Enforced deprecation of strings ``T``, ``L``, ``U``, and ``N`` denoting frequencies in :class:`Minute`, :class:`Second`, :class:`Milli`, :class:`Micro`, :class:`Nano` (:issue:`57627`)
374+
- Enforced deprecation of strings ``T``, ``L``, ``U``, and ``N`` denoting frequencies in :class:`Minute`, :class:`Milli`, :class:`Micro`, :class:`Nano` (:issue:`57627`)
346375
- Enforced deprecation of strings ``T``, ``L``, ``U``, and ``N`` denoting units in :class:`Timedelta` (:issue:`57627`)
347376
- Enforced deprecation of the behavior of :func:`concat` when ``len(keys) != len(objs)`` would truncate to the shorter of the two. Now this raises a ``ValueError`` (:issue:`43485`)
348377
- Enforced deprecation of values "pad", "ffill", "bfill", and "backfill" for :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` (:issue:`57869`)
@@ -452,10 +481,12 @@ Categorical
452481

453482
Datetimelike
454483
^^^^^^^^^^^^
484+
- Bug in :attr:`is_year_start` where a DateTimeIndex constructed via a date_range with frequency 'MS' wouldn't have the correct year or quarter start attributes (:issue:`57377`)
455485
- Bug in :class:`Timestamp` constructor failing to raise when ``tz=None`` is explicitly specified in conjunction with timezone-aware ``tzinfo`` or data (:issue:`48688`)
456486
- Bug in :func:`date_range` where the last valid timestamp would sometimes not be produced (:issue:`56134`)
457487
- Bug in :func:`date_range` where using a negative frequency value would not include all points between the start and end values (:issue:`56382`)
458488
- Bug in :func:`tseries.api.guess_datetime_format` would fail to infer time format when "%Y" == "%H%M" (:issue:`57452`)
489+
- Bug in :meth:`Dataframe.agg` with df with missing values resulting in IndexError (:issue:`58810`)
459490
- Bug in :meth:`DatetimeIndex.is_year_start` and :meth:`DatetimeIndex.is_quarter_start` does not raise on Custom business days frequencies bigger then "1C" (:issue:`58664`)
460491
- Bug in :meth:`DatetimeIndex.is_year_start` and :meth:`DatetimeIndex.is_quarter_start` returning ``False`` on double-digit frequencies (:issue:`58523`)
461492
- Bug in setting scalar values with mismatched resolution into arrays with non-nanosecond ``datetime64``, ``timedelta64`` or :class:`DatetimeTZDtype` incorrectly truncating those scalars (:issue:`56410`)
@@ -515,6 +546,7 @@ I/O
515546
- Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
516547
- Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
517548
- Bug in :meth:`read_csv` raising ``TypeError`` when ``index_col`` is specified and ``na_values`` is a dict containing the key ``None``. (:issue:`57547`)
549+
- Bug in :meth:`read_stata` raising ``KeyError`` when input file is stored in big-endian format and contains strL data. (:issue:`58638`)
518550

519551
Period
520552
^^^^^^

pandas/_libs/tslibs/dtypes.pxd

Lines changed: 3 additions & 2 deletions
@@ -12,9 +12,10 @@ cdef NPY_DATETIMEUNIT get_supported_reso(NPY_DATETIMEUNIT reso)
1212
cdef bint is_supported_unit(NPY_DATETIMEUNIT reso)
1313

1414
cdef dict c_OFFSET_TO_PERIOD_FREQSTR
15-
cdef dict c_OFFSET_DEPR_FREQSTR
16-
cdef dict c_REVERSE_OFFSET_DEPR_FREQSTR
15+
cdef dict c_PERIOD_TO_OFFSET_FREQSTR
16+
cdef dict c_OFFSET_RENAMED_FREQSTR
1717
cdef dict c_DEPR_ABBREVS
18+
cdef dict c_PERIOD_AND_OFFSET_DEPR_FREQSTR
1819
cdef dict attrname_to_abbrevs
1920
cdef dict npy_unit_to_attrname
2021
cdef dict attrname_to_npy_unit

pandas/_libs/tslibs/dtypes.pyx

Lines changed: 40 additions & 5 deletions
@@ -176,6 +176,10 @@ OFFSET_TO_PERIOD_FREQSTR: dict = {
176176
"EOM": "M",
177177
"BME": "M",
178178
"SME": "M",
179+
"BMS": "M",
180+
"CBME": "M",
181+
"CBMS": "M",
182+
"SMS": "M",
179183
"BQS": "Q",
180184
"QS": "Q",
181185
"BQE": "Q",
@@ -228,7 +232,6 @@ OFFSET_TO_PERIOD_FREQSTR: dict = {
228232
"YE-NOV": "Y-NOV",
229233
"W": "W",
230234
"ME": "M",
231-
"Y": "Y",
232235
"BYE": "Y",
233236
"BYE-DEC": "Y-DEC",
234237
"BYE-JAN": "Y-JAN",
@@ -245,7 +248,7 @@ OFFSET_TO_PERIOD_FREQSTR: dict = {
245248
"YS": "Y",
246249
"BYS": "Y",
247250
}
248-
cdef dict c_OFFSET_DEPR_FREQSTR = {
251+
cdef dict c_OFFSET_RENAMED_FREQSTR = {
249252
"M": "ME",
250253
"Q": "QE",
251254
"Q-DEC": "QE-DEC",
@@ -303,10 +306,37 @@ cdef dict c_OFFSET_DEPR_FREQSTR = {
303306
"BQ-OCT": "BQE-OCT",
304307
"BQ-NOV": "BQE-NOV",
305308
}
306-
cdef dict c_OFFSET_TO_PERIOD_FREQSTR = OFFSET_TO_PERIOD_FREQSTR
307-
cdef dict c_REVERSE_OFFSET_DEPR_FREQSTR = {
308-
v: k for k, v in c_OFFSET_DEPR_FREQSTR.items()
309+
PERIOD_TO_OFFSET_FREQSTR = {
310+
"M": "ME",
311+
"Q": "QE",
312+
"Q-DEC": "QE-DEC",
313+
"Q-JAN": "QE-JAN",
314+
"Q-FEB": "QE-FEB",
315+
"Q-MAR": "QE-MAR",
316+
"Q-APR": "QE-APR",
317+
"Q-MAY": "QE-MAY",
318+
"Q-JUN": "QE-JUN",
319+
"Q-JUL": "QE-JUL",
320+
"Q-AUG": "QE-AUG",
321+
"Q-SEP": "QE-SEP",
322+
"Q-OCT": "QE-OCT",
323+
"Q-NOV": "QE-NOV",
324+
"Y": "YE",
325+
"Y-DEC": "YE-DEC",
326+
"Y-JAN": "YE-JAN",
327+
"Y-FEB": "YE-FEB",
328+
"Y-MAR": "YE-MAR",
329+
"Y-APR": "YE-APR",
330+
"Y-MAY": "YE-MAY",
331+
"Y-JUN": "YE-JUN",
332+
"Y-JUL": "YE-JUL",
333+
"Y-AUG": "YE-AUG",
334+
"Y-SEP": "YE-SEP",
335+
"Y-OCT": "YE-OCT",
336+
"Y-NOV": "YE-NOV",
309337
}
338+
cdef dict c_OFFSET_TO_PERIOD_FREQSTR = OFFSET_TO_PERIOD_FREQSTR
339+
cdef dict c_PERIOD_TO_OFFSET_FREQSTR = PERIOD_TO_OFFSET_FREQSTR
310340

311341
# Map deprecated resolution abbreviations to correct resolution abbreviations
312342
cdef dict c_DEPR_ABBREVS = {
@@ -316,6 +346,11 @@ cdef dict c_DEPR_ABBREVS = {
316346
"S": "s",
317347
}
318348

349+
cdef dict c_PERIOD_AND_OFFSET_DEPR_FREQSTR = {
350+
"w": "W",
351+
"MIN": "min",
352+
}
353+
319354

320355
class FreqGroup(Enum):
321356
# Mirrors c_FreqGroup in the .pxd file
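
One property of the tables in this file is worth a small sketch: `OFFSET_TO_PERIOD_FREQSTR` maps several offset aliases to the same period alias, so it cannot be inverted mechanically. The snippet below uses only an excerpt of the entries visible in the hunks above, in plain Python dictionaries with hypothetical lowercase names. Whether this is the reason the commit spells out `PERIOD_TO_OFFSET_FREQSTR` explicitly is an assumption, not something the diff states.

```python
# Excerpt of the offset -> period entries shown in the diff (not the full table).
offset_to_period = {
    "ME": "M",
    "BME": "M",
    "SME": "M",
    "BMS": "M",
    "CBME": "M",
    "CBMS": "M",
    "SMS": "M",
}

# Naive inversion: every key collapses onto "M" and the last one wins.
naive_reverse = {period: offset for offset, period in offset_to_period.items()}
print(naive_reverse)  # {'M': 'SMS'}

# An explicit reverse table (excerpt) states the intended target instead.
period_to_offset = {"M": "ME", "Q": "QE", "Y": "YE"}
print(period_to_offset["M"])  # ME
```

An explicit table trades a few dozen extra lines for a reverse lookup that does not depend on dictionary insertion order.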

pandas/_libs/tslibs/fields.pyx

Lines changed: 2 additions & 1 deletion
@@ -253,9 +253,10 @@ def get_start_end_field(
253253
# month of year. Other offsets use month, startingMonth as ending
254254
# month of year.
255255

256-
if freq_name.lstrip("B")[0:2] in ["MS", "QS", "YS"]:
256+
if freq_name.lstrip("B")[0:2] in ["QS", "YS"]:
257257
end_month = 12 if month_kw == 1 else month_kw - 1
258258
start_month = month_kw
259+
259260
else:
260261
end_month = month_kw
261262
start_month = (end_month % 12) + 1
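
The branch changed in this hunk can be read in isolation as a small pure-Python sketch. The function name `start_end_months` and its return shape are hypothetical; only the condition and the arithmetic are taken from the new side of the diff, and the rest of `get_start_end_field` is not reproduced.

```python
def start_end_months(freq_name: str, month_kw: int) -> tuple[int, int]:
    """Return (start_month, end_month) using the branch from the new side of the diff."""
    # Leading "B" characters are stripped, so "BQS" and "BYS" are treated like "QS" and "YS".
    if freq_name.lstrip("B")[0:2] in ["QS", "YS"]:
        # Start-anchored frequency: the period ends the month before it starts.
        end_month = 12 if month_kw == 1 else month_kw - 1
        start_month = month_kw
    else:
        # Every other frequency: month_kw is the ending month.
        end_month = month_kw
        start_month = (end_month % 12) + 1
    return start_month, end_month


print(start_end_months("QS", 1))   # (1, 12)
print(start_end_months("BYS", 4))  # (4, 3)
print(start_end_months("MS", 12))  # (1, 12)
```

With "MS" removed from the list, a month-start frequency now takes the `else` path, so `month_kw` is interpreted as the ending month rather than the starting month.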
