Commit 1bd1830

authored
API: offsets.Day is always calendar-day (#61985)
1 parent 712b2f6 commit 1bd1830

30 files changed

+350
-93
lines changed

ci/code_checks.sh

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -163,6 +163,7 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
163163
-i "pandas.tseries.offsets.DateOffset.is_on_offset GL08" \
164164
-i "pandas.tseries.offsets.DateOffset.n GL08" \
165165
-i "pandas.tseries.offsets.DateOffset.normalize GL08" \
166+
-i "pandas.tseries.offsets.Day.freqstr SA01" \
166167
-i "pandas.tseries.offsets.Day.is_on_offset GL08" \
167168
-i "pandas.tseries.offsets.Day.n GL08" \
168169
-i "pandas.tseries.offsets.Day.normalize GL08" \

doc/source/user_guide/timedeltas.rst

Lines changed: 2 additions & 2 deletions
@@ -53,7 +53,7 @@ You can construct a ``Timedelta`` scalar through various arguments, including `I
5353
pd.Timedelta("P0DT0H1M0S")
5454
pd.Timedelta("P0DT0H0M0.000000123S")
5555
56-
:ref:`DateOffsets<timeseries.offsets>` (``Day, Hour, Minute, Second, Milli, Micro, Nano``) can also be used in construction.
56+
:ref:`DateOffsets<timeseries.offsets>` (``Hour, Minute, Second, Milli, Micro, Nano``) can also be used in construction.
5757

5858
.. ipython:: python
5959
@@ -63,7 +63,7 @@ Further, operations among the scalars yield another scalar ``Timedelta``.
6363

6464
.. ipython:: python
6565
66-
pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta(
66+
pd.Timedelta(pd.offsets.Hour(48)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta(
6767
"00:00:00.000123"
6868
)
6969

doc/source/user_guide/timeseries.rst

Lines changed: 2 additions & 2 deletions
@@ -903,7 +903,7 @@ into ``freq`` keyword arguments. The available date offsets and associated frequ
903903
:class:`~pandas.tseries.offsets.Easter`, None, "Easter holiday"
904904
:class:`~pandas.tseries.offsets.BusinessHour`, ``'bh'``, "business hour"
905905
:class:`~pandas.tseries.offsets.CustomBusinessHour`, ``'cbh'``, "custom business hour"
906-
:class:`~pandas.tseries.offsets.Day`, ``'D'``, "one absolute day"
906+
:class:`~pandas.tseries.offsets.Day`, ``'D'``, "one calendar day"
907907
:class:`~pandas.tseries.offsets.Hour`, ``'h'``, "one hour"
908908
:class:`~pandas.tseries.offsets.Minute`, ``'min'``,"one minute"
909909
:class:`~pandas.tseries.offsets.Second`, ``'s'``, "one second"
@@ -1000,7 +1000,7 @@ apply the offset to each element.
10001000
s + pd.DateOffset(months=2)
10011001
s - pd.DateOffset(months=2)
10021002
1003-
If the offset class maps directly to a ``Timedelta`` (``Day``, ``Hour``,
1003+
If the offset class maps directly to a ``Timedelta`` (``Hour``,
10041004
``Minute``, ``Second``, ``Micro``, ``Milli``, ``Nano``) it can be
10051005
used exactly like a ``Timedelta`` - see the
10061006
:ref:`Timedelta section<timedeltas.operations>` for more examples.

doc/source/whatsnew/v3.0.0.rst

Lines changed: 35 additions & 0 deletions
@@ -300,6 +300,41 @@ This change also applies to :meth:`.DataFrameGroupBy.value_counts`. Here, there
300300
301301
df.groupby("a", sort=True).value_counts(sort=False)
302302
303+
.. _whatsnew_300.api_breaking.offsets_day_not_a_tick:
304+
305+
Changed behavior of ``pd.offsets.Day`` to always represent calendar-day
306+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
307+
308+
In previous versions of pandas, :class:`offsets.Day` represented a fixed span
309+
of 24 hours, disregarding Daylight Saving Time transitions. It now consistently
310+
behaves as a calendar-day, preserving time-of-day across DST transitions:
311+
312+
*Old behavior*
313+
314+
.. code-block:: ipython
315+
316+
In [5]: ts = pd.Timestamp("2025-03-08 08:00", tz="US/Eastern")
317+
In [6]: ts + pd.offsets.Day(1)
318+
Out[6]: Timestamp('2025-03-09 09:00:00-0400', tz='US/Eastern')
319+
320+
*New behavior*
321+
322+
.. ipython:: python
323+
324+
ts = pd.Timestamp("2025-03-08 08:00", tz="US/Eastern")
325+
ts + pd.offsets.Day(1)
326+
327+
This change fixes a long-standing bug in :func:`date_range` (:issue:`51716`, :issue:`35388`), but causes several
328+
small behavior differences as collateral:
329+
330+
- ``pd.offsets.Day(n)`` no longer compares as equal to ``pd.offsets.Hour(24*n)``
331+
- :class:`offsets.Day` no longer supports division
332+
- :class:`Timedelta` no longer accepts :class:`Day` objects as inputs
333+
- :meth:`tseries.frequencies.to_offset` on a :class:`Timedelta` object returns an :class:`offsets.Hour` object in cases where it used to return a :class:`Day` object.
334+
- Adding or subtracting a scalar from a timezone-aware :class:`DatetimeIndex` with a :class:`Day` ``freq`` no longer preserves that ``freq`` attribute.
335+
- Adding or subtracting a :class:`Day` with a :class:`Timedelta` is no longer supported.
336+
- Adding or subtracting a :class:`Day` offset to a timezone-aware :class:`Timestamp` or datetime-like may lead to an ambiguous or non-existent time, which will raise.
337+
303338
.. _whatsnew_300.api_breaking.deps:
304339

305340
Increased minimum version for Python
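
The difference between a fixed 24-hour span and a calendar day in the whatsnew entry above can be reproduced with the standard library alone. The following is a minimal sketch rather than pandas code: it assumes two hard-coded fixed offsets (`EST`, `EDT`) as stand-ins for US/Eastern around the 2025-03-09 spring-forward transition, and the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for US/Eastern around the 2025-03-09
# spring-forward transition (EST = UTC-5 before it, EDT = UTC-4 after it).
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

ts = datetime(2025, 3, 8, 8, 0, tzinfo=EST)

# Old Day semantics: a fixed 24-hour span measured on the absolute timeline.
absolute = (ts + timedelta(hours=24)).astimezone(EDT)

# New Day semantics: advance the calendar date and keep the wall-clock time.
calendar = datetime.combine(ts.date() + timedelta(days=1), ts.time(), tzinfo=EDT)

print(absolute.isoformat())  # 2025-03-09T09:00:00-04:00
print(calendar.isoformat())  # 2025-03-09T08:00:00-04:00
print(calendar - ts)         # 23:00:00
```

The calendar-day result lands only 23 elapsed hours after the start, which is why a calendar `Day` can no longer be treated as equal to `Hour(24)`.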

pandas/_libs/tslibs/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,6 @@
11
__all__ = [
22
"BaseOffset",
3+
"Day",
34
"IncompatibleFrequency",
45
"NaT",
56
"NaTType",
@@ -61,6 +62,7 @@
6162
)
6263
from pandas._libs.tslibs.offsets import (
6364
BaseOffset,
65+
Day,
6466
Tick,
6567
to_offset,
6668
)

pandas/_libs/tslibs/offsets.pyi

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ class Tick(SingleConstructorOffset):
116116

117117
def delta_to_tick(delta: timedelta) -> Tick: ...
118118

119-
class Day(Tick): ...
119+
class Day(BaseOffset): ...
120120
class Hour(Tick): ...
121121
class Minute(Tick): ...
122122
class Second(Tick): ...

pandas/_libs/tslibs/offsets.pyx

Lines changed: 80 additions & 12 deletions
@@ -1023,8 +1023,6 @@ cdef class Tick(SingleConstructorOffset):
10231023
# Note: Without making this cpdef, we get AttributeError when calling
10241024
# from __mul__
10251025
cpdef Tick _next_higher_resolution(Tick self):
1026-
if type(self) is Day:
1027-
return Hour(self.n * 24)
10281026
if type(self) is Hour:
10291027
return Minute(self.n * 60)
10301028
if type(self) is Minute:
@@ -1173,7 +1171,7 @@ cdef class Tick(SingleConstructorOffset):
11731171
self.normalize = False
11741172

11751173

1176-
cdef class Day(Tick):
1174+
cdef class Day(SingleConstructorOffset):
11771175
"""
11781176
Offset ``n`` days.
11791177
@@ -1203,11 +1201,73 @@ cdef class Day(Tick):
12031201
>>> ts + Day(-4)
12041202
Timestamp('2022-12-05 15:00:00')
12051203
"""
1204+
_adjust_dst = True
1205+
_attributes = tuple(["n", "normalize"])
12061206
_nanos_inc = 24 * 3600 * 1_000_000_000
12071207
_prefix = "D"
12081208
_period_dtype_code = PeriodDtypeCode.D
12091209
_creso = NPY_DATETIMEUNIT.NPY_FR_D
12101210

1211+
def __init__(self, n=1, normalize=False):
1212+
BaseOffset.__init__(self, n)
1213+
if normalize:
1214+
# GH#21427
1215+
raise ValueError(
1216+
"Day offsets with `normalize=True` are not allowed."
1217+
)
1218+
1219+
def is_on_offset(self, dt) -> bool:
1220+
return True
1221+
1222+
@apply_wraps
1223+
def _apply(self, other):
1224+
if isinstance(other, Day):
1225+
# TODO: why isn't this handled in __add__?
1226+
return Day(self.n + other.n)
1227+
return other + np.timedelta64(self.n, "D")
1228+
1229+
def _apply_array(self, dtarr):
1230+
return dtarr + np.timedelta64(self.n, "D")
1231+
1232+
@cache_readonly
1233+
def freqstr(self) -> str:
1234+
"""
1235+
Return a string representing the frequency.
1236+
1237+
Examples
1238+
--------
1239+
>>> pd.Day(5).freqstr
1240+
'5D'
1241+
1242+
>>> pd.offsets.Day(1).freqstr
1243+
'D'
1244+
"""
1245+
if self.n != 1:
1246+
return str(self.n) + "D"
1247+
return "D"
1248+
1249+
# Having this here isn't strictly-correct post-GH#61985
1250+
# but this gets called in timedelta.get_unit_for_round in cases where
1251+
# Day unambiguously means 24h.
1252+
@property
1253+
def nanos(self) -> int64_t:
1254+
"""
1255+
Returns an integer of the total number of nanoseconds.
1256+
1257+
See Also
1258+
--------
1259+
tseries.offsets.Hour.nanos :
1260+
Returns an integer of the total number of nanoseconds.
1261+
tseries.offsets.Day.nanos :
1262+
Returns an integer of the total number of nanoseconds.
1263+
1264+
Examples
1265+
--------
1266+
>>> pd.offsets.Day(5).nanos
1267+
432000000000000
1268+
"""
1269+
return self.n * self._nanos_inc
1270+
12111271

12121272
cdef class Hour(Tick):
12131273
"""
@@ -1431,16 +1491,13 @@ cdef class Nano(Tick):
14311491
def delta_to_tick(delta: timedelta) -> Tick:
14321492
if delta.microseconds == 0 and getattr(delta, "nanoseconds", 0) == 0:
14331493
# nanoseconds only for pd.Timedelta
1434-
if delta.seconds == 0:
1435-
return Day(delta.days)
1494+
seconds = delta.days * 86400 + delta.seconds
1495+
if seconds % 3600 == 0:
1496+
return Hour(seconds / 3600)
1497+
elif seconds % 60 == 0:
1498+
return Minute(seconds / 60)
14361499
else:
1437-
seconds = delta.days * 86400 + delta.seconds
1438-
if seconds % 3600 == 0:
1439-
return Hour(seconds / 3600)
1440-
elif seconds % 60 == 0:
1441-
return Minute(seconds / 60)
1442-
else:
1443-
return Second(seconds)
1500+
return Second(seconds)
14441501
else:
14451502
nanos = delta_to_nanoseconds(delta)
14461503
if nanos % 1_000_000 == 0:
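
The whole-second branch of the revised `delta_to_tick` can be sketched in plain Python to show that a multiple of 24 hours now maps to `Hour` rather than `Day`. This is an illustrative reimplementation under assumptions: the function name `whole_seconds_to_tick` is hypothetical, and it returns a `(name, n)` tuple instead of a pandas offset object.

```python
from __future__ import annotations

from datetime import timedelta


def whole_seconds_to_tick(delta: timedelta) -> tuple[str, int]:
    """Hypothetical stand-in for the whole-second branch of delta_to_tick."""
    if delta.microseconds != 0:
        raise ValueError("this sketch only covers whole-second deltas")
    seconds = delta.days * 86400 + delta.seconds
    # There is no Day branch any more: the coarsest unit returned is Hour.
    if seconds % 3600 == 0:
        return ("Hour", seconds // 3600)
    if seconds % 60 == 0:
        return ("Minute", seconds // 60)
    return ("Second", seconds)


print(whole_seconds_to_tick(timedelta(days=2)))      # ('Hour', 48)
print(whole_seconds_to_tick(timedelta(minutes=90)))  # ('Minute', 90)
print(whole_seconds_to_tick(timedelta(seconds=61)))  # ('Second', 61)
```

The sketch uses integer division, whereas the diff passes `seconds / 3600` to the offset constructor.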
@@ -5332,6 +5389,17 @@ cpdef to_offset(freq, bint is_period=False):
53325389
raise ValueError(INVALID_FREQ_ERR_MSG.format(
53335390
f"{freq}, failed to parse with error message: {repr(err)}")
53345391
) from err
5392+
5393+
# TODO(3.0?) once deprecation of "d" is enforced, the check for it here
5394+
# can be removed
5395+
if (
5396+
isinstance(result, Hour)
5397+
and result.n % 24 == 0
5398+
and ("d" in freq or "D" in freq)
5399+
):
5400+
# Since Day is no longer a Tick, delta_to_tick returns Hour above,
5401+
# so we convert back here.
5402+
result = Day(result.n // 24)
53355403
else:
53365404
result = None
53375405
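
The back-conversion added to `to_offset` in the hunk above can be summarized as a small decision rule. The sketch below is a hypothetical pure-Python restatement, not the pandas implementation: `normalize_parsed_offset` and its `(kind, n)` tuple representation are assumptions made for illustration.

```python
from __future__ import annotations


def normalize_parsed_offset(kind: str, n: int, freq: str) -> tuple[str, int]:
    """Hypothetical restatement of the Hour-to-Day conversion in to_offset."""
    # Only convert when the user's frequency string mentioned days and the
    # parsed result is a whole number of 24-hour blocks.
    if kind == "Hour" and n % 24 == 0 and ("d" in freq or "D" in freq):
        return ("Day", n // 24)
    return (kind, n)


print(normalize_parsed_offset("Hour", 48, "2D"))     # ('Day', 2)
print(normalize_parsed_offset("Hour", 48, "48h"))    # ('Hour', 48)
print(normalize_parsed_offset("Hour", 36, "1D12h"))  # ('Hour', 36)
```

Under this rule a string such as `"48h"` stays an `Hour` offset, so the user's choice between fixed hours and calendar days is preserved.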

pandas/_libs/tslibs/period.pyx

Lines changed: 6 additions & 1 deletion
@@ -113,6 +113,7 @@ from pandas._libs.tslibs.offsets cimport (
113113
from pandas._libs.tslibs.offsets import (
114114
INVALID_FREQ_ERR_MSG,
115115
BDay,
116+
Day,
116117
)
117118
from pandas.util._decorators import set_module
118119

@@ -1825,6 +1826,10 @@ cdef class _Period(PeriodMixin):
18251826
# i.e. np.timedelta64("nat")
18261827
return NaT
18271828

1829+
if isinstance(other, Day):
1830+
# Periods are timezone-naive, so we treat Day as Tick-like
1831+
other = np.timedelta64(other.n, "D")
1832+
18281833
try:
18291834
inc = delta_to_nanoseconds(other, reso=self._dtype._creso, round_ok=False)
18301835
except ValueError as err:
@@ -1846,7 +1851,7 @@ cdef class _Period(PeriodMixin):
18461851

18471852
@cython.overflowcheck(True)
18481853
def __add__(self, other):
1849-
if is_any_td_scalar(other):
1854+
if is_any_td_scalar(other) or isinstance(other, Day):
18501855
return self._add_timedeltalike_scalar(other)
18511856
elif is_offset_object(other):
18521857
return self._add_offset(other)

pandas/_libs/tslibs/timedeltas.pyx

Lines changed: 6 additions & 1 deletion
@@ -78,6 +78,7 @@ from pandas._libs.tslibs.np_datetime import (
7878
)
7979

8080
from pandas._libs.tslibs.offsets cimport is_tick_object
81+
from pandas._libs.tslibs.offsets import Day
8182
from pandas._libs.tslibs.util cimport (
8283
is_array,
8384
is_float_object,
@@ -2576,5 +2577,9 @@ cpdef int64_t get_unit_for_round(freq, NPY_DATETIMEUNIT creso) except? -1:
25762577
from pandas._libs.tslibs.offsets import to_offset
25772578

25782579
freq = to_offset(freq)
2579-
freq.nanos # raises on non-fixed freq
2580+
if isinstance(freq, Day):
2581+
# In the "round" context, Day unambiguously means 24h, not calendar-day
2582+
freq = Timedelta(days=freq.n)
2583+
else:
2584+
freq.nanos # raises on non-fixed freq
25802585
return delta_to_nanoseconds(freq, creso)
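
In the rounding context handled above, `Day` is read as a fixed 24-hour unit. A minimal sketch of that interpretation, assuming integer epoch-style nanosecond values and using hypothetical helper names (`round_unit_nanos`, `floor_to_unit`) that do not exist in pandas:

```python
NANOS_PER_HOUR = 3600 * 1_000_000_000
NANOS_PER_DAY = 24 * NANOS_PER_HOUR


def round_unit_nanos(kind: str, n: int) -> int:
    """Hypothetical helper: nanoseconds in n units, with Day fixed at 24h."""
    fixed = {"Day": NANOS_PER_DAY, "Hour": NANOS_PER_HOUR}
    if kind not in fixed:
        raise ValueError(f"{kind} is not a fixed frequency in this sketch")
    return n * fixed[kind]


def floor_to_unit(value_ns: int, unit_ns: int) -> int:
    """Floor an integer nanosecond value to a multiple of unit_ns."""
    return value_ns - value_ns % unit_ns


thirty_six_hours = 36 * NANOS_PER_HOUR
floored = floor_to_unit(thirty_six_hours, round_unit_nanos("Day", 1))
print(floored == NANOS_PER_DAY)  # True
```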

pandas/core/arrays/_ranges.py

Lines changed: 8 additions & 2 deletions
@@ -12,6 +12,7 @@
1212
from pandas._libs.lib import i8max
1313
from pandas._libs.tslibs import (
1414
BaseOffset,
15+
Day,
1516
OutOfBoundsDatetime,
1617
Timedelta,
1718
Timestamp,
@@ -55,8 +56,13 @@ def generate_regular_range(
5556
"""
5657
istart = start._value if start is not None else None
5758
iend = end._value if end is not None else None
58-
freq.nanos # raises if non-fixed frequency
59-
td = Timedelta(freq)
59+
if isinstance(freq, Day):
60+
# In contexts without a timezone, a Day offset is unambiguously
61+
# interpretable as Timedelta-like.
62+
td = Timedelta(days=freq.n)
63+
else:
64+
freq.nanos # raises if non-fixed frequency
65+
td = Timedelta(freq)
6066
b: int
6167
e: int
6268
try:
