Skip to content
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion pandas/tseries/index.py
Original file line number Diff line number Diff line change
Expand Up @@ -439,7 +439,7 @@ def _generate(cls, start, end, periods, name, offset,
tz = tz.localize(date.replace(tzinfo=None)).tzinfo

if tz is not None and inferred_tz is not None:
if not inferred_tz == tz:
if not tslib.get_timezone(inferred_tz) == tslib.get_timezone(tz):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this should not be necessary; str comparison already does this

Copy link
Author

@j-santander j-santander Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get an exception if I don't... apparently the timezone with and without dst are not the same thing:

Traceback (most recent call last):
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\tests\test_resample.py", line 1923, in test_resample_across_dst
    df.resample(rule='H').sum(),
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\resample.py", line 541, in f
    return self._downsample(_method)
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\resample.py", line 672, in _downsample
    self._set_binner()
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\resample.py", line 238, in _set_binner
    self.binner, self.grouper = self._get_binner()
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\resample.py", line 246, in _get_binner
    binner, bins, binlabels = self._get_binner_for_time()
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\resample.py", line 661, in _get_binner_for_time
    return self.groupby._get_time_bins(self.ax)
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\resample.py", line 1130, in _get_time_bins
    name=ax.name)
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\util\decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\index.py", line 301, in __new__
    ambiguous=ambiguous)
  File "C:\Users\jsantand\Code\video\be\pandas\pandas\tseries\index.py", line 444, in _generate
    raise AssertionError("Inferred time zone not equal to passed "
AssertionError: Inferred time zone not equal to passed time zone

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, can you repro that directly via a DatetimeIndex construction call?

does it break anything else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I've traced the issue to the tz of input index:

I do:

df1 = DataFrame([1477786980, 1477790580], columns=['ts'])
dti1 = DatetimeIndex(pd.to_datetime(df1.ts, unit='s').dt.tz_localize('UTC').dt.tz_convert('Europe/Madrid'))

Now, this dti1 index has a funny tz attribute: <DstTzInfo 'Europe/Paris' LMT+0:09:00 STD>
When resampling, the index tz is used in creating the new index and it is where the assert comes.

That probably means I don't need this change... but the fix is needed elsewhere.

Copy link
Author

@j-santander j-santander Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my bad.... To reproduce:

t1=Timestamp(2016,10,30,2,23).tz_localize('Europe/Madrid',ambiguous=True)
t2=Timestamp(2016,10,30,2,13).tz_localize('Europe/Madrid',ambiguous=False)
dti=DatetimeIndex(start=t1,end=t2,freq='H',tz='Europe/Madrid')

I've also found that the same check is done in pandas.tseries.tools.py:

def _infer_tzinfo(start, end):
    def _infer(a, b):
        tz = a.tzinfo
        if b and b.tzinfo:
            if not (tslib.get_timezone(tz) == tslib.get_timezone(b.tzinfo)):
                raise AssertionError('Inputs must both have the same timezone,'
                                     ' {0} != {1}'.format(tz, b.tzinfo))
        return tz

    tz = None
    if start is not None:
        tz = _infer(start, end)
    elif end is not None:
        tz = _infer(end, start)
    return tz

I'll submit now the changes and see if this time they are ok.

raise AssertionError("Inferred time zone not equal to passed "
"time zone")

Expand Down
13 changes: 8 additions & 5 deletions pandas/tseries/resample.py
100644 → 100755
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@

import pandas as pd
from pandas.core.base import AbstractMethodError, GroupByMixin
from pandas.core.config_init import pc_ambiguous_as_wide_doc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you importing this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDE mischieve, I'll remove.


from pandas.core.groupby import (BinGrouper, Grouper, _GroupBy, GroupBy,
SeriesGroupBy, groupby, PanelGroupBy)
Expand Down Expand Up @@ -1284,8 +1285,13 @@ def _adjust_dates_anchored(first, last, offset, closed='right', base=0):
# See https://github.com/pandas-dev/pandas/issues/8683

first_tzinfo = first.tzinfo
last_tzinfo = last.tzinfo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put the issue number as a comment and a brief expl. future readers will appreciate!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will

first_dst = bool(first.dst())
last_dst = bool(last.dst())

first = first.tz_localize(None)
last = last.tz_localize(None)

start_day_nanos = first.normalize().value

base_nanos = (base % offset.n) * offset.nanos // offset.n
Expand Down Expand Up @@ -1320,11 +1326,8 @@ def _adjust_dates_anchored(first, last, offset, closed='right', base=0):
else:
lresult = last.value + offset.nanos

# return (Timestamp(fresult, tz=first.tz),
# Timestamp(lresult, tz=last.tz))

return (Timestamp(fresult).tz_localize(first_tzinfo),
Timestamp(lresult).tz_localize(first_tzinfo))
return (Timestamp(fresult).tz_localize(first_tzinfo, ambiguous=first_dst),
Timestamp(lresult).tz_localize(last_tzinfo, ambiguous=last_dst))


def asfreq(obj, freq, method=None, how=None, normalize=False):
Expand Down
13 changes: 12 additions & 1 deletion pandas/tseries/tests/test_resample.py
100644 → 100755
Original file line number Diff line number Diff line change
Expand Up @@ -1912,7 +1912,18 @@ def test_resample_size(self):
right = Series(val, index=ix)
assert_series_equal(left, right)

def test_resmaple_dst_anchor(self):
def test_resample_across_dst(self):
#14682
df1 = DataFrame([1477786980, 1477790580], columns=['ts'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

give a 1-liner about what you are testing here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question... where's the whatsnew file I need to add the description to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bug, so you can put in 0.19.2

df2 = DataFrame([1477785600, 1477789200], columns=['ts'])
dti1 = DatetimeIndex(pd.to_datetime(df1.ts, unit='s').dt.tz_localize('UTC').dt.tz_convert('Europe/Madrid'))
dti2 = DatetimeIndex(pd.to_datetime(df2.ts, unit='s').dt.tz_localize('UTC').dt.tz_convert('Europe/Madrid'))
df = DataFrame([5, 5], index=dti1)
assert_frame_equal(
df.resample(rule='H').sum(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do

result = df.resample(....)
expected = ....
assert_frame_equal(result, expected)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

DataFrame([5, 5], index=dti2))

def test_resample_dst_anchor(self):
# 5172
dti = DatetimeIndex([datetime(2012, 11, 4, 23)], tz='US/Eastern')
df = DataFrame([5], index=dti)
Expand Down