handle DST appropriately in Timestamp.replace #18618

Merged
merged 16 commits into from
Jan 5, 2018
Changes from 14 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -378,3 +378,4 @@ Other
 ^^^^^

 - Improved error message when attempting to use a Python keyword as an identifier in a ``numexpr`` backed query (:issue:`18221`)
+- :func:`Timestamp.replace` will now handle Daylight Savings transitions gracefully (:issue:`18319`)
16 changes: 13 additions & 3 deletions pandas/_libs/tslibs/timestamps.pyx
@@ -33,7 +33,7 @@ from np_datetime cimport (reverse_ops, cmp_scalar, check_dts_bounds,
                           is_leapyear)
 from timedeltas import Timedelta
 from timedeltas cimport delta_to_nanoseconds
-from timezones cimport get_timezone, is_utc, maybe_get_tz
+from timezones cimport get_timezone, is_utc, maybe_get_tz, treat_tz_as_pytz

 # ----------------------------------------------------------------------
 # Constants
@@ -922,8 +922,18 @@ class Timestamp(_Timestamp):
             _tzinfo = tzinfo

         # reconstruct & check bounds
-        ts_input = datetime(dts.year, dts.month, dts.day, dts.hour, dts.min,
-                            dts.sec, dts.us, tzinfo=_tzinfo)
+        if _tzinfo is not None and treat_tz_as_pytz(_tzinfo):
+            # replacing across a DST boundary may induce a new tzinfo object
+            # see GH#18319
+            ts_input = _tzinfo.localize(datetime(dts.year, dts.month, dts.day,
+                                                 dts.hour, dts.min, dts.sec,
+                                                 dts.us))
+            _tzinfo = ts_input.tzinfo
+        else:
+            ts_input = datetime(dts.year, dts.month, dts.day,
+                                dts.hour, dts.min, dts.sec, dts.us,
+                                tzinfo=_tzinfo)
+
         ts = convert_datetime_to_tsobject(ts_input, _tzinfo)
         value = ts.value + (dts.ps // 1000)
         if value != NPY_NAT:
21 changes: 21 additions & 0 deletions pandas/tests/tseries/test_timezones.py
@@ -1228,6 +1228,27 @@ def f():
         dt = Timestamp('2013-11-03 01:59:59.999999-0400', tz='US/Eastern')
         assert dt.tz_localize(None) == dt.replace(tzinfo=None)

+    def test_replace_across_dst(self):
Contributor

Can you test with dateutil as well?

Member Author

The test is pretty specific to the pytz API.

Contributor

The test is specific to the pytz API because the problem is specific to the pytz API, so you can't write tests that currently fail with dateutil, but you can, in fact, write this test to be agnostic as to whether the test takes a pytz or dateutil zone.

Rather than using localize and normalize, express the "expected" datetimes as UTC datetimes and use .astimezone. Both pytz and dateutil support astimezone (and, in fact, pytz.normalize essentially just uses .astimezone under the hood). Since pandas provides its own API layer on top of datetime, it is also agnostic to the timezone provider.
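
A minimal sketch of this suggestion, not code from the PR: the expected value is written as a UTC datetime and converted with `astimezone`, so the helper never touches a provider-specific API. The stdlib `zoneinfo` module stands in for the timezone provider here (an assumption, to keep the example dependency-free), and `local_from_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # stand-in provider (assumption, not used in the PR)


def local_from_utc(utc_naive, tz):
    """Attach UTC to a naive datetime, then convert to ``tz`` with astimezone."""
    return utc_naive.replace(tzinfo=timezone.utc).astimezone(tz)


tz = ZoneInfo("America/New_York")

# 16:03:30 Eastern is UTC-5 in December and UTC-4 in June, so the
# "expected" values are declared as the matching UTC instants.
winter = local_from_utc(datetime(2017, 12, 3, 21, 3, 30), tz)
summer = local_from_utc(datetime(2017, 6, 3, 20, 3, 30), tz)

print(winter.isoformat())
print(summer.isoformat())
```

Because the helper only calls `astimezone`, a pytz or dateutil zone object could be passed as `tz` in the same way, which is what would let a test be parametrized over both providers.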

Member Author

> Rather than using localize and normalize, express the "expected" datetimes as UTC datetimes and use .astimezone

Is this a suggestion for this test or more generally?

Contributor

> Is this a suggestion for this test or more generally?

In this test at least, since with tests you can always engineer your datetime literals such that you never have to use anything except astimezone.

It's also just a useful piece of information to have, that astimezone is one of the few timezone functions where pytz behaves more or less nicely. You only need normalize semantics with pytz for things like replace or "calendar" arithmetic, where you want to change the naive portion of the datetime to a specific value and then change the time zone offset to match. If you want "absolute" arithmetic (where you want to go forward a certain number of hours or seconds or something), then the_operation(dt.astimezone(UTC)).astimezone(dt.tzinfo) will always give the right answer.
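
The "absolute" arithmetic pattern can be sketched as follows. This is an illustration under assumptions rather than anything from the PR: `add_absolute` is a hypothetical helper name, and the stdlib `zoneinfo` module is used as the zone provider. The plain `start + delta` line shows wall-clock behaviour as implemented by PEP 495-style zones; with pytz that line would additionally need `normalize`.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # stand-in provider (assumption)


def add_absolute(dt, delta):
    """Apply ``delta`` in UTC, then convert back to the original zone."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


tz = ZoneInfo("America/New_York")

# Noon on the day before US DST ended in 2017 (EDT, UTC-4).
start = datetime(2017, 11, 4, 12, 0, tzinfo=tz)

absolute = add_absolute(start, timedelta(hours=24))  # 24 elapsed hours
wall = start + timedelta(hours=24)                   # same wall time, next day

print(absolute.isoformat())
print(wall.isoformat())
```

The two results differ by one hour because the day containing the transition is 25 hours long.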

In this case you know what the correct answer should be, in absolute time, so you can just declare your initial variable as the correct answer in UTC and convert that to the time zone you care about. If you do it generically like that, you are insulated from any weird quirks of pytz's interface, and you get dateutil support for free (so you can parametrize this test using pytz and dateutil zones).

Of course, this method will fail if you try to construct an imaginary time, since there is no mapping between UTC and imaginary times.
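
To illustrate the caveat about imaginary times, here is a hedged sketch that detects one by round-tripping through UTC. `is_imaginary` is a hypothetical helper, and the example assumes a zone whose tzinfo can be attached directly with `replace` (true for stdlib `zoneinfo` and dateutil, but not for pytz, which requires `localize`).

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # stand-in provider (assumption)


def is_imaginary(naive, tz):
    """Return True if the wall time ``naive`` does not exist in ``tz``."""
    aware = naive.replace(tzinfo=tz)
    round_trip = aware.astimezone(timezone.utc).astimezone(tz)
    # A wall time that exists survives the UTC round trip unchanged.
    return round_trip.replace(tzinfo=None) != naive


tz = ZoneInfo("America/New_York")

# Clocks jumped from 02:00 to 03:00 on 2017-03-12, so 02:30 never occurred.
gap = is_imaginary(datetime(2017, 3, 12, 2, 30), tz)
normal = is_imaginary(datetime(2017, 3, 12, 12, 0), tz)

print(gap, normal)
```

No UTC instant converts to 02:30 local on that date, which is why a test that builds its expected values from UTC cannot produce such a time.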

+        # GH#18319 check that 1) timezone is correctly normalized and
+        # 2) that hour is not incorrectly changed by this normalization
+        tz = pytz.timezone('US/Eastern')
+
+        ts_naive = Timestamp('2017-12-03 16:03:30')
+        ts_aware = tz.localize(ts_naive)
+
+        # Preliminary sanity-check
+        assert ts_aware == ts_aware.tzinfo.normalize(ts_aware)

Contributor

This needs to be parametrized across dateutil and pytz, so move it to TestTimeZoneSupportPytz and use self.tz(...).

+        # Replace across DST boundary
+        ts2 = ts_aware.replace(month=6)
+
+        # Check that `replace` preserves hour literal
+        assert (ts2.hour, ts2.minute) == (ts_aware.hour, ts_aware.minute)
+
+        # Check that post-replace object is appropriately normalized
+        ts2b = ts2.tzinfo.normalize(ts2)
+        assert ts2 == ts2b
+
     def test_ambiguous_compat(self):
         # validate that pytz and dateutil are compat for dst
         # when the transition happens