Fix pandas.Timedelta range #12728

Closed
wants to merge 2 commits into from

has2k1
Contributor

@has2k1 has2k1 commented Mar 29, 2016

Problem
Pandas Timedelta derives from datetime.timedelta and increases
the resolution of the timedeltas to nanoseconds. As such, a
pandas.Timedelta object can only represent a smaller range of
values than datetime.timedelta.

Solution
This change modifies the properties that report
the range and resolution to reflect Pandas capabilities.

Reference
https://github.com/python/cpython/blob/8d1d7e6816753248768e4cc1c0370221814e9cf1/Lib/datetime.py#L651-L654
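
The range argument above can be checked with a little arithmetic. The following is a minimal sketch, assuming (as the diff comment in this PR states) that the value is held as a signed 64-bit count of nanoseconds; the names `NS_PER_DAY`, `INT64_MAX`, `stdlib_max_days`, `int64_span_days` and `largest` are illustrative and not part of pandas.

```python
import datetime

import pandas as pd

# Nanoseconds in one day, and the largest signed 64-bit integer.
NS_PER_DAY = 24 * 60 * 60 * 10**9
INT64_MAX = 2**63 - 1

# The standard library advertises a range of roughly a billion days.
stdlib_max_days = datetime.timedelta.max.days

# A 64-bit nanosecond counter runs out far earlier.
int64_span_days = INT64_MAX / NS_PER_DAY

# The largest timedelta that fits, built directly from nanoseconds.
largest = pd.Timedelta(INT64_MAX, unit="ns")

print(stdlib_max_days)
print(int64_span_days)
print(largest)
```

The gap between the two printed day counts is why limits inherited from datetime.timedelta would be misleading for a nanosecond-resolution type.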


# Resolution is in nanoseconds
# (2**63)/(1*24*60*60*(10**9)) = 106751.991167
Timedelta.min = Timedelta(-106751.991167, 'D')
Member

Can we specify this in ns instead of fractional days, which might have rounding errors? e.g., Timedelta(np.iinfo(np.int64).min, 'ns')
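
To make the rounding concern concrete, here is a small sketch in plain Python that compares the six-decimal day count from the diff with the exact int64 bound. It uses only integer and decimal arithmetic, so it says nothing about how pandas itself converts fractional days; the variable names are illustrative.

```python
from decimal import Decimal

NS_PER_DAY = 24 * 60 * 60 * 10**9
INT64_MAX = 2**63 - 1

# The day count used in the diff, taken as an exact decimal literal.
days_literal = Decimal("106751.991167")

# Nanoseconds represented by that literal.
ns_from_days = int(days_literal * NS_PER_DAY)

# How far the truncated literal falls short of the true int64 bound.
shortfall_ns = INT64_MAX - ns_from_days

print(ns_from_days)
print(shortfall_ns)  # about 26 milliseconds, expressed in nanoseconds
```

Specifying the limit as an integer number of nanoseconds avoids this shortfall entirely.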

Member

Actually, looks like you need min + 1 since min corresponds to NaT:

In [6]: Timedelta(np.iinfo(np.int64).min, 'ns')
Out[6]: NaT

In [7]: Timedelta(np.iinfo(np.int64).min + 1, 'ns')
Out[7]: Timedelta('-106752 days +00:12:43.145224')

In [8]: Timedelta(np.iinfo(np.int64).max, 'ns')
Out[8]: Timedelta('106751 days 23:47:16.854775')
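
The same check can be written as a short script. This is a sketch based on the session above, assuming the smallest int64 value is reserved as the NaT sentinel as shown there; `i8`, `sentinel`, `lower` and `upper` are illustrative names.

```python
import numpy as np
import pandas as pd

i8 = np.iinfo(np.int64)

# The smallest int64 is reserved for "not a time", so it is not a usable bound.
sentinel = pd.Timedelta(i8.min, unit="ns")

# The usable bounds are therefore min + 1 and max.
lower = pd.Timedelta(i8.min + 1, unit="ns")
upper = pd.Timedelta(i8.max, unit="ns")

print(pd.isna(sentinel))
print(lower, upper)
# With the limits fixed as this PR proposes, these would be expected to match.
print(lower == pd.Timedelta.min, upper == pd.Timedelta.max)
```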

@has2k1 has2k1 force-pushed the fix-timedelta-limits branch 2 times, most recently from af5fdff to 94b886b Compare March 29, 2016 09:49
@has2k1
Contributor Author

has2k1 commented Mar 29, 2016

This concerns the 2nd commit in this PR.

@jreback, can you confirm that some of the tests rely on buggy output?

Specifically, that result (slicing in the snippet below) erroneously includes the end day.

rng = timedelta_range('1 day 10:11:12', freq='h', periods=500)
s = Series(np.arange(len(rng)), index=rng)

result = s['5 day':'6 day']
expected = s.iloc[86:110]
assert_series_equal(result, expected)

If this is a second bug fix, then I will address the changes in the documentation. Maybe there is even a better way than peeking at the indices and creating the expected slices.
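
One possible way to derive the slice positions instead of reading them off the index is `searchsorted` with explicit `Timedelta` bounds. The sketch below only shows where the two candidate end positions come from; it takes no position on which one label slicing should return, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

rng = pd.timedelta_range('1 day 10:11:12', freq='h', periods=500)
s = pd.Series(np.arange(len(rng)), index=rng)

# First position at or after the start of day 5.
start = rng.searchsorted(pd.Timedelta('5 days'))

# End position if the slice stops at the start of day 6.
end_at_day_start = rng.searchsorted(pd.Timedelta('6 days'))

# End position if the slice covers all of day 6.
end_after_full_day = rng.searchsorted(pd.Timedelta('7 days'))

print(start, end_at_day_start, end_after_full_day)
```

The two end positions correspond to the two `expected` slices under discussion, `s.iloc[86:110]` and `s.iloc[86:134]`.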

@@ -1680,7 +1680,7 @@ def test_partial_slice(self):
s = Series(np.arange(len(rng)), index=rng)

result = s['5 day':'6 day']
-expected = s.iloc[86:134]
+expected = s.iloc[86:110]
Contributor

why are you changing this?

@jreback jreback added Timedelta Timedelta data type Compat pandas objects compatability with Numpy or Python functions labels Mar 29, 2016
@has2k1 has2k1 force-pushed the fix-timedelta-limits branch from 94b886b to 99f3ba9 Compare March 29, 2016 23:38
@has2k1 has2k1 changed the title pandas.Timedelta gives bad range & resolution info Fix pandas.Timedelta range Mar 30, 2016
@@ -109,6 +109,31 @@ The ``unit`` keyword argument specifies the unit of the Timedelta:
to_timedelta(np.arange(5), unit='s')
to_timedelta(np.arange(5), unit='d')

timdelta limits
Contributor

Timedelta Limits

@jreback
Contributor

jreback commented Mar 30, 2016

pls add a whatsnew note, point to the new docs (with a :ref:). Pls also move the datetime limits section to timeseries.rst as well.

thanks.

@jreback jreback added this to the 0.18.1 milestone Mar 30, 2016
has2k1 added 2 commits March 31, 2016 03:38
*Problem*
Pandas Timedelta derives from `datetime.timedelta` and increases
the resolution of the timedeltas to nanoseconds. As such
Pandas.Timedelta has a smaller range of values.

*Solution*
This change modifies the advertised `min` and `max` timedeltas.
@has2k1 has2k1 force-pushed the fix-timedelta-limits branch from 99f3ba9 to 2b78e5a Compare March 31, 2016 09:02
@jreback jreback closed this in 0d58446 Mar 31, 2016
@jreback
Contributor

jreback commented Mar 31, 2016

thanks @has2k1

Labels
Compat pandas objects compatability with Numpy or Python functions Timedelta Timedelta data type
Development

Successfully merging this pull request may close these issues.

pd.timedelta has smaller than expected range
3 participants