pd.Period off by hundreds of years for dates past ~2263 #13346

Closed
eyurtsev opened this issue Jun 1, 2016 · 13 comments · Fixed by #34755
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions Period Period data type
Comments

@eyurtsev

eyurtsev commented Jun 1, 2016

In [1]: import pandas as pd

In [2]: pd.Period(year=2300, month=1, day=1, freq='M').end_time
Out[2]: Timestamp('1715-07-14 00:25:26.290448383')

The input year is 2300, but the output shows 1715.
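The 1715 value appears consistent with a signed 64-bit nanosecond counter wrapping around once. The sketch below reproduces the arithmetic with the standard library only; `wrap_int64` and `ns_since_epoch` are hypothetical helper names introduced for illustration, not pandas functions, and the int64 wraparound is an assumption about the cause rather than something stated in this thread.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
NS_PER_SECOND = 10**9


def wrap_int64(value):
    """Hypothetical helper: emulate two's-complement wraparound of a signed 64-bit integer."""
    return (value + 2**63) % 2**64 - 2**63


def ns_since_epoch(dt):
    """Hypothetical helper: exact nanoseconds between the Unix epoch and a naive datetime."""
    delta = dt - EPOCH
    whole_seconds = delta.days * 86400 + delta.seconds
    return whole_seconds * NS_PER_SECOND + delta.microseconds * 1000


# End of January 2300 is one nanosecond before 2300-02-01 00:00:00.
end_ns = ns_since_epoch(datetime(2300, 2, 1)) - 1

# Assumption: the value is stored as int64 nanoseconds, so it wraps past 2**63 - 1.
wrapped = wrap_int64(end_ns)

# Split into whole seconds and leftover nanoseconds (floor division handles the negative value).
seconds, nanos = divmod(wrapped, NS_PER_SECOND)
wrapped_dt = EPOCH + timedelta(seconds=seconds)

print(wrapped_dt, nanos)  # 1715-07-14 00:25:26 290448383
```

The wrapped date and the nanosecond remainder match the `Timestamp('1715-07-14 00:25:26.290448383')` shown in the report.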

output of pd.show_versions()

In [3]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-25-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 7.1.2
setuptools: 21.0.0
Cython: None
numpy: 1.11.0
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: None
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.1

@jreback
Contributor

jreback commented Jun 1, 2016

http://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations

There are limits on the range of dates that a Timestamp can represent.

@jreback jreback closed this as completed Jun 1, 2016
@jreback jreback added Datetime Datetime data dtype Usage Question Period Period data type labels Jun 1, 2016
@jreback
Contributor

jreback commented Jun 1, 2016

You can represent these dates as Periods; that is in fact the point. But conversions to Timestamp are out of bounds.

@jreback
Contributor

jreback commented Jun 1, 2016

I suppose this should actually raise.

@jreback jreback reopened this Jun 1, 2016
@jreback jreback added Error Reporting Incorrect or improved errors from pandas Difficulty Novice and removed Usage Question labels Jun 1, 2016
@jreback jreback added this to the Next Major Release milestone Jun 1, 2016
@jreback
Contributor

jreback commented Jun 1, 2016

cc @MaximilianR @sinhrks

@eyurtsev
Author

eyurtsev commented Jun 1, 2016

+1 for raising an Exception

Is there another way to get the year-month-day corresponding to the end_time/start_time of the given period?

@jreback
Contributor

jreback commented Jun 1, 2016

Actually, this should catch the out-of-bounds error and just return a datetime.

@eyurtsev
Author

eyurtsev commented Jun 2, 2016

I don't have enough context about how pd.Period is used throughout pandas...

But having the return type change based on a non-obvious criterion could lead to problems as well. If possible, it would be nice to get a consistent type out (datetime).

@jreback
Contributor

jreback commented Jun 2, 2016

Timestamp is a subclass of datetime
you need to be cognizant of the limits

@sinhrks
Member

sinhrks commented Jun 4, 2016

Currently, offsets coerce to a normal datetime. Since Period can be used as a workaround for the Timestamp limitation, I prefer returning a normal datetime on overflow.

pd.Timestamp('2011-01-01') + pd.offsets.YearEnd(1000) 
# datetime.datetime(3010, 12, 31, 0, 0)
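To illustrate the "plain datetime on overflow" idea outside pandas, here is a minimal sketch using only the standard library. `month_bounds` is a hypothetical helper, not a pandas API, and it works at microsecond resolution because `datetime.datetime` does not store nanoseconds.

```python
import calendar
from datetime import datetime, timedelta


def month_bounds(year, month):
    """Hypothetical helper: first and last representable instants of a calendar month.

    Returns plain datetime objects, which cover years 1 through 9999 and so
    are not subject to the nanosecond Timestamp range.
    """
    start = datetime(year, month, 1)
    days_in_month = calendar.monthrange(year, month)[1]
    # datetime has microsecond resolution, so the end is one microsecond
    # before the start of the next month.
    end = start + timedelta(days=days_in_month) - timedelta(microseconds=1)
    return start, end


start, end = month_bounds(2300, 1)
print(start)  # 2300-01-01 00:00:00
print(end)    # 2300-01-31 23:59:59.999999
```

The year, month, and day are then available as `end.year`, `end.month`, and `end.day`, with a consistent return type regardless of how far in the future the month lies.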

@gliptak
Contributor

gliptak commented Jul 1, 2016

Could Period be switched to datetime64? #179

@eyurtsev
Author

bump

@jreback
Contributor

jreback commented Feb 18, 2017

bump

You are welcome to contribute a PR to fix this.

@ShaharNaveh
Member

On master

In [1]: import pandas as pd                                                                                   

In [2]: pd.Period(year=2300, month=1, day=1, freq='M').end_time 

Raises the following:

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2300-02-01 00:00:00

Full traceback:

---------------------------------------------------------------------------
OutOfBoundsDatetime                       Traceback (most recent call last)
<ipython-input-2-47b0a5c0011f> in <module>
----> 1 pd.Period(year=2300, month=1, day=1, freq='M').end_time

~/Documents/Github/Community/Python/Projects/pandas-MomIsBestFriend/pandas/_libs/tslibs/period.pyx in pandas._libs.tslibs.period._Period.end_time.__get__()
   1765         # freq.n can't be negative or 0
   1766         # ordinal = (self + self.freq.n).start_time.value - 1
-> 1767         ordinal = (self + self.freq).start_time.value - 1
   1768         return Timestamp(ordinal)
   1769 

~/Documents/Github/Community/Python/Projects/pandas-MomIsBestFriend/pandas/_libs/tslibs/period.pyx in pandas._libs.tslibs.period._Period.start_time.__get__()
   1759         Timestamp('2012-01-01 23:59:59.999999999')
   1760         """
-> 1761         return self.to_timestamp(how='S')
   1762 
   1763     @property

~/Documents/Github/Community/Python/Projects/pandas-MomIsBestFriend/pandas/_libs/tslibs/period.pyx in pandas._libs.tslibs.period._Period.to_timestamp()
   1804         val = self.asfreq(freq, how)
   1805 
-> 1806         dt64 = period_ordinal_to_dt64(val.ordinal, base)
   1807         return Timestamp(dt64, tz=tz)
   1808 

~/Documents/Github/Community/Python/Projects/pandas-MomIsBestFriend/pandas/_libs/tslibs/period.pyx in pandas._libs.tslibs.period.period_ordinal_to_dt64()
   1184         get_date_info(ordinal, freq, &dts)
   1185 
-> 1186     check_dts_bounds(&dts)
   1187     return dtstruct_to_dt64(&dts)
   1188 

~/Documents/Github/Community/Python/Projects/pandas-MomIsBestFriend/pandas/_libs/tslibs/np_datetime.pyx in pandas._libs.tslibs.np_datetime.check_dts_bounds()
    118         fmt = (f'{dts.year}-{dts.month:02d}-{dts.day:02d} '
    119                f'{dts.hour:02d}:{dts.min:02d}:{dts.sec:02d}')
--> 120         raise OutOfBoundsDatetime(f'Out of bounds nanosecond timestamp: {fmt}')
    121 
    122 

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2300-02-01 00:00:00

Installed versions:

INSTALLED VERSIONS

commit : 4b64f98
python : 3.7.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.5.11.a-1-hardened
Version : #1 SMP PREEMPT Sat, 21 Mar 2020 14:22:59 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.0.dev0+948.g4b64f9809
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.0.0.post20200311
Cython : 0.29.15
pytest : 5.4.1
hypothesis : 5.7.1
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.2
fastparquet : 0.3.3
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.1
pandas_gbq : None
pyarrow : 0.16.0
pytables : None
pyxlsb : None
s3fs : 0.4.0
scipy : 1.4.1
sqlalchemy : 1.3.15
tables : 3.6.1
tabulate : 0.8.7
xarray : 0.15.0
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.48.0


@jreback Should we add a regression test for this?
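A regression test along these lines might look like the sketch below. It assumes a pandas build that behaves like the master run shown above (raising rather than wrapping), and the test name is hypothetical; it is not the test that was eventually merged.

```python
import pandas as pd
import pytest
from pandas.errors import OutOfBoundsDatetime


def test_period_end_time_out_of_bounds():
    # Hypothetical test name. The period itself is representable;
    # only the conversion to a nanosecond Timestamp is out of bounds.
    period = pd.Period(year=2300, month=1, day=1, freq="M")
    with pytest.raises(OutOfBoundsDatetime):
        period.end_time
```

The assertion checks only the exception type, since the exact message text may differ between pandas versions.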

@mroeschke mroeschke removed Error Reporting Incorrect or improved errors from pandas Period Period data type Datetime Datetime data dtype labels Mar 31, 2020
@mroeschke mroeschke added the Needs Tests Unit test(s) needed to prevent regressions label Mar 31, 2020
OlivierLuG pushed a commit to OlivierLuG/pandas that referenced this issue Jun 13, 2020
@jreback jreback modified the milestones: Contributions Welcome, 1.1 Jun 14, 2020
@jreback jreback added the Period Period data type label Jun 14, 2020
@TomAugspurger TomAugspurger modified the milestones: 1.1, Contributions Welcome Jul 6, 2020
OlivierLuG pushed a commit to OlivierLuG/pandas that referenced this issue Jul 6, 2020
OlivierLuG pushed a commit to OlivierLuG/pandas that referenced this issue Jul 7, 2020
@jreback jreback modified the milestones: Contributions Welcome, 1.2 Oct 2, 2020
mroeschke pushed a commit that referenced this issue Oct 5, 2020
* #TST #13346 added tests

* TST #13346 taken review into account

* Added tests for #13346 - with review
jbrockmendel pushed a commit to jbrockmendel/pandas that referenced this issue Oct 13, 2020
* #TST pandas-dev#13346 added tests

* TST pandas-dev#13346 taken review into account

* Added tests for pandas-dev#13346 - with review
kesmit13 pushed a commit to kesmit13/pandas that referenced this issue Nov 2, 2020
* #TST pandas-dev#13346 added tests

* TST pandas-dev#13346 taken review into account

* Added tests for pandas-dev#13346 - with review
8 participants