Floating point accuracy problems in DatetimeIndex.round #14440

Closed
eoincondron opened this issue Oct 17, 2016 · 1 comment
Labels
Bug Datetime Datetime data dtype

@eoincondron

A small, complete example of the issue

The rounding methods of DatetimeIndex (round, floor, ceil) return slightly wrong values when rounding to high frequencies, as this example shows:

pd.DatetimeIndex(['2016-10-17 12:00:00.0015']).round('ms')
DatetimeIndex(['2016-10-17 12:00:00.001999872'], dtype='datetime64[ns]', freq=None)

The problem is here in the TimelikeOps._round method:

 result = (unit * rounder(values / float(unit))).astype('i8')

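rounder(values / float(unit)) returns an array of floats holding the required number of multiples of unit. Although those values look like ints, multiplying them by unit is still done in floating point, and the product (about 1.5e18 nanoseconds since the epoch) is far above 2**53, the limit up to which float64 represents every integer exactly. The result can therefore be off, here by 128 ns. Replacing it with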

 result = (unit * rounder(values / float(unit)).astype('i8'))

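should fix the problem, because the multiples are cast to int64 first and the multiplication by unit is then exact integer arithmetic. I'm willing to submit a PR with the fix.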
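For anyone who wants to check this without going through pandas, here is a minimal NumPy-only sketch of the two expressions. It assumes `rounder` is `np.round` and `unit` is the number of nanoseconds in a millisecond; the names `buggy` and `fixed` are just labels for this illustration and do not appear in the pandas source.

```python
import numpy as np

# The timestamp from the example, as int64 nanoseconds since the Unix epoch.
values = np.array(["2016-10-17T12:00:00.0015"], dtype="datetime64[ns]").astype("i8")
unit = 1_000_000  # nanoseconds per millisecond (assumed value of `unit` for 'ms')

# Current order: the multiplication happens in float64, then the cast to int64.
buggy = (unit * np.round(values / float(unit))).astype("i8")

# Proposed order: cast the float multiples to int64 first, then multiply exactly.
fixed = unit * np.round(values / float(unit)).astype("i8")

print(buggy.astype("datetime64[ns]"))
print(fixed.astype("datetime64[ns]"))
print("difference in ns:", int(fixed[0] - buggy[0]))
```

The division step is not the culprit: the quotient is around 1.5e12, which float64 holds without trouble. Only the multiplication back up to nanoseconds leaves the exactly representable integer range.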

Output of pd.show_versions()

pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 27.2.0
Cython: None
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.6.None
psycopg2: None
jinja2: 2.8
boto: None

@chris-b1 chris-b1 added Bug Datetime Datetime data dtype labels Oct 19, 2016
@chris-b1 chris-b1 added this to the Next Major Release milestone Oct 19, 2016
@chris-b1
Contributor

Thanks for the report - if you want to submit a PR with that fix that would be great.

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 4, 2017
@jreback jreback closed this as completed in 5067708 Mar 5, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
closes pandas-dev#14440

Employs @eoincondron's fix for float point inaccuracies when rounding
by milliseconds for `DatetimeIndex.round` and `Timestamp.round`

Author: Matt Roeschke <[email protected]>

Closes pandas-dev#15568 from mroeschke/fix_14440 and squashes the following commits:

c5a7cbc [Matt Roeschke] BUG:Floating point accuracy with DatetimeIndex.round (pandas-dev#14440)